* [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option
@ 2021-02-10 16:40 Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 01/21] i386: keep hyperv_vendor string up-to-date Vitaly Kuznetsov
                   ` (21 more replies)
  0 siblings, 22 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

Changes since v3:
- Make 'hv-default' override 'hv-*' options which were already set 
  (e.g. 'hv-feature=on,hv-default' case) [Igor]. Make 'hv-passthrough'
  behave the same way.
- Add "i386: be more picky about implicit 'hv-evmcs' enablement" patch to avoid
  enabling 'hv-evmcs' with hv-default/hv-passthrough when guest CPU lacks VMX.
- Add "i386: support 'hv-passthrough,hv-feature=off' on the command line" patch
  to make 'hv-passthrough' semantics match the newly introduced 'hv-default'.
- Add "i386: track explicit 'hv-*' features enablement/disablement" patch to
  support the above mentioned changes.
- Expand qtest to check the above mentioned improvements.

Original description:

Upper layer tools like libvirt want to figure out which Hyper-V features are
supported by the underlying stack (QEMU/KVM) but currently they are unable to
do so. We have a nice 'hv_passthrough' CPU flag supported by QEMU but it has
no effect on e.g. QMP's 

query-cpu-model-expansion type=full model={"name":"host","props":{"hv-passthrough":true}}

command as we parse Hyper-V features after creating KVM vCPUs and not at
feature expansion time. To support this use case, we first need to make the
KVM_GET_SUPPORTED_HV_CPUID ioctl a system-wide ioctl, as the existing
vCPU version can't be used that early. This is what the KVM part does. With
that done, we can do early Hyper-V feature expansion (this series).

In addition, provide a simple 'hv-default' option which enables (and
requires from KVM) all currently supported Hyper-V enlightenments.
Unlike 'hv-passthrough' mode, this is going to be migratable.

Vitaly Kuznetsov (21):
  i386: keep hyperv_vendor string up-to-date
  i386: invert hyperv_spinlock_attempts setting logic with
    hv_passthrough
  i386: always fill Hyper-V CPUID feature leaves from X86CPU data
  i386: stop using env->features[] for filling Hyper-V CPUIDs
  i386: introduce hyperv_feature_supported()
  i386: introduce hv_cpuid_get_host()
  i386: drop FEAT_HYPERV feature leaves
  i386: introduce hv_cpuid_cache
  i386: split hyperv_handle_properties() into
    hyperv_expand_features()/hyperv_fill_cpuids()
  i386: move eVMCS enablement to hyperv_init_vcpu()
  i386: switch hyperv_expand_features() to using error_setg()
  i386: adjust the expected KVM_GET_SUPPORTED_HV_CPUID array size
  i386: prefer system KVM_GET_SUPPORTED_HV_CPUID ioctl over vCPU's one
  i386: use global kvm_state in hyperv_enabled() check
  i386: expand Hyper-V features during CPU feature expansion time
  i386: track explicit 'hv-*' features enablement/disablement
  i386: support 'hv-passthrough,hv-feature=off' on the command line
  i386: be more picky about implicit 'hv-evmcs' enablement
  i386: introduce kvm_hv_evmcs_available()
  i386: provide simple 'hv-default=on' option
  qtest/hyperv: Introduce a simple hyper-v test

 MAINTAINERS                |   1 +
 docs/hyperv.txt            |  16 +-
 target/i386/cpu.c          | 430 ++++++++++++++++++++---------
 target/i386/cpu.h          |  11 +-
 target/i386/kvm/kvm-stub.c |  10 +
 target/i386/kvm/kvm.c      | 535 +++++++++++++++++++++----------------
 target/i386/kvm/kvm_i386.h |   2 +
 tests/qtest/hyperv-test.c  | 312 +++++++++++++++++++++
 tests/qtest/meson.build    |   3 +-
 9 files changed, 950 insertions(+), 370 deletions(-)
 create mode 100644 tests/qtest/hyperv-test.c

-- 
2.29.2




* [PATCH v4 01/21] i386: keep hyperv_vendor string up-to-date
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 02/21] i386: invert hyperv_spinlock_attempts setting logic with hv_passthrough Vitaly Kuznetsov
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

When cpu->hyperv_vendor is not set manually we default to "Microsoft Hv"
and in 'hv_passthrough' mode we get the information from the host. This
information is stored in cpu->hyperv_vendor_id[] array but we don't update
cpu->hyperv_vendor string so e.g. QMP's query-cpu-model-expansion output
is incorrect.
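
For illustration only (not part of the patch): a minimal standalone sketch
of the two conversions described above, assuming the vendor id occupies
three 32-bit registers (12 bytes). The names hv_vendor_pack() and
hv_vendor_unpack() are hypothetical and do not exist in QEMU. Note that the
sketch NUL-terminates the rebuilt string explicitly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HV_VENDOR_ID_LEN 12   /* three 32-bit registers: EBX, ECX, EDX */

/*
 * Hypothetical helper: pack a vendor string into three 32-bit words,
 * zero-padding short strings and truncating long ones.
 * Returns the number of bytes actually copied.
 */
static size_t hv_vendor_pack(const char *vendor, uint32_t regs[3])
{
    size_t len;

    assert(vendor != NULL && regs != NULL);

    len = strlen(vendor);
    if (len > HV_VENDOR_ID_LEN) {
        len = HV_VENDOR_ID_LEN;
    }
    memset(regs, 0, HV_VENDOR_ID_LEN);
    memcpy(regs, vendor, len);
    return len;
}

/*
 * Hypothetical helper for the reverse direction: rebuild a NUL-terminated
 * string from the three words. 'out' must have room for
 * HV_VENDOR_ID_LEN + 1 bytes.
 */
static void hv_vendor_unpack(const uint32_t regs[3], char *out)
{
    assert(regs != NULL && out != NULL);

    memcpy(out, regs, HV_VENDOR_ID_LEN);
    out[HV_VENDOR_ID_LEN] = '\0';
}
```

Keeping both directions next to each other makes it easier to see that the
string and the register array describe the same 12 bytes.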

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/cpu.c     | 19 +++++++++----------
 target/i386/kvm/kvm.c |  4 ++++
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 9c3d2d60b7e5..d03c1588ba0e 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6547,17 +6547,16 @@ static void x86_cpu_hyperv_realize(X86CPU *cpu)
 
     /* Hyper-V vendor id */
     if (!cpu->hyperv_vendor) {
-        memcpy(cpu->hyperv_vendor_id, "Microsoft Hv", 12);
-    } else {
-        len = strlen(cpu->hyperv_vendor);
-
-        if (len > 12) {
-            warn_report("hv-vendor-id truncated to 12 characters");
-            len = 12;
-        }
-        memset(cpu->hyperv_vendor_id, 0, 12);
-        memcpy(cpu->hyperv_vendor_id, cpu->hyperv_vendor, len);
+        object_property_set_str(OBJECT(cpu), "hv-vendor-id", "Microsoft Hv",
+                                &error_abort);
+    }
+    len = strlen(cpu->hyperv_vendor);
+    if (len > 12) {
+        warn_report("hv-vendor-id truncated to 12 characters");
+        len = 12;
     }
+    memset(cpu->hyperv_vendor_id, 0, 12);
+    memcpy(cpu->hyperv_vendor_id, cpu->hyperv_vendor, len);
 
     /* 'Hv#1' interface identification*/
     cpu->hyperv_interface_id[0] = 0x31237648;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index e97f84175707..6ae4be44aa6f 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1214,6 +1214,10 @@ static int hyperv_handle_properties(CPUState *cs,
             cpu->hyperv_vendor_id[0] = c->ebx;
             cpu->hyperv_vendor_id[1] = c->ecx;
             cpu->hyperv_vendor_id[2] = c->edx;
+            cpu->hyperv_vendor = g_realloc(cpu->hyperv_vendor,
+                                           sizeof(cpu->hyperv_vendor_id) + 1);
+            memcpy(cpu->hyperv_vendor, cpu->hyperv_vendor_id,
+                   sizeof(cpu->hyperv_vendor_id));
         }
 
         c = cpuid_find_entry(cpuid, HV_CPUID_INTERFACE, 0);
-- 
2.29.2




* [PATCH v4 02/21] i386: invert hyperv_spinlock_attempts setting logic with hv_passthrough
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 01/21] i386: keep hyperv_vendor string up-to-date Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 03/21] i386: always fill Hyper-V CPUID feature leaves from X86CPU data Vitaly Kuznetsov
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

There is no need to have this special case: as with all other Hyper-V
enlightenments, we can just use the kernel-supplied value in hv_passthrough
mode.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm/kvm.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 6ae4be44aa6f..211efbd13b49 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1254,11 +1254,7 @@ static int hyperv_handle_properties(CPUState *cs,
         c = cpuid_find_entry(cpuid, HV_CPUID_ENLIGHTMENT_INFO, 0);
         if (c) {
             env->features[FEAT_HV_RECOMM_EAX] = c->eax;
-
-            /* hv-spinlocks may have been overriden */
-            if (cpu->hyperv_spinlock_attempts != HYPERV_SPINLOCK_NEVER_NOTIFY) {
-                c->ebx = cpu->hyperv_spinlock_attempts;
-            }
+            cpu->hyperv_spinlock_attempts = c->ebx;
         }
         c = cpuid_find_entry(cpuid, HV_CPUID_NESTED_FEATURES, 0);
         if (c) {
-- 
2.29.2




* [PATCH v4 03/21] i386: always fill Hyper-V CPUID feature leaves from X86CPU data
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 01/21] i386: keep hyperv_vendor string up-to-date Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 02/21] i386: invert hyperv_spinlock_attempts setting logic with hv_passthrough Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 04/21] i386: stop using env->features[] for filling Hyper-V CPUIDs Vitaly Kuznetsov
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

We have all the required data in X86CPU already and as we are about to
split hyperv_handle_properties() into hyperv_expand_features()/
hyperv_fill_cpuids() we can remove the blind copy. The functional change
is that QEMU won't pass CPUID leaves it doesn't currently know about
to the guest but arguably this is a good change.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm/kvm.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 211efbd13b49..ba285a364792 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1206,9 +1206,6 @@ static int hyperv_handle_properties(CPUState *cs,
     }
 
     if (cpu->hyperv_passthrough) {
-        memcpy(cpuid_ent, &cpuid->entries[0],
-               cpuid->nent * sizeof(cpuid->entries[0]));
-
         c = cpuid_find_entry(cpuid, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, 0);
         if (c) {
             cpu->hyperv_vendor_id[0] = c->ebx;
@@ -1307,12 +1304,6 @@ static int hyperv_handle_properties(CPUState *cs,
         goto free;
     }
 
-    if (cpu->hyperv_passthrough) {
-        /* We already copied all feature words from KVM as is */
-        r = cpuid->nent;
-        goto free;
-    }
-
     c = &cpuid_ent[cpuid_i++];
     c->function = HV_CPUID_VENDOR_AND_MAX_FUNCTIONS;
     c->eax = hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS) ?
-- 
2.29.2




* [PATCH v4 04/21] i386: stop using env->features[] for filling Hyper-V CPUIDs
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (2 preceding siblings ...)
  2021-02-10 16:40 ` [PATCH v4 03/21] i386: always fill Hyper-V CPUID feature leaves from X86CPU data Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 05/21] i386: introduce hyperv_feature_supported() Vitaly Kuznetsov
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

As a preparatory step towards dropping Hyper-V CPUID leaves from
feature_word_info[], stop using env->features[] as temporary
storage for Hyper-V CPUIDs. Instead, build Hyper-V CPUID leaves
directly from kvm_hyperv_properties[] data.
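
For illustration only (not part of the patch): a standalone sketch of the
idea of building a leaf value by OR-ing together the bits of every enabled
feature that targets that leaf. The table contents, the feature word names
and build_leaf() are hypothetical and the bit values are made up; only the
looping pattern mirrors what the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature words; 0 is reserved to mean "slot unused". */
enum { FW_NONE = 0, FW_FEATURES_EAX, FW_FEATURES_EDX, FW_RECOMM_EAX };

struct hv_flag {
    uint32_t fw;    /* which feature word the bits belong to */
    uint32_t bits;  /* bits to set in that word */
};

struct hv_property {
    const char *desc;
    struct hv_flag flags[2];
};

/* Illustrative table only; the bit values are invented for the example. */
static const struct hv_property hv_props[] = {
    { "relaxed timing",   { { FW_RECOMM_EAX, 1u << 5 }, { FW_NONE, 0 } } },
    { "reference time",   { { FW_FEATURES_EAX, (1u << 1) | (1u << 9) },
                            { FW_NONE, 0 } } },
    { "crash MSRs",       { { FW_FEATURES_EDX, 1u << 10 }, { FW_NONE, 0 } } },
    { "two-word feature", { { FW_FEATURES_EAX, 1u << 6 },
                            { FW_RECOMM_EAX, 1u << 3 } } },
};
#define HV_NPROPS (sizeof(hv_props) / sizeof(hv_props[0]))

/*
 * Build the value of feature word 'fw' from the set of enabled features.
 * Bit i of 'enabled' corresponds to hv_props[i].
 */
static uint32_t build_leaf(uint64_t enabled, uint32_t fw)
{
    uint32_t r = 0;
    size_t i, j;

    assert(fw != FW_NONE);

    for (i = 0; i < HV_NPROPS; i++) {
        if (!(enabled & (1ull << i))) {
            continue;
        }
        for (j = 0; j < 2; j++) {
            if (hv_props[i].flags[j].fw == fw) {
                r |= hv_props[i].flags[j].bits;
            }
        }
    }

    return r;
}
```

Since the leaf is recomputed from the enabled-feature bitmap on demand,
there is no second copy of the state that could go out of sync.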

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/cpu.h     |  1 +
 target/i386/kvm/kvm.c | 80 +++++++++++++++++++++++--------------------
 2 files changed, 43 insertions(+), 38 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 8d599bb5b8f7..21a1758b4b1a 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1678,6 +1678,7 @@ struct X86CPU {
     uint32_t hyperv_interface_id[4];
     uint32_t hyperv_version_id[4];
     uint32_t hyperv_limits[3];
+    uint32_t hyperv_nested[4];
 
     bool check_cpuid;
     bool enforce_cpuid;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index ba285a364792..2734b01c95e1 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1110,7 +1110,6 @@ static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
                                   int feature)
 {
     X86CPU *cpu = X86_CPU(cs);
-    CPUX86State *env = &cpu->env;
     uint32_t r, fw, bits;
     uint64_t deps;
     int i, dep_feat;
@@ -1150,8 +1149,6 @@ static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
                 return 0;
             }
         }
-
-        env->features[fw] |= bits;
     }
 
     if (cpu->hyperv_passthrough) {
@@ -1161,6 +1158,29 @@ static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
     return 0;
 }
 
+static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t fw)
+{
+    X86CPU *cpu = X86_CPU(cs);
+    uint32_t r = 0;
+    int i, j;
+
+    for (i = 0; i < ARRAY_SIZE(kvm_hyperv_properties); i++) {
+        if (!hyperv_feat_enabled(cpu, i)) {
+            continue;
+        }
+
+        for (j = 0; j < ARRAY_SIZE(kvm_hyperv_properties[i].flags); j++) {
+            if (kvm_hyperv_properties[i].flags[j].fw != fw) {
+                continue;
+            }
+
+            r |= kvm_hyperv_properties[i].flags[j].bits;
+        }
+    }
+
+    return r;
+}
+
 /*
  * Fill in Hyper-V CPUIDs. Returns the number of entries filled in cpuid_ent in
  * case of success, errno < 0 in case of failure and 0 when no Hyper-V
@@ -1170,9 +1190,8 @@ static int hyperv_handle_properties(CPUState *cs,
                                     struct kvm_cpuid_entry2 *cpuid_ent)
 {
     X86CPU *cpu = X86_CPU(cs);
-    CPUX86State *env = &cpu->env;
     struct kvm_cpuid2 *cpuid;
-    struct kvm_cpuid_entry2 *c;
+    struct kvm_cpuid_entry2 *c, *c2;
     uint32_t cpuid_i = 0;
     int r;
 
@@ -1193,9 +1212,7 @@ static int hyperv_handle_properties(CPUState *cs,
         }
 
         if (!r) {
-            env->features[FEAT_HV_RECOMM_EAX] |=
-                HV_ENLIGHTENED_VMCS_RECOMMENDED;
-            env->features[FEAT_HV_NESTED_EAX] = evmcs_version;
+            cpu->hyperv_nested[0] = evmcs_version;
         }
     }
 
@@ -1233,13 +1250,6 @@ static int hyperv_handle_properties(CPUState *cs,
             cpu->hyperv_version_id[3] = c->edx;
         }
 
-        c = cpuid_find_entry(cpuid, HV_CPUID_FEATURES, 0);
-        if (c) {
-            env->features[FEAT_HYPERV_EAX] = c->eax;
-            env->features[FEAT_HYPERV_EBX] = c->ebx;
-            env->features[FEAT_HYPERV_EDX] = c->edx;
-        }
-
         c = cpuid_find_entry(cpuid, HV_CPUID_IMPLEMENT_LIMITS, 0);
         if (c) {
             cpu->hv_max_vps = c->eax;
@@ -1250,23 +1260,8 @@ static int hyperv_handle_properties(CPUState *cs,
 
         c = cpuid_find_entry(cpuid, HV_CPUID_ENLIGHTMENT_INFO, 0);
         if (c) {
-            env->features[FEAT_HV_RECOMM_EAX] = c->eax;
             cpu->hyperv_spinlock_attempts = c->ebx;
         }
-        c = cpuid_find_entry(cpuid, HV_CPUID_NESTED_FEATURES, 0);
-        if (c) {
-            env->features[FEAT_HV_NESTED_EAX] = c->eax;
-        }
-    }
-
-    if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_ON) {
-        env->features[FEAT_HV_RECOMM_EAX] |= HV_NO_NONARCH_CORESHARING;
-    } else if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_AUTO) {
-        c = cpuid_find_entry(cpuid, HV_CPUID_ENLIGHTMENT_INFO, 0);
-        if (c) {
-            env->features[FEAT_HV_RECOMM_EAX] |=
-                c->eax & HV_NO_NONARCH_CORESHARING;
-        }
     }
 
     /* Features */
@@ -1296,9 +1291,6 @@ static int hyperv_handle_properties(CPUState *cs,
         r |= 1;
     }
 
-    /* Not exposed by KVM but needed to make CPU hotplug in Windows work */
-    env->features[FEAT_HYPERV_EDX] |= HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE;
-
     if (r) {
         r = -ENOSYS;
         goto free;
@@ -1328,15 +1320,27 @@ static int hyperv_handle_properties(CPUState *cs,
 
     c = &cpuid_ent[cpuid_i++];
     c->function = HV_CPUID_FEATURES;
-    c->eax = env->features[FEAT_HYPERV_EAX];
-    c->ebx = env->features[FEAT_HYPERV_EBX];
-    c->edx = env->features[FEAT_HYPERV_EDX];
+    c->eax = hv_build_cpuid_leaf(cs, FEAT_HYPERV_EAX);
+    c->ebx = hv_build_cpuid_leaf(cs, FEAT_HYPERV_EBX);
+    c->edx = hv_build_cpuid_leaf(cs, FEAT_HYPERV_EDX);
+
+    /* Not exposed by KVM but needed to make CPU hotplug in Windows work */
+    c->edx |= HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE;
 
     c = &cpuid_ent[cpuid_i++];
     c->function = HV_CPUID_ENLIGHTMENT_INFO;
-    c->eax = env->features[FEAT_HV_RECOMM_EAX];
+    c->eax = hv_build_cpuid_leaf(cs, FEAT_HV_RECOMM_EAX);
     c->ebx = cpu->hyperv_spinlock_attempts;
 
+    if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_ON) {
+        c->eax |= HV_NO_NONARCH_CORESHARING;
+    } else if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_AUTO) {
+        c2 = cpuid_find_entry(cpuid, HV_CPUID_ENLIGHTMENT_INFO, 0);
+        if (c2) {
+            c->eax |= c2->eax & HV_NO_NONARCH_CORESHARING;
+        }
+    }
+
     c = &cpuid_ent[cpuid_i++];
     c->function = HV_CPUID_IMPLEMENT_LIMITS;
     c->eax = cpu->hv_max_vps;
@@ -1356,7 +1360,7 @@ static int hyperv_handle_properties(CPUState *cs,
 
         c = &cpuid_ent[cpuid_i++];
         c->function = HV_CPUID_NESTED_FEATURES;
-        c->eax = env->features[FEAT_HV_NESTED_EAX];
+        c->eax = cpu->hyperv_nested[0];
     }
     r = cpuid_i;
 
-- 
2.29.2




* [PATCH v4 05/21] i386: introduce hyperv_feature_supported()
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (3 preceding siblings ...)
  2021-02-10 16:40 ` [PATCH v4 04/21] i386: stop using env->features[] for filling Hyper-V CPUIDs Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 06/21] i386: introduce hv_cpuid_get_host() Vitaly Kuznetsov
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

Clean up hv_cpuid_check_and_set() by splitting hyperv_feature_supported()
out of it. No functional change intended.
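
For illustration only (not part of the patch): a standalone sketch of the
"all required bits must be present" check that the new helper performs.
The types, the feature word numbering and feature_supported() are
hypothetical; here the host values are passed as a plain array rather than
being looked up in a KVM CPUID structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature words; 0 is reserved to mean "slot unused". */
enum { FW_NONE = 0, FW_A, FW_B, FW_COUNT };

struct hv_flag {
    uint32_t fw;    /* feature word the bits live in */
    uint32_t bits;  /* bits that must all be set by the host */
};

/*
 * Return true only if every used flag slot is fully covered by the host
 * value of the corresponding feature word.
 */
static bool feature_supported(const struct hv_flag *flags, size_t nflags,
                              const uint32_t host[FW_COUNT])
{
    size_t i;

    assert(flags != NULL && host != NULL);

    for (i = 0; i < nflags; i++) {
        if (flags[i].fw == FW_NONE) {
            continue;           /* unused slot */
        }
        if (flags[i].fw >= FW_COUNT) {
            return false;       /* unknown word: treat as unsupported */
        }
        if ((host[flags[i].fw] & flags[i].bits) != flags[i].bits) {
            return false;       /* at least one required bit is missing */
        }
    }

    return true;
}
```

Comparing against 'bits' rather than testing for non-zero matters for
features that need several bits at once: a partial match is a failure.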

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm/kvm.c | 49 ++++++++++++++++++++++++++-----------------
 1 file changed, 30 insertions(+), 19 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 2734b01c95e1..dd80b46bc604 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1106,13 +1106,33 @@ static int hv_cpuid_get_fw(struct kvm_cpuid2 *cpuid, int fw, uint32_t *r)
     return 0;
 }
 
+static bool hyperv_feature_supported(struct kvm_cpuid2 *cpuid, int feature)
+{
+    uint32_t r, fw, bits;
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(kvm_hyperv_properties[feature].flags); i++) {
+        fw = kvm_hyperv_properties[feature].flags[i].fw;
+        bits = kvm_hyperv_properties[feature].flags[i].bits;
+
+        if (!fw) {
+            continue;
+        }
+
+        if (hv_cpuid_get_fw(cpuid, fw, &r) || (r & bits) != bits) {
+            return false;
+        }
+    }
+
+    return true;
+}
+
 static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
                                   int feature)
 {
     X86CPU *cpu = X86_CPU(cs);
-    uint32_t r, fw, bits;
     uint64_t deps;
-    int i, dep_feat;
+    int dep_feat;
 
     if (!hyperv_feat_enabled(cpu, feature) && !cpu->hyperv_passthrough) {
         return 0;
@@ -1131,23 +1151,14 @@ static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
         deps &= ~(1ull << dep_feat);
     }
 
-    for (i = 0; i < ARRAY_SIZE(kvm_hyperv_properties[feature].flags); i++) {
-        fw = kvm_hyperv_properties[feature].flags[i].fw;
-        bits = kvm_hyperv_properties[feature].flags[i].bits;
-
-        if (!fw) {
-            continue;
-        }
-
-        if (hv_cpuid_get_fw(cpuid, fw, &r) || (r & bits) != bits) {
-            if (hyperv_feat_enabled(cpu, feature)) {
-                fprintf(stderr,
-                        "Hyper-V %s is not supported by kernel\n",
-                        kvm_hyperv_properties[feature].desc);
-                return 1;
-            } else {
-                return 0;
-            }
+    if (!hyperv_feature_supported(cpuid, feature)) {
+        if (hyperv_feat_enabled(cpu, feature)) {
+            fprintf(stderr,
+                    "Hyper-V %s is not supported by kernel\n",
+                    kvm_hyperv_properties[feature].desc);
+            return 1;
+        } else {
+            return 0;
         }
     }
 
-- 
2.29.2




* [PATCH v4 06/21] i386: introduce hv_cpuid_get_host()
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (4 preceding siblings ...)
  2021-02-10 16:40 ` [PATCH v4 05/21] i386: introduce hyperv_feature_supported() Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 07/21] i386: drop FEAT_HYPERV feature leaves Vitaly Kuznetsov
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

In preparation for implementing hv_cpuid_cache, introduce
hv_cpuid_get_host(). No functional change intended.
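
For illustration only (not part of the patch): a standalone sketch of the
accessor pattern, where a missing CPUID entry simply reads as 0 so callers
need no NULL checks. struct hv_cpuid_entry, enum hv_reg and hv_get_host()
are hypothetical stand-ins for the KVM structures and helpers used in the
diff below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register selector, used as an index into regs[]. */
enum hv_reg { HV_REG_EAX, HV_REG_EBX, HV_REG_ECX, HV_REG_EDX };

/* Hypothetical, simplified stand-in for a CPUID entry. */
struct hv_cpuid_entry {
    uint32_t function;
    uint32_t regs[4];
};

/*
 * Return the requested register of the entry for 'func', or 0 when the
 * host does not report such an entry.
 */
static uint32_t hv_get_host(const struct hv_cpuid_entry *entries, size_t n,
                            uint32_t func, enum hv_reg reg)
{
    size_t i;

    assert(entries != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (entries[i].function == func) {
            return entries[i].regs[reg];
        }
    }

    return 0;
}
```

The trade-off is that an absent leaf and a leaf whose register is genuinely
zero become indistinguishable, which is acceptable when zero means "no
features".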

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm/kvm.c | 100 +++++++++++++++++++++++-------------------
 1 file changed, 56 insertions(+), 44 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index dd80b46bc604..762c4f893467 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1106,6 +1106,19 @@ static int hv_cpuid_get_fw(struct kvm_cpuid2 *cpuid, int fw, uint32_t *r)
     return 0;
 }
 
+static uint32_t hv_cpuid_get_host(struct kvm_cpuid2 *cpuid, uint32_t func,
+                                  int reg)
+{
+    struct kvm_cpuid_entry2 *entry;
+
+    entry = cpuid_find_entry(cpuid, func, 0);
+    if (!entry) {
+        return 0;
+    }
+
+    return cpuid_entry_get_reg(entry, reg);
+}
+
 static bool hyperv_feature_supported(struct kvm_cpuid2 *cpuid, int feature)
 {
     uint32_t r, fw, bits;
@@ -1202,7 +1215,7 @@ static int hyperv_handle_properties(CPUState *cs,
 {
     X86CPU *cpu = X86_CPU(cs);
     struct kvm_cpuid2 *cpuid;
-    struct kvm_cpuid_entry2 *c, *c2;
+    struct kvm_cpuid_entry2 *c;
     uint32_t cpuid_i = 0;
     int r;
 
@@ -1234,45 +1247,46 @@ static int hyperv_handle_properties(CPUState *cs,
     }
 
     if (cpu->hyperv_passthrough) {
-        c = cpuid_find_entry(cpuid, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, 0);
-        if (c) {
-            cpu->hyperv_vendor_id[0] = c->ebx;
-            cpu->hyperv_vendor_id[1] = c->ecx;
-            cpu->hyperv_vendor_id[2] = c->edx;
-            cpu->hyperv_vendor = g_realloc(cpu->hyperv_vendor,
-                                           sizeof(cpu->hyperv_vendor_id) + 1);
-            memcpy(cpu->hyperv_vendor, cpu->hyperv_vendor_id,
-                   sizeof(cpu->hyperv_vendor_id));
-        }
-
-        c = cpuid_find_entry(cpuid, HV_CPUID_INTERFACE, 0);
-        if (c) {
-            cpu->hyperv_interface_id[0] = c->eax;
-            cpu->hyperv_interface_id[1] = c->ebx;
-            cpu->hyperv_interface_id[2] = c->ecx;
-            cpu->hyperv_interface_id[3] = c->edx;
-        }
-
-        c = cpuid_find_entry(cpuid, HV_CPUID_VERSION, 0);
-        if (c) {
-            cpu->hyperv_version_id[0] = c->eax;
-            cpu->hyperv_version_id[1] = c->ebx;
-            cpu->hyperv_version_id[2] = c->ecx;
-            cpu->hyperv_version_id[3] = c->edx;
-        }
-
-        c = cpuid_find_entry(cpuid, HV_CPUID_IMPLEMENT_LIMITS, 0);
-        if (c) {
-            cpu->hv_max_vps = c->eax;
-            cpu->hyperv_limits[0] = c->ebx;
-            cpu->hyperv_limits[1] = c->ecx;
-            cpu->hyperv_limits[2] = c->edx;
-        }
-
-        c = cpuid_find_entry(cpuid, HV_CPUID_ENLIGHTMENT_INFO, 0);
-        if (c) {
-            cpu->hyperv_spinlock_attempts = c->ebx;
-        }
+        cpu->hyperv_vendor_id[0] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_EBX);
+        cpu->hyperv_vendor_id[1] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_ECX);
+        cpu->hyperv_vendor_id[2] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_EDX);
+        cpu->hyperv_vendor = g_realloc(cpu->hyperv_vendor,
+                                       sizeof(cpu->hyperv_vendor_id) + 1);
+        memcpy(cpu->hyperv_vendor, cpu->hyperv_vendor_id,
+               sizeof(cpu->hyperv_vendor_id));
+
+        cpu->hyperv_interface_id[0] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_INTERFACE, R_EAX);
+        cpu->hyperv_interface_id[1] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_INTERFACE, R_EBX);
+        cpu->hyperv_interface_id[2] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_INTERFACE, R_ECX);
+        cpu->hyperv_interface_id[3] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_INTERFACE, R_EDX);
+
+        cpu->hyperv_version_id[0] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_VERSION, R_EAX);
+        cpu->hyperv_version_id[1] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_VERSION, R_EBX);
+        cpu->hyperv_version_id[2] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_VERSION, R_ECX);
+        cpu->hyperv_version_id[3] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_VERSION, R_EDX);
+
+        cpu->hv_max_vps = hv_cpuid_get_host(cpuid, HV_CPUID_IMPLEMENT_LIMITS,
+                                            R_EAX);
+        cpu->hyperv_limits[0] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_IMPLEMENT_LIMITS, R_EBX);
+        cpu->hyperv_limits[1] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_IMPLEMENT_LIMITS, R_ECX);
+        cpu->hyperv_limits[2] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_IMPLEMENT_LIMITS, R_EDX);
+
+        cpu->hyperv_spinlock_attempts =
+            hv_cpuid_get_host(cpuid, HV_CPUID_ENLIGHTMENT_INFO, R_EBX);
     }
 
     /* Features */
@@ -1346,10 +1360,8 @@ static int hyperv_handle_properties(CPUState *cs,
     if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_ON) {
         c->eax |= HV_NO_NONARCH_CORESHARING;
     } else if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_AUTO) {
-        c2 = cpuid_find_entry(cpuid, HV_CPUID_ENLIGHTMENT_INFO, 0);
-        if (c2) {
-            c->eax |= c2->eax & HV_NO_NONARCH_CORESHARING;
-        }
+        c->eax |= hv_cpuid_get_host(cpuid, HV_CPUID_ENLIGHTMENT_INFO, R_EAX) &
+            HV_NO_NONARCH_CORESHARING;
     }
 
     c = &cpuid_ent[cpuid_i++];
-- 
2.29.2




* [PATCH v4 07/21] i386: drop FEAT_HYPERV feature leaves
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (5 preceding siblings ...)
  2021-02-10 16:40 ` [PATCH v4 06/21] i386: introduce hv_cpuid_get_host() Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 08/21] i386: introduce hv_cpuid_cache Vitaly Kuznetsov
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

Hyper-V feature leaves are weird. We have some of them in the
feature_word_info[] array but we don't use the feature_word_info
magic to enable them. Neither do we use the feature_dependencies[]
mechanism to validate the configuration, as it doesn't align
well with Hyper-V's many-to-many dependency chains. Some of
the feature leaves hold not only feature bits, but also values.
E.g. FEAT_HV_NESTED_EAX contains both features and the supported
Enlightened VMCS range.

Hyper-V features are already represented in 'struct X86CPU' with
uint64_t hyperv_features, so duplicating them in env->features adds
little (or zero) benefit. The other half of Hyper-V emulation features
is also stored with values in hyperv_vendor_id[], hyperv_limits[],...
so env->features[] is already incomplete.

Remove Hyper-V feature leaves from env->features[] completely.
kvm_hyperv_properties[] is converted to use raw CPUID func/reg
pairs for features, which allows us to get rid of the
hv_cpuid_get_fw() conversion.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/cpu.c     |  90 +----------------------------------
 target/i386/cpu.h     |   5 --
 target/i386/kvm/kvm.c | 108 ++++++++++++++----------------------------
 3 files changed, 37 insertions(+), 166 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index d03c1588ba0e..f0f826997ba0 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -832,94 +832,6 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
          */
         .no_autoenable_flags = ~0U,
     },
-    /*
-     * .feat_names are commented out for Hyper-V enlightenments because we
-     * don't want to have two different ways for enabling them on QEMU command
-     * line. Some features (e.g. "hyperv_time", "hyperv_vapic", ...) require
-     * enabling several feature bits simultaneously, exposing these bits
-     * individually may just confuse guests.
-     */
-    [FEAT_HYPERV_EAX] = {
-        .type = CPUID_FEATURE_WORD,
-        .feat_names = {
-            NULL /* hv_msr_vp_runtime_access */, NULL /* hv_msr_time_refcount_access */,
-            NULL /* hv_msr_synic_access */, NULL /* hv_msr_stimer_access */,
-            NULL /* hv_msr_apic_access */, NULL /* hv_msr_hypercall_access */,
-            NULL /* hv_vpindex_access */, NULL /* hv_msr_reset_access */,
-            NULL /* hv_msr_stats_access */, NULL /* hv_reftsc_access */,
-            NULL /* hv_msr_idle_access */, NULL /* hv_msr_frequency_access */,
-            NULL /* hv_msr_debug_access */, NULL /* hv_msr_reenlightenment_access */,
-            NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-        },
-        .cpuid = { .eax = 0x40000003, .reg = R_EAX, },
-    },
-    [FEAT_HYPERV_EBX] = {
-        .type = CPUID_FEATURE_WORD,
-        .feat_names = {
-            NULL /* hv_create_partitions */, NULL /* hv_access_partition_id */,
-            NULL /* hv_access_memory_pool */, NULL /* hv_adjust_message_buffers */,
-            NULL /* hv_post_messages */, NULL /* hv_signal_events */,
-            NULL /* hv_create_port */, NULL /* hv_connect_port */,
-            NULL /* hv_access_stats */, NULL, NULL, NULL /* hv_debugging */,
-            NULL /* hv_cpu_power_management */, NULL /* hv_configure_profiler */,
-            NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-        },
-        .cpuid = { .eax = 0x40000003, .reg = R_EBX, },
-    },
-    [FEAT_HYPERV_EDX] = {
-        .type = CPUID_FEATURE_WORD,
-        .feat_names = {
-            NULL /* hv_mwait */, NULL /* hv_guest_debugging */,
-            NULL /* hv_perf_monitor */, NULL /* hv_cpu_dynamic_part */,
-            NULL /* hv_hypercall_params_xmm */, NULL /* hv_guest_idle_state */,
-            NULL, NULL,
-            NULL, NULL, NULL /* hv_guest_crash_msr */, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-        },
-        .cpuid = { .eax = 0x40000003, .reg = R_EDX, },
-    },
-    [FEAT_HV_RECOMM_EAX] = {
-        .type = CPUID_FEATURE_WORD,
-        .feat_names = {
-            NULL /* hv_recommend_pv_as_switch */,
-            NULL /* hv_recommend_pv_tlbflush_local */,
-            NULL /* hv_recommend_pv_tlbflush_remote */,
-            NULL /* hv_recommend_msr_apic_access */,
-            NULL /* hv_recommend_msr_reset */,
-            NULL /* hv_recommend_relaxed_timing */,
-            NULL /* hv_recommend_dma_remapping */,
-            NULL /* hv_recommend_int_remapping */,
-            NULL /* hv_recommend_x2apic_msrs */,
-            NULL /* hv_recommend_autoeoi_deprecation */,
-            NULL /* hv_recommend_pv_ipi */,
-            NULL /* hv_recommend_ex_hypercalls */,
-            NULL /* hv_hypervisor_is_nested */,
-            NULL /* hv_recommend_int_mbec */,
-            NULL /* hv_recommend_evmcs */,
-            NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-        },
-        .cpuid = { .eax = 0x40000004, .reg = R_EAX, },
-    },
-    [FEAT_HV_NESTED_EAX] = {
-        .type = CPUID_FEATURE_WORD,
-        .cpuid = { .eax = 0x4000000A, .reg = R_EAX, },
-    },
     [FEAT_SVM] = {
         .type = CPUID_FEATURE_WORD,
         .feat_names = {
@@ -6951,7 +6863,7 @@ static GuestPanicInformation *x86_cpu_get_crash_info(CPUState *cs)
     CPUX86State *env = &cpu->env;
     GuestPanicInformation *panic_info = NULL;
 
-    if (env->features[FEAT_HYPERV_EDX] & HV_GUEST_CRASH_MSR_AVAILABLE) {
+    if (hyperv_feat_enabled(cpu, HYPERV_FEAT_CRASH)) {
         panic_info = g_malloc0(sizeof(GuestPanicInformation));
 
         panic_info->type = GUEST_PANIC_INFORMATION_TYPE_HYPER_V;
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 21a1758b4b1a..7ea14822aab5 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -518,11 +518,6 @@ typedef enum FeatureWord {
     FEAT_C000_0001_EDX, /* CPUID[C000_0001].EDX */
     FEAT_KVM,           /* CPUID[4000_0001].EAX (KVM_CPUID_FEATURES) */
     FEAT_KVM_HINTS,     /* CPUID[4000_0001].EDX */
-    FEAT_HYPERV_EAX,    /* CPUID[4000_0003].EAX */
-    FEAT_HYPERV_EBX,    /* CPUID[4000_0003].EBX */
-    FEAT_HYPERV_EDX,    /* CPUID[4000_0003].EDX */
-    FEAT_HV_RECOMM_EAX, /* CPUID[4000_0004].EAX */
-    FEAT_HV_NESTED_EAX, /* CPUID[4000_000A].EAX */
     FEAT_SVM,           /* CPUID[8000_000A].EDX */
     FEAT_XSAVE,         /* CPUID[EAX=0xd,ECX=1].EAX */
     FEAT_6_EAX,         /* CPUID[6].EAX */
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 762c4f893467..d2c524376342 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -799,7 +799,8 @@ static bool tsc_is_stable_and_known(CPUX86State *env)
 static struct {
     const char *desc;
     struct {
-        uint32_t fw;
+        uint32_t func;
+        int reg;
         uint32_t bits;
     } flags[2];
     uint64_t dependencies;
@@ -807,25 +808,25 @@ static struct {
     [HYPERV_FEAT_RELAXED] = {
         .desc = "relaxed timing (hv-relaxed)",
         .flags = {
-            {.fw = FEAT_HYPERV_EAX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EAX,
              .bits = HV_HYPERCALL_AVAILABLE},
-            {.fw = FEAT_HV_RECOMM_EAX,
+            {.func = HV_CPUID_ENLIGHTMENT_INFO, .reg = R_EAX,
              .bits = HV_RELAXED_TIMING_RECOMMENDED}
         }
     },
     [HYPERV_FEAT_VAPIC] = {
         .desc = "virtual APIC (hv-vapic)",
         .flags = {
-            {.fw = FEAT_HYPERV_EAX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EAX,
              .bits = HV_HYPERCALL_AVAILABLE | HV_APIC_ACCESS_AVAILABLE},
-            {.fw = FEAT_HV_RECOMM_EAX,
+            {.func = HV_CPUID_ENLIGHTMENT_INFO, .reg = R_EAX,
              .bits = HV_APIC_ACCESS_RECOMMENDED}
         }
     },
     [HYPERV_FEAT_TIME] = {
         .desc = "clocksources (hv-time)",
         .flags = {
-            {.fw = FEAT_HYPERV_EAX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EAX,
              .bits = HV_HYPERCALL_AVAILABLE | HV_TIME_REF_COUNT_AVAILABLE |
              HV_REFERENCE_TSC_AVAILABLE}
         }
@@ -833,42 +834,42 @@ static struct {
     [HYPERV_FEAT_CRASH] = {
         .desc = "crash MSRs (hv-crash)",
         .flags = {
-            {.fw = FEAT_HYPERV_EDX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EDX,
              .bits = HV_GUEST_CRASH_MSR_AVAILABLE}
         }
     },
     [HYPERV_FEAT_RESET] = {
         .desc = "reset MSR (hv-reset)",
         .flags = {
-            {.fw = FEAT_HYPERV_EAX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EAX,
              .bits = HV_RESET_AVAILABLE}
         }
     },
     [HYPERV_FEAT_VPINDEX] = {
         .desc = "VP_INDEX MSR (hv-vpindex)",
         .flags = {
-            {.fw = FEAT_HYPERV_EAX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EAX,
              .bits = HV_VP_INDEX_AVAILABLE}
         }
     },
     [HYPERV_FEAT_RUNTIME] = {
         .desc = "VP_RUNTIME MSR (hv-runtime)",
         .flags = {
-            {.fw = FEAT_HYPERV_EAX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EAX,
              .bits = HV_VP_RUNTIME_AVAILABLE}
         }
     },
     [HYPERV_FEAT_SYNIC] = {
         .desc = "synthetic interrupt controller (hv-synic)",
         .flags = {
-            {.fw = FEAT_HYPERV_EAX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EAX,
              .bits = HV_SYNIC_AVAILABLE}
         }
     },
     [HYPERV_FEAT_STIMER] = {
         .desc = "synthetic timers (hv-stimer)",
         .flags = {
-            {.fw = FEAT_HYPERV_EAX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EAX,
              .bits = HV_SYNTIMERS_AVAILABLE}
         },
         .dependencies = BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_TIME)
@@ -876,23 +877,23 @@ static struct {
     [HYPERV_FEAT_FREQUENCIES] = {
         .desc = "frequency MSRs (hv-frequencies)",
         .flags = {
-            {.fw = FEAT_HYPERV_EAX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EAX,
              .bits = HV_ACCESS_FREQUENCY_MSRS},
-            {.fw = FEAT_HYPERV_EDX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EDX,
              .bits = HV_FREQUENCY_MSRS_AVAILABLE}
         }
     },
     [HYPERV_FEAT_REENLIGHTENMENT] = {
         .desc = "reenlightenment MSRs (hv-reenlightenment)",
         .flags = {
-            {.fw = FEAT_HYPERV_EAX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EAX,
              .bits = HV_ACCESS_REENLIGHTENMENTS_CONTROL}
         }
     },
     [HYPERV_FEAT_TLBFLUSH] = {
         .desc = "paravirtualized TLB flush (hv-tlbflush)",
         .flags = {
-            {.fw = FEAT_HV_RECOMM_EAX,
+            {.func = HV_CPUID_ENLIGHTMENT_INFO, .reg = R_EAX,
              .bits = HV_REMOTE_TLB_FLUSH_RECOMMENDED |
              HV_EX_PROCESSOR_MASKS_RECOMMENDED}
         },
@@ -901,7 +902,7 @@ static struct {
     [HYPERV_FEAT_EVMCS] = {
         .desc = "enlightened VMCS (hv-evmcs)",
         .flags = {
-            {.fw = FEAT_HV_RECOMM_EAX,
+            {.func = HV_CPUID_ENLIGHTMENT_INFO, .reg = R_EAX,
              .bits = HV_ENLIGHTENED_VMCS_RECOMMENDED}
         },
         .dependencies = BIT(HYPERV_FEAT_VAPIC)
@@ -909,7 +910,7 @@ static struct {
     [HYPERV_FEAT_IPI] = {
         .desc = "paravirtualized IPI (hv-ipi)",
         .flags = {
-            {.fw = FEAT_HV_RECOMM_EAX,
+            {.func = HV_CPUID_ENLIGHTMENT_INFO, .reg = R_EAX,
              .bits = HV_CLUSTER_IPI_RECOMMENDED |
              HV_EX_PROCESSOR_MASKS_RECOMMENDED}
         },
@@ -918,7 +919,7 @@ static struct {
     [HYPERV_FEAT_STIMER_DIRECT] = {
         .desc = "direct mode synthetic timers (hv-stimer-direct)",
         .flags = {
-            {.fw = FEAT_HYPERV_EDX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EDX,
              .bits = HV_STIMER_DIRECT_MODE_AVAILABLE}
         },
         .dependencies = BIT(HYPERV_FEAT_STIMER)
@@ -1064,48 +1065,6 @@ static struct kvm_cpuid2 *get_supported_hv_cpuid_legacy(CPUState *cs)
     return cpuid;
 }
 
-static int hv_cpuid_get_fw(struct kvm_cpuid2 *cpuid, int fw, uint32_t *r)
-{
-    struct kvm_cpuid_entry2 *entry;
-    uint32_t func;
-    int reg;
-
-    switch (fw) {
-    case FEAT_HYPERV_EAX:
-        reg = R_EAX;
-        func = HV_CPUID_FEATURES;
-        break;
-    case FEAT_HYPERV_EDX:
-        reg = R_EDX;
-        func = HV_CPUID_FEATURES;
-        break;
-    case FEAT_HV_RECOMM_EAX:
-        reg = R_EAX;
-        func = HV_CPUID_ENLIGHTMENT_INFO;
-        break;
-    default:
-        return -EINVAL;
-    }
-
-    entry = cpuid_find_entry(cpuid, func, 0);
-    if (!entry) {
-        return -ENOENT;
-    }
-
-    switch (reg) {
-    case R_EAX:
-        *r = entry->eax;
-        break;
-    case R_EDX:
-        *r = entry->edx;
-        break;
-    default:
-        return -EINVAL;
-    }
-
-    return 0;
-}
-
 static uint32_t hv_cpuid_get_host(struct kvm_cpuid2 *cpuid, uint32_t func,
                                   int reg)
 {
@@ -1121,18 +1080,20 @@ static uint32_t hv_cpuid_get_host(struct kvm_cpuid2 *cpuid, uint32_t func,
 
 static bool hyperv_feature_supported(struct kvm_cpuid2 *cpuid, int feature)
 {
-    uint32_t r, fw, bits;
-    int i;
+    uint32_t func, bits;
+    int i, reg;
 
     for (i = 0; i < ARRAY_SIZE(kvm_hyperv_properties[feature].flags); i++) {
-        fw = kvm_hyperv_properties[feature].flags[i].fw;
+
+        func = kvm_hyperv_properties[feature].flags[i].func;
+        reg = kvm_hyperv_properties[feature].flags[i].reg;
         bits = kvm_hyperv_properties[feature].flags[i].bits;
 
-        if (!fw) {
+        if (!func) {
             continue;
         }
 
-        if (hv_cpuid_get_fw(cpuid, fw, &r) || (r & bits) != bits) {
+        if ((hv_cpuid_get_host(cpuid, func, reg) & bits) != bits) {
             return false;
         }
     }
@@ -1182,7 +1143,7 @@ static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
     return 0;
 }
 
-static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t fw)
+static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t func, int reg)
 {
     X86CPU *cpu = X86_CPU(cs);
     uint32_t r = 0;
@@ -1194,7 +1155,10 @@ static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t fw)
         }
 
         for (j = 0; j < ARRAY_SIZE(kvm_hyperv_properties[i].flags); j++) {
-            if (kvm_hyperv_properties[i].flags[j].fw != fw) {
+            if (kvm_hyperv_properties[i].flags[j].func != func) {
+                continue;
+            }
+            if (kvm_hyperv_properties[i].flags[j].reg != reg) {
                 continue;
             }
 
@@ -1345,16 +1309,16 @@ static int hyperv_handle_properties(CPUState *cs,
 
     c = &cpuid_ent[cpuid_i++];
     c->function = HV_CPUID_FEATURES;
-    c->eax = hv_build_cpuid_leaf(cs, FEAT_HYPERV_EAX);
-    c->ebx = hv_build_cpuid_leaf(cs, FEAT_HYPERV_EBX);
-    c->edx = hv_build_cpuid_leaf(cs, FEAT_HYPERV_EDX);
+    c->eax = hv_build_cpuid_leaf(cs, HV_CPUID_FEATURES, R_EAX);
+    c->ebx = hv_build_cpuid_leaf(cs, HV_CPUID_FEATURES, R_EBX);
+    c->edx = hv_build_cpuid_leaf(cs, HV_CPUID_FEATURES, R_EDX);
 
     /* Not exposed by KVM but needed to make CPU hotplug in Windows work */
     c->edx |= HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE;
 
     c = &cpuid_ent[cpuid_i++];
     c->function = HV_CPUID_ENLIGHTMENT_INFO;
-    c->eax = hv_build_cpuid_leaf(cs, FEAT_HV_RECOMM_EAX);
+    c->eax = hv_build_cpuid_leaf(cs, HV_CPUID_ENLIGHTMENT_INFO, R_EAX);
     c->ebx = cpu->hyperv_spinlock_attempts;
 
     if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_ON) {
-- 
2.29.2




* [PATCH v4 08/21] i386: introduce hv_cpuid_cache
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (6 preceding siblings ...)
  2021-02-10 16:40 ` [PATCH v4 07/21] i386: drop FEAT_HYPERV feature leaves Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 09/21] i386: split hyperv_handle_properties() into hyperv_expand_features()/hyperv_fill_cpuids() Vitaly Kuznetsov
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

Just like with cpuid_cache, it makes no sense to call
KVM_GET_SUPPORTED_HV_CPUID more than once. Instead of (ab)using
env->features[] and/or trying to keep all the code in one place, it is
better to introduce a persistent hv_cpuid_cache and a hv_cpuid_get_host()
accessor for it.

Note: hv_cpuid_get_fw() is converted to use hv_cpuid_get_host() only to
be removed later, together with the Hyper-V specific feature words.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
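For readers skimming the thread, the caching pattern described above can be
sketched as a small standalone program. This is only an illustration under
stated assumptions: struct cpuid_entry, struct cpuid_table, fetch_host_table()
and hv_get_host() are hypothetical stand-ins for kvm_cpuid_entry2, kvm_cpuid2,
the KVM_GET_SUPPORTED_HV_CPUID query and hv_cpuid_get_host(), and the register
values are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for kvm_cpuid_entry2 / kvm_cpuid2. */
enum { REG_EAX, REG_EBX, REG_ECX, REG_EDX };

struct cpuid_entry {
    uint32_t function;
    uint32_t regs[4];
};

struct cpuid_table {
    size_t nent;
    const struct cpuid_entry *entries;
};

static int fetch_count;                    /* how often the "ioctl" ran */
static const struct cpuid_table *hv_cache; /* filled on first use */

/* Pretends to be the expensive host query; the values are invented. */
static const struct cpuid_table *fetch_host_table(void)
{
    static const struct cpuid_entry entries[] = {
        { 0x40000003, { 0x00000002, 0, 0, 0x00000400 } },
        { 0x40000004, { 0x00000020, 0x0fff, 0, 0 } },
    };
    static const struct cpuid_table table = { 2, entries };

    fetch_count++;
    return &table;
}

/* Return one register of one leaf, or 0 when the leaf is absent. */
static uint32_t hv_get_host(uint32_t func, int reg)
{
    size_t i;

    assert(reg >= REG_EAX && reg <= REG_EDX);

    if (!hv_cache) {
        /* First caller pays for the query, everyone else reuses it. */
        hv_cache = fetch_host_table();
    }

    for (i = 0; i < hv_cache->nent; i++) {
        if (hv_cache->entries[i].function == func) {
            return hv_cache->entries[i].regs[reg];
        }
    }
    return 0;
}
```

Calling hv_get_host() any number of times leaves fetch_count at 1, which is
the property the patch is after: callers no longer have to pass a table
around or free it.
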
 target/i386/kvm/kvm.c | 109 ++++++++++++++++++++++--------------------
 1 file changed, 56 insertions(+), 53 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index d2c524376342..bcc5709ec467 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -127,6 +127,7 @@ static int has_exception_payload;
 static bool has_msr_mcg_ext_ctl;
 
 static struct kvm_cpuid2 *cpuid_cache;
+static struct kvm_cpuid2 *hv_cpuid_cache;
 static struct kvm_msr_list *kvm_feature_msrs;
 
 int kvm_has_pit_state2(void)
@@ -1065,10 +1066,25 @@ static struct kvm_cpuid2 *get_supported_hv_cpuid_legacy(CPUState *cs)
     return cpuid;
 }
 
-static uint32_t hv_cpuid_get_host(struct kvm_cpuid2 *cpuid, uint32_t func,
-                                  int reg)
+static uint32_t hv_cpuid_get_host(CPUState *cs, uint32_t func, int reg)
 {
     struct kvm_cpuid_entry2 *entry;
+    struct kvm_cpuid2 *cpuid;
+
+    if (hv_cpuid_cache) {
+        cpuid = hv_cpuid_cache;
+    } else {
+        if (kvm_check_extension(kvm_state, KVM_CAP_HYPERV_CPUID) > 0) {
+            cpuid = get_supported_hv_cpuid(cs);
+        } else {
+            cpuid = get_supported_hv_cpuid_legacy(cs);
+        }
+        hv_cpuid_cache = cpuid;
+    }
+
+    if (!cpuid) {
+        return 0;
+    }
 
     entry = cpuid_find_entry(cpuid, func, 0);
     if (!entry) {
@@ -1078,7 +1094,7 @@ static uint32_t hv_cpuid_get_host(struct kvm_cpuid2 *cpuid, uint32_t func,
     return cpuid_entry_get_reg(entry, reg);
 }
 
-static bool hyperv_feature_supported(struct kvm_cpuid2 *cpuid, int feature)
+static bool hyperv_feature_supported(CPUState *cs, int feature)
 {
     uint32_t func, bits;
     int i, reg;
@@ -1093,7 +1109,7 @@ static bool hyperv_feature_supported(struct kvm_cpuid2 *cpuid, int feature)
             continue;
         }
 
-        if ((hv_cpuid_get_host(cpuid, func, reg) & bits) != bits) {
+        if ((hv_cpuid_get_host(cs, func, reg) & bits) != bits) {
             return false;
         }
     }
@@ -1101,8 +1117,7 @@ static bool hyperv_feature_supported(struct kvm_cpuid2 *cpuid, int feature)
     return true;
 }
 
-static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
-                                  int feature)
+static int hv_cpuid_check_and_set(CPUState *cs, int feature)
 {
     X86CPU *cpu = X86_CPU(cs);
     uint64_t deps;
@@ -1125,7 +1140,7 @@ static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
         deps &= ~(1ull << dep_feat);
     }
 
-    if (!hyperv_feature_supported(cpuid, feature)) {
+    if (!hyperv_feature_supported(cs, feature)) {
         if (hyperv_feat_enabled(cpu, feature)) {
             fprintf(stderr,
                     "Hyper-V %s is not supported by kernel\n",
@@ -1178,7 +1193,6 @@ static int hyperv_handle_properties(CPUState *cs,
                                     struct kvm_cpuid_entry2 *cpuid_ent)
 {
     X86CPU *cpu = X86_CPU(cs);
-    struct kvm_cpuid2 *cpuid;
     struct kvm_cpuid_entry2 *c;
     uint32_t cpuid_i = 0;
     int r;
@@ -1204,71 +1218,65 @@ static int hyperv_handle_properties(CPUState *cs,
         }
     }
 
-    if (kvm_check_extension(cs->kvm_state, KVM_CAP_HYPERV_CPUID) > 0) {
-        cpuid = get_supported_hv_cpuid(cs);
-    } else {
-        cpuid = get_supported_hv_cpuid_legacy(cs);
-    }
-
     if (cpu->hyperv_passthrough) {
         cpu->hyperv_vendor_id[0] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_EBX);
+            hv_cpuid_get_host(cs, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_EBX);
         cpu->hyperv_vendor_id[1] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_ECX);
+            hv_cpuid_get_host(cs, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_ECX);
         cpu->hyperv_vendor_id[2] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_EDX);
+            hv_cpuid_get_host(cs, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_EDX);
         cpu->hyperv_vendor = g_realloc(cpu->hyperv_vendor,
                                        sizeof(cpu->hyperv_vendor_id) + 1);
         memcpy(cpu->hyperv_vendor, cpu->hyperv_vendor_id,
                sizeof(cpu->hyperv_vendor_id));
 
         cpu->hyperv_interface_id[0] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_INTERFACE, R_EAX);
+            hv_cpuid_get_host(cs, HV_CPUID_INTERFACE, R_EAX);
         cpu->hyperv_interface_id[1] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_INTERFACE, R_EBX);
+            hv_cpuid_get_host(cs, HV_CPUID_INTERFACE, R_EBX);
         cpu->hyperv_interface_id[2] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_INTERFACE, R_ECX);
+            hv_cpuid_get_host(cs, HV_CPUID_INTERFACE, R_ECX);
         cpu->hyperv_interface_id[3] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_INTERFACE, R_EDX);
+            hv_cpuid_get_host(cs, HV_CPUID_INTERFACE, R_EDX);
 
         cpu->hyperv_version_id[0] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_VERSION, R_EAX);
+            hv_cpuid_get_host(cs, HV_CPUID_VERSION, R_EAX);
         cpu->hyperv_version_id[1] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_VERSION, R_EBX);
+            hv_cpuid_get_host(cs, HV_CPUID_VERSION, R_EBX);
         cpu->hyperv_version_id[2] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_VERSION, R_ECX);
+            hv_cpuid_get_host(cs, HV_CPUID_VERSION, R_ECX);
         cpu->hyperv_version_id[3] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_VERSION, R_EDX);
+            hv_cpuid_get_host(cs, HV_CPUID_VERSION, R_EDX);
 
-        cpu->hv_max_vps = hv_cpuid_get_host(cpuid, HV_CPUID_IMPLEMENT_LIMITS,
+        cpu->hv_max_vps = hv_cpuid_get_host(cs, HV_CPUID_IMPLEMENT_LIMITS,
                                             R_EAX);
         cpu->hyperv_limits[0] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_IMPLEMENT_LIMITS, R_EBX);
+            hv_cpuid_get_host(cs, HV_CPUID_IMPLEMENT_LIMITS, R_EBX);
         cpu->hyperv_limits[1] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_IMPLEMENT_LIMITS, R_ECX);
+            hv_cpuid_get_host(cs, HV_CPUID_IMPLEMENT_LIMITS, R_ECX);
         cpu->hyperv_limits[2] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_IMPLEMENT_LIMITS, R_EDX);
+            hv_cpuid_get_host(cs, HV_CPUID_IMPLEMENT_LIMITS, R_EDX);
 
         cpu->hyperv_spinlock_attempts =
-            hv_cpuid_get_host(cpuid, HV_CPUID_ENLIGHTMENT_INFO, R_EBX);
+            hv_cpuid_get_host(cs, HV_CPUID_ENLIGHTMENT_INFO, R_EBX);
     }
 
     /* Features */
-    r = hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_RELAXED);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_VAPIC);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_TIME);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_CRASH);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_RESET);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_VPINDEX);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_RUNTIME);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_SYNIC);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_STIMER);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_FREQUENCIES);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_REENLIGHTENMENT);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_TLBFLUSH);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_EVMCS);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_IPI);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_STIMER_DIRECT);
+    r = hv_cpuid_check_and_set(cs, HYPERV_FEAT_RELAXED);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_VAPIC);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_TIME);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_CRASH);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_RESET);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_VPINDEX);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_RUNTIME);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_SYNIC);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_STIMER);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_FREQUENCIES);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_REENLIGHTENMENT);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_TLBFLUSH);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_EVMCS);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_IPI);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_STIMER_DIRECT);
 
     /* Additional dependencies not covered by kvm_hyperv_properties[] */
     if (hyperv_feat_enabled(cpu, HYPERV_FEAT_SYNIC) &&
@@ -1281,8 +1289,7 @@ static int hyperv_handle_properties(CPUState *cs,
     }
 
     if (r) {
-        r = -ENOSYS;
-        goto free;
+        return -ENOSYS;
     }
 
     c = &cpuid_ent[cpuid_i++];
@@ -1324,7 +1331,7 @@ static int hyperv_handle_properties(CPUState *cs,
     if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_ON) {
         c->eax |= HV_NO_NONARCH_CORESHARING;
     } else if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_AUTO) {
-        c->eax |= hv_cpuid_get_host(cpuid, HV_CPUID_ENLIGHTMENT_INFO, R_EAX) &
+        c->eax |= hv_cpuid_get_host(cs, HV_CPUID_ENLIGHTMENT_INFO, R_EAX) &
             HV_NO_NONARCH_CORESHARING;
     }
 
@@ -1349,12 +1356,8 @@ static int hyperv_handle_properties(CPUState *cs,
         c->function = HV_CPUID_NESTED_FEATURES;
         c->eax = cpu->hyperv_nested[0];
     }
-    r = cpuid_i;
 
-free:
-    g_free(cpuid);
-
-    return r;
+    return cpuid_i;
 }
 
 static Error *hv_passthrough_mig_blocker;
-- 
2.29.2




* [PATCH v4 09/21] i386: split hyperv_handle_properties() into hyperv_expand_features()/hyperv_fill_cpuids()
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (7 preceding siblings ...)
  2021-02-10 16:40 ` [PATCH v4 08/21] i386: introduce hv_cpuid_cache Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 10/21] i386: move eVMCS enablement to hyperv_init_vcpu() Vitaly Kuznetsov
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

The intention is to call hyperv_expand_features() early, before vCPUs
are created, and to use the acquired data later when we set the
guest-visible CPUID data.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
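The split can be pictured with a reduced model: one function that only
validates the configuration and needs no output buffer, and one that only
emits leaves. The sketch below is an assumption-laden illustration, not the
QEMU code: struct hv_config, struct leaf, expand_features(), fill_cpuids()
and init_vcpu() are hypothetical names, and features are modelled as plain
bitmasks.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a vCPU's Hyper-V configuration. */
struct hv_config {
    uint64_t requested;   /* features asked for by the user */
    uint64_t supported;   /* features the host offers */
};

struct leaf {
    uint32_t function;
    uint32_t eax;
};

/* Phase 1: validation only, can run before any vCPU exists. */
static int expand_features(const struct hv_config *cfg)
{
    if (cfg->requested & ~cfg->supported) {
        return -ENOSYS;
    }
    return 0;
}

/* Phase 2: emit leaves; returns the number of entries written. */
static int fill_cpuids(const struct hv_config *cfg, struct leaf *out)
{
    int n = 0;

    assert(out != NULL);

    out[n].function = 0x40000003;
    out[n].eax = (uint32_t)cfg->requested;
    n++;

    return n;
}

/* Caller: a negative value is an error, otherwise the entry count. */
static int init_vcpu(const struct hv_config *cfg, struct leaf *out)
{
    int r = expand_features(cfg);

    if (r < 0) {
        return r;
    }
    if (cfg->requested == 0) {
        return 0;   /* nothing enabled, nothing to fill */
    }
    return fill_cpuids(cfg, out);
}
```

Because expand_features() touches no output buffer, it can be called from
wherever the feature set needs to be known, which is what the later patches
in the series rely on.
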
 target/i386/kvm/kvm.c | 34 ++++++++++++++++++++++++----------
 1 file changed, 24 insertions(+), 10 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index bcc5709ec467..893c9c515fbb 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1185,16 +1185,15 @@ static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t func, int reg)
 }
 
 /*
- * Fill in Hyper-V CPUIDs. Returns the number of entries filled in cpuid_ent in
- * case of success, errno < 0 in case of failure and 0 when no Hyper-V
- * extentions are enabled.
+ * Expand Hyper-V CPU features. In particular, check that all the requested
+ * features are supported by the host and the sanity of the configuration
+ * (that all the required dependencies are included). Also, this takes care
+ * of 'hv_passthrough' mode and fills the environment with all supported
+ * Hyper-V features.
  */
-static int hyperv_handle_properties(CPUState *cs,
-                                    struct kvm_cpuid_entry2 *cpuid_ent)
+static int hyperv_expand_features(CPUState *cs)
 {
     X86CPU *cpu = X86_CPU(cs);
-    struct kvm_cpuid_entry2 *c;
-    uint32_t cpuid_i = 0;
     int r;
 
     if (!hyperv_enabled(cpu))
@@ -1292,6 +1291,19 @@ static int hyperv_handle_properties(CPUState *cs,
         return -ENOSYS;
     }
 
+    return 0;
+}
+
+/*
+ * Fill in Hyper-V CPUIDs. Returns the number of entries filled in cpuid_ent.
+ */
+static int hyperv_fill_cpuids(CPUState *cs,
+                              struct kvm_cpuid_entry2 *cpuid_ent)
+{
+    X86CPU *cpu = X86_CPU(cs);
+    struct kvm_cpuid_entry2 *c;
+    uint32_t cpuid_i = 0;
+
     c = &cpuid_ent[cpuid_i++];
     c->function = HV_CPUID_VENDOR_AND_MAX_FUNCTIONS;
     c->eax = hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS) ?
@@ -1499,11 +1511,13 @@ int kvm_arch_init_vcpu(CPUState *cs)
     env->apic_bus_freq = KVM_APIC_BUS_FREQUENCY;
 
     /* Paravirtualization CPUIDs */
-    r = hyperv_handle_properties(cs, cpuid_data.entries);
+    r = hyperv_expand_features(cs);
     if (r < 0) {
         return r;
-    } else if (r > 0) {
-        cpuid_i = r;
+    }
+
+    if (hyperv_enabled(cpu)) {
+        cpuid_i = hyperv_fill_cpuids(cs, cpuid_data.entries);
         kvm_base = KVM_CPUID_SIGNATURE_NEXT;
         has_msr_hv_hypercall = true;
     }
-- 
2.29.2




* [PATCH v4 10/21] i386: move eVMCS enablement to hyperv_init_vcpu()
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (8 preceding siblings ...)
  2021-02-10 16:40 ` [PATCH v4 09/21] i386: split hyperv_handle_properties() into hyperv_expand_features()/hyperv_fill_cpuids() Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 11/21] i386: switch hyperv_expand_features() to using error_setg() Vitaly Kuznetsov
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

hyperv_expand_features() will be called before we create vCPUs, so
eVMCS enablement has to move out of it. hyperv_init_vcpu() looks like
the right place.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
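The get_supported_hv_cpuid() part of this change boils down to patching one
bit into the host table when a capability is reported. A minimal standalone
sketch of that step follows, assuming a simplified struct leaf and a
hypothetical patch_evmcs_bit() helper; the leaf number 0x40000004 and bit 14
follow the FEAT_HV_RECOMM_EAX table removed earlier in this series, where
hv_recommend_evmcs is the fifteenth entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LEAF_FEATURES           0x40000003u
#define LEAF_ENLIGHTENMENT_INFO 0x40000004u
#define EVMCS_RECOMMENDED       (1u << 14)

/* Simplified stand-in for kvm_cpuid_entry2. */
struct leaf {
    uint32_t function;
    uint32_t eax;
};

/*
 * Set the eVMCS recommendation bit in the enlightenment-info leaf when
 * the capability is present; leave the table untouched otherwise.
 */
static void patch_evmcs_bit(struct leaf *leaves, size_t n, bool cap_present)
{
    size_t i;

    assert(leaves != NULL || n == 0);

    if (!cap_present) {
        return;
    }

    for (i = 0; i < n; i++) {
        if (leaves[i].function == LEAF_ENLIGHTENMENT_INFO) {
            leaves[i].eax |= EVMCS_RECOMMENDED;
        }
    }
}
```

In the patch the cap_present argument corresponds to the result of
kvm_check_extension() for KVM_CAP_HYPERV_ENLIGHTENED_VMCS; actually enabling
the capability still happens per vCPU in hyperv_init_vcpu().
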
 target/i386/kvm/kvm.c | 60 ++++++++++++++++++++++++++-----------------
 1 file changed, 37 insertions(+), 23 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 893c9c515fbb..4cab175fa95c 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -961,6 +961,7 @@ static struct kvm_cpuid2 *get_supported_hv_cpuid(CPUState *cs)
 {
     struct kvm_cpuid2 *cpuid;
     int max = 7; /* 0x40000000..0x40000005, 0x4000000A */
+    int i;
 
     /*
      * When the buffer is too small, KVM_GET_SUPPORTED_HV_CPUID fails with
@@ -970,6 +971,22 @@ static struct kvm_cpuid2 *get_supported_hv_cpuid(CPUState *cs)
     while ((cpuid = try_get_hv_cpuid(cs, max)) == NULL) {
         max++;
     }
+
+    /*
+     * KVM_GET_SUPPORTED_HV_CPUID does not set EVMCS CPUID bit before
+     * KVM_CAP_HYPERV_ENLIGHTENED_VMCS is enabled but we want to get the
+     * information early, just check for the capability and set the bit
+     * manually.
+     */
+    if (kvm_check_extension(cs->kvm_state,
+                            KVM_CAP_HYPERV_ENLIGHTENED_VMCS) > 0) {
+        for (i = 0; i < cpuid->nent; i++) {
+            if (cpuid->entries[i].function == HV_CPUID_ENLIGHTMENT_INFO) {
+                cpuid->entries[i].eax |= HV_ENLIGHTENED_VMCS_RECOMMENDED;
+            }
+        }
+    }
+
     return cpuid;
 }
 
@@ -1199,24 +1216,6 @@ static int hyperv_expand_features(CPUState *cs)
     if (!hyperv_enabled(cpu))
         return 0;
 
-    if (hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS) ||
-        cpu->hyperv_passthrough) {
-        uint16_t evmcs_version;
-
-        r = kvm_vcpu_enable_cap(cs, KVM_CAP_HYPERV_ENLIGHTENED_VMCS, 0,
-                                (uintptr_t)&evmcs_version);
-
-        if (hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS) && r) {
-            fprintf(stderr, "Hyper-V %s is not supported by kernel\n",
-                    kvm_hyperv_properties[HYPERV_FEAT_EVMCS].desc);
-            return -ENOSYS;
-        }
-
-        if (!r) {
-            cpu->hyperv_nested[0] = evmcs_version;
-        }
-    }
-
     if (cpu->hyperv_passthrough) {
         cpu->hyperv_vendor_id[0] =
             hv_cpuid_get_host(cs, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_EBX);
@@ -1453,6 +1452,21 @@ static int hyperv_init_vcpu(X86CPU *cpu)
         }
     }
 
+    if (hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS)) {
+        uint16_t evmcs_version;
+
+        ret = kvm_vcpu_enable_cap(cs, KVM_CAP_HYPERV_ENLIGHTENED_VMCS, 0,
+                                  (uintptr_t)&evmcs_version);
+
+        if (ret < 0) {
+            fprintf(stderr, "Hyper-V %s is not supported by kernel\n",
+                    kvm_hyperv_properties[HYPERV_FEAT_EVMCS].desc);
+            return ret;
+        }
+
+        cpu->hyperv_nested[0] = evmcs_version;
+    }
+
     return 0;
 }
 
@@ -1517,6 +1531,11 @@ int kvm_arch_init_vcpu(CPUState *cs)
     }
 
     if (hyperv_enabled(cpu)) {
+        r = hyperv_init_vcpu(cpu);
+        if (r) {
+            return r;
+        }
+
         cpuid_i = hyperv_fill_cpuids(cs, cpuid_data.entries);
         kvm_base = KVM_CPUID_SIGNATURE_NEXT;
         has_msr_hv_hypercall = true;
@@ -1866,11 +1885,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
 
     kvm_init_msrs(cpu);
 
-    r = hyperv_init_vcpu(cpu);
-    if (r) {
-        goto fail;
-    }
-
     return 0;
 
  fail:
-- 
2.29.2




* [PATCH v4 11/21] i386: switch hyperv_expand_features() to using error_setg()
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (9 preceding siblings ...)
  2021-02-10 16:40 ` [PATCH v4 10/21] i386: move eVMCS enablement to hyperv_init_vcpu() Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 12/21] i386: adjust the expected KVM_GET_SUPPORTED_HV_CPUID array size Vitaly Kuznetsov
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

Use the standard error_setg() mechanism in hyperv_expand_features().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
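For readers unfamiliar with the errp convention, the shape of the conversion
can be shown with a self-contained mock. Everything named Demo/demo_ below is
hypothetical and only imitates the idea behind QEMU's Error API (the callee
stores a formatted message through a pointer supplied by the caller instead
of printing to stderr); it is not the real implementation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, much reduced stand-in for QEMU's Error object. */
typedef struct DemoError {
    char msg[128];
} DemoError;

/* Store a formatted message in *errp; a NULL errp means "don't care". */
static void demo_error_setg(DemoError **errp, const char *fmt, ...)
{
    va_list ap;
    DemoError *err;

    if (!errp) {
        return;
    }
    assert(*errp == NULL);   /* an earlier error must not be overwritten */

    err = calloc(1, sizeof(*err));
    if (!err) {
        abort();
    }

    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);

    *errp = err;
}

/* Mirrors the dependency check: report through errp, return 1 on failure. */
static int check_dependency(int have_synic, DemoError **errp)
{
    if (!have_synic) {
        demo_error_setg(errp, "Hyper-V %s requires Hyper-V %s",
                        "synthetic timers (hv-stimer)",
                        "synthetic interrupt controller (hv-synic)");
        return 1;
    }
    return 0;
}
```

The caller owns the error object and decides whether to print, propagate or
free it, which is why the trailing newline disappears from the message
strings in the diff below.
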
 target/i386/kvm/kvm.c | 101 +++++++++++++++++++++++++-----------------
 1 file changed, 61 insertions(+), 40 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 4cab175fa95c..3c1e84576184 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1134,7 +1134,7 @@ static bool hyperv_feature_supported(CPUState *cs, int feature)
     return true;
 }
 
-static int hv_cpuid_check_and_set(CPUState *cs, int feature)
+static int hv_cpuid_check_and_set(CPUState *cs, int feature, Error **errp)
 {
     X86CPU *cpu = X86_CPU(cs);
     uint64_t deps;
@@ -1148,20 +1148,18 @@ static int hv_cpuid_check_and_set(CPUState *cs, int feature)
     while (deps) {
         dep_feat = ctz64(deps);
         if (!(hyperv_feat_enabled(cpu, dep_feat))) {
-                fprintf(stderr,
-                        "Hyper-V %s requires Hyper-V %s\n",
-                        kvm_hyperv_properties[feature].desc,
-                        kvm_hyperv_properties[dep_feat].desc);
-                return 1;
+            error_setg(errp, "Hyper-V %s requires Hyper-V %s",
+                       kvm_hyperv_properties[feature].desc,
+                       kvm_hyperv_properties[dep_feat].desc);
+            return 1;
         }
         deps &= ~(1ull << dep_feat);
     }
 
     if (!hyperv_feature_supported(cs, feature)) {
         if (hyperv_feat_enabled(cpu, feature)) {
-            fprintf(stderr,
-                    "Hyper-V %s is not supported by kernel\n",
-                    kvm_hyperv_properties[feature].desc);
+            error_setg(errp, "Hyper-V %s is not supported by kernel",
+                       kvm_hyperv_properties[feature].desc);
             return 1;
         } else {
             return 0;
@@ -1208,13 +1206,12 @@ static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t func, int reg)
  * of 'hv_passthrough' mode and fills the environment with all supported
  * Hyper-V features.
  */
-static int hyperv_expand_features(CPUState *cs)
+static void hyperv_expand_features(CPUState *cs, Error **errp)
 {
     X86CPU *cpu = X86_CPU(cs);
-    int r;
 
     if (!hyperv_enabled(cpu))
-        return 0;
+        return;
 
     if (cpu->hyperv_passthrough) {
         cpu->hyperv_vendor_id[0] =
@@ -1260,37 +1257,60 @@ static int hyperv_expand_features(CPUState *cs)
     }
 
     /* Features */
-    r = hv_cpuid_check_and_set(cs, HYPERV_FEAT_RELAXED);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_VAPIC);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_TIME);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_CRASH);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_RESET);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_VPINDEX);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_RUNTIME);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_SYNIC);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_STIMER);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_FREQUENCIES);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_REENLIGHTENMENT);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_TLBFLUSH);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_EVMCS);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_IPI);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_STIMER_DIRECT);
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_RELAXED, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_VAPIC, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_TIME, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_CRASH, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_RESET, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_VPINDEX, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_RUNTIME, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_SYNIC, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_STIMER, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_FREQUENCIES, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_REENLIGHTENMENT, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_TLBFLUSH, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_EVMCS, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_IPI, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_STIMER_DIRECT, errp)) {
+        return;
+    }
 
     /* Additional dependencies not covered by kvm_hyperv_properties[] */
     if (hyperv_feat_enabled(cpu, HYPERV_FEAT_SYNIC) &&
         !cpu->hyperv_synic_kvm_only &&
         !hyperv_feat_enabled(cpu, HYPERV_FEAT_VPINDEX)) {
-        fprintf(stderr, "Hyper-V %s requires Hyper-V %s\n",
-                kvm_hyperv_properties[HYPERV_FEAT_SYNIC].desc,
-                kvm_hyperv_properties[HYPERV_FEAT_VPINDEX].desc);
-        r |= 1;
-    }
-
-    if (r) {
-        return -ENOSYS;
+        error_setg(errp, "Hyper-V %s requires Hyper-V %s",
+                   kvm_hyperv_properties[HYPERV_FEAT_SYNIC].desc,
+                   kvm_hyperv_properties[HYPERV_FEAT_VPINDEX].desc);
     }
-
-    return 0;
 }
 
 /*
@@ -1525,9 +1545,10 @@ int kvm_arch_init_vcpu(CPUState *cs)
     env->apic_bus_freq = KVM_APIC_BUS_FREQUENCY;
 
     /* Paravirtualization CPUIDs */
-    r = hyperv_expand_features(cs);
-    if (r < 0) {
-        return r;
+    hyperv_expand_features(cs, &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+        return -ENOSYS;
     }
 
     if (hyperv_enabled(cpu)) {
-- 
2.29.2




* [PATCH v4 12/21] i386: adjust the expected KVM_GET_SUPPORTED_HV_CPUID array size
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (10 preceding siblings ...)
  2021-02-10 16:40 ` [PATCH v4 11/21] i386: switch hyperv_expand_features() to using error_setg() Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 13/21] i386: prefer system KVM_GET_SUPPORTED_HV_CPUID ioctl over vCPU's one Vitaly Kuznetsov
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

SYNDBG leaves were recently (Linux-5.8) added to KVM but we haven't
updated the expected size of KVM_GET_SUPPORTED_HV_CPUID output in
KVM so we now make several tries before succeeding. Update the
default.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm/kvm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
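
Note for readers: the cost of a too-small initial guess can be seen in a small standalone sketch of the grow-and-retry loop. `fake_get_leaves()` is a hypothetical stand-in for the ioctl, and the `needed` count passed to it is an assumption made for illustration, not a value read from any kernel.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the kernel side: it has `needed` entries. */
static int fake_get_leaves(int capacity, int needed, int *nent)
{
    if (capacity < needed) {
        return -E2BIG;  /* buffer too small; the real size is not reported */
    }
    *nent = needed;
    return 0;
}

/*
 * Grow the guess by one until the call succeeds. Returns the number of
 * calls made and stores the resulting entry count in *nent.
 */
static int query_with_retry(int first_guess, int needed, int *nent)
{
    int max = first_guess;
    int tries = 1;

    while (fake_get_leaves(max, needed, nent) == -E2BIG) {
        max++;
        tries++;
    }
    return tries;
}
```

With this model, starting from 7 when 10 entries are needed takes four calls, while starting from 10 takes one. Because the failure does not report the required size, the loop can only grow the buffer blindly, so a good default matters.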

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 3c1e84576184..f4edfbb10879 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -960,7 +960,8 @@ static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max)
 static struct kvm_cpuid2 *get_supported_hv_cpuid(CPUState *cs)
 {
     struct kvm_cpuid2 *cpuid;
-    int max = 7; /* 0x40000000..0x40000005, 0x4000000A */
+    /* 0x40000000..0x40000005, 0x4000000A, 0x40000080..0x40000080 leaves */
+    int max = 10;
     int i;
 
     /*
-- 
2.29.2




* [PATCH v4 13/21] i386: prefer system KVM_GET_SUPPORTED_HV_CPUID ioctl over vCPU's one
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (11 preceding siblings ...)
  2021-02-10 16:40 ` [PATCH v4 12/21] i386: adjust the expected KVM_GET_SUPPORTED_HV_CPUID array size Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 14/21] i386: use global kvm_state in hyperv_enabled() check Vitaly Kuznetsov
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

KVM_GET_SUPPORTED_HV_CPUID was made a system-wide ioctl which can be called
prior to creating vCPUs and we are going to use that to expand Hyper-V CPU
features early. Use it when it is supported by KVM.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm/kvm.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index f4edfbb10879..48484592fc03 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -927,7 +927,8 @@ static struct {
     },
 };
 
-static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max)
+static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max,
+                                           bool do_sys_ioctl)
 {
     struct kvm_cpuid2 *cpuid;
     int r, size;
@@ -936,7 +937,11 @@ static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max)
     cpuid = g_malloc0(size);
     cpuid->nent = max;
 
-    r = kvm_vcpu_ioctl(cs, KVM_GET_SUPPORTED_HV_CPUID, cpuid);
+    if (do_sys_ioctl) {
+        r = kvm_ioctl(kvm_state, KVM_GET_SUPPORTED_HV_CPUID, cpuid);
+    } else {
+        r = kvm_vcpu_ioctl(cs, KVM_GET_SUPPORTED_HV_CPUID, cpuid);
+    }
     if (r == 0 && cpuid->nent >= max) {
         r = -E2BIG;
     }
@@ -963,13 +968,17 @@ static struct kvm_cpuid2 *get_supported_hv_cpuid(CPUState *cs)
     /* 0x40000000..0x40000005, 0x4000000A, 0x40000080..0x40000080 leaves */
     int max = 10;
     int i;
+    bool do_sys_ioctl;
+
+    do_sys_ioctl =
+        kvm_check_extension(kvm_state, KVM_CAP_SYS_HYPERV_CPUID) > 0;
 
     /*
      * When the buffer is too small, KVM_GET_SUPPORTED_HV_CPUID fails with
      * -E2BIG, however, it doesn't report back the right size. Keep increasing
      * it and re-trying until we succeed.
      */
-    while ((cpuid = try_get_hv_cpuid(cs, max)) == NULL) {
+    while ((cpuid = try_get_hv_cpuid(cs, max, do_sys_ioctl)) == NULL) {
         max++;
     }
 
@@ -979,7 +988,7 @@ static struct kvm_cpuid2 *get_supported_hv_cpuid(CPUState *cs)
      * information early, just check for the capability and set the bit
      * manually.
      */
-    if (kvm_check_extension(cs->kvm_state,
+    if (!do_sys_ioctl && kvm_check_extension(cs->kvm_state,
                             KVM_CAP_HYPERV_ENLIGHTENED_VMCS) > 0) {
         for (i = 0; i < cpuid->nent; i++) {
             if (cpuid->entries[i].function == HV_CPUID_ENLIGHTMENT_INFO) {
-- 
2.29.2




* [PATCH v4 14/21] i386: use global kvm_state in hyperv_enabled() check
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (12 preceding siblings ...)
  2021-02-10 16:40 ` [PATCH v4 13/21] i386: prefer system KVM_GET_SUPPORTED_HV_CPUID ioctl over vCPU's one Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 15/21] i386: expand Hyper-V features during CPU feature expansion time Vitaly Kuznetsov
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

There is no need to use the vCPU-specific KVM state in the hyperv_enabled()
check. Moreover, the global kvm_state has to be used when feature expansion
happens early, before the vCPU-specific KVM state is created.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm/kvm.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 48484592fc03..47fc564747a3 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -714,8 +714,7 @@ unsigned long kvm_arch_vcpu_id(CPUState *cs)
 
 static bool hyperv_enabled(X86CPU *cpu)
 {
-    CPUState *cs = CPU(cpu);
-    return kvm_check_extension(cs->kvm_state, KVM_CAP_HYPERV) > 0 &&
+    return kvm_check_extension(kvm_state, KVM_CAP_HYPERV) > 0 &&
         ((cpu->hyperv_spinlock_attempts != HYPERV_SPINLOCK_NEVER_NOTIFY) ||
          cpu->hyperv_features || cpu->hyperv_passthrough);
 }
-- 
2.29.2




* [PATCH v4 15/21] i386: expand Hyper-V features during CPU feature expansion time
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (13 preceding siblings ...)
  2021-02-10 16:40 ` [PATCH v4 14/21] i386: use global kvm_state in hyperv_enabled() check Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement Vitaly Kuznetsov
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

To make Hyper-V features appear in e.g. QMP query-cpu-model-expansion we
need to expand and set the corresponding CPUID leaves early. Modify
x86_cpu_get_supported_feature_word() to call newly introduced Hyper-V
specific kvm_hv_get_supported_cpuid() instead of
kvm_arch_get_supported_cpuid(). We can't use kvm_arch_get_supported_cpuid()
as Hyper-V specific CPUID leaves intersect with KVM's.

Note, early expansion will only happen when KVM supports system wide
KVM_GET_SUPPORTED_HV_CPUID ioctl (KVM_CAP_SYS_HYPERV_CPUID).

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/cpu.c          |  4 ++++
 target/i386/kvm/kvm-stub.c |  5 +++++
 target/i386/kvm/kvm.c      | 15 ++++++++++++---
 target/i386/kvm/kvm_i386.h |  1 +
 4 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index f0f826997ba0..c4e8863c7ca0 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6396,6 +6396,10 @@ static void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
     if (env->cpuid_xlevel2 == UINT32_MAX) {
         env->cpuid_xlevel2 = env->cpuid_min_xlevel2;
     }
+
+    if (kvm_enabled()) {
+        kvm_hyperv_expand_features(cpu, errp);
+    }
 }
 
 /*
diff --git a/target/i386/kvm/kvm-stub.c b/target/i386/kvm/kvm-stub.c
index 92f49121b8fa..7f175faa3abd 100644
--- a/target/i386/kvm/kvm-stub.c
+++ b/target/i386/kvm/kvm-stub.c
@@ -39,3 +39,8 @@ bool kvm_hv_vpindex_settable(void)
 {
     return false;
 }
+
+void kvm_hyperv_expand_features(X86CPU *cpu, Error **errp)
+{
+    return;
+}
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 47fc564747a3..30013f0d7cee 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1215,13 +1215,22 @@ static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t func, int reg)
  * of 'hv_passthrough' mode and fills the environment with all supported
  * Hyper-V features.
  */
-static void hyperv_expand_features(CPUState *cs, Error **errp)
+void kvm_hyperv_expand_features(X86CPU *cpu, Error **errp)
 {
-    X86CPU *cpu = X86_CPU(cs);
+    CPUState *cs = CPU(cpu);
 
     if (!hyperv_enabled(cpu))
         return;
 
+    /*
+     * When kvm_hyperv_expand_features is called at CPU feature expansion
+     * time per-CPU kvm_state is not available yet so we can only proceed
+     * when KVM_CAP_SYS_HYPERV_CPUID is supported.
+     */
+    if (!cs->kvm_state &&
+        !kvm_check_extension(kvm_state, KVM_CAP_SYS_HYPERV_CPUID))
+        return;
+
     if (cpu->hyperv_passthrough) {
         cpu->hyperv_vendor_id[0] =
             hv_cpuid_get_host(cs, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_EBX);
@@ -1554,7 +1563,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
     env->apic_bus_freq = KVM_APIC_BUS_FREQUENCY;
 
     /* Paravirtualization CPUIDs */
-    hyperv_expand_features(cs, &local_err);
+    kvm_hyperv_expand_features(cpu, &local_err);
     if (local_err) {
         error_report_err(local_err);
         return -ENOSYS;
diff --git a/target/i386/kvm/kvm_i386.h b/target/i386/kvm/kvm_i386.h
index dc725083891c..f1176491051d 100644
--- a/target/i386/kvm/kvm_i386.h
+++ b/target/i386/kvm/kvm_i386.h
@@ -47,6 +47,7 @@ bool kvm_has_x2apic_api(void);
 bool kvm_has_waitpkg(void);
 
 bool kvm_hv_vpindex_settable(void);
+void kvm_hyperv_expand_features(X86CPU *cpu, Error **errp);
 
 uint64_t kvm_swizzle_msi_ext_dest_id(uint64_t address);
 
-- 
2.29.2




* [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (14 preceding siblings ...)
  2021-02-10 16:40 ` [PATCH v4 15/21] i386: expand Hyper-V features during CPU feature expansion time Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-11 17:35   ` Igor Mammedov
  2021-02-10 16:40 ` [PATCH v4 17/21] i386: support 'hv-passthrough, hv-feature=off' on the command line Vitaly Kuznetsov
                   ` (5 subsequent siblings)
  21 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

Sometimes we'd like to know which features were explicitly enabled and which
were explicitly disabled on the command line. E.g. it seems logical to handle
'hv_passthrough,hv_feature=off' as "enable everything supported by the host
except for hv_feature" but this doesn't seem to be possible with the current
'hyperv_features' bit array. Introduce 'hv_features_on'/'hv_features_off'
add-ons and track explicit enablement/disablement there.

Note, it doesn't seem to be possible to fill 'hyperv_features' array during
CPU creation time when 'hv-passthrough' is specified and we're running on
an older kernel without KVM_CAP_SYS_HYPERV_CPUID support. To get the list
of the supported Hyper-V features we need to actually create KVM VCPU and
this happens much later.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/cpu.c | 237 ++++++++++++++++++++++++++++++++++++++++------
 target/i386/cpu.h |   2 +
 2 files changed, 209 insertions(+), 30 deletions(-)
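
Note for readers: the bookkeeping added here can be sketched in isolation. With a single bitmask, "never mentioned" and "explicitly switched off" both look like a cleared bit; the two extra masks keep them apart. `HvState` and `hv_feature_set()` are hypothetical, cut-down stand-ins for the X86CPU fields and the setter in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for the three masks kept per CPU. */
typedef struct HvState {
    uint64_t features;      /* effective feature bits */
    uint64_t features_on;   /* explicitly enabled by the user */
    uint64_t features_off;  /* explicitly disabled by the user */
} HvState;

/* Apply one 'hv-feature=on|off' setting and remember that it was explicit. */
static void hv_feature_set(HvState *s, int feature, bool value)
{
    uint64_t bit = UINT64_C(1) << feature;

    if (value) {
        s->features |= bit;
        s->features_on |= bit;
        s->features_off &= ~bit;
    } else {
        s->features &= ~bit;
        s->features_on &= ~bit;
        s->features_off |= bit;
    }
}
```

After `hv_feature_set(&s, 3, false)` on a zeroed state, `features` is still 0, but `features_off` has bit 3 set, which is the information a later 'hv-passthrough' style expansion needs.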

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index c4e8863c7ca0..e8a004c39d04 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -4553,6 +4553,178 @@ static void x86_cpuid_set_tsc_freq(Object *obj, Visitor *v, const char *name,
     cpu->env.tsc_khz = cpu->env.user_tsc_khz = value / 1000;
 }
 
+static bool x86_hv_feature_get(Object *obj, int feature)
+{
+    X86CPU *cpu = X86_CPU(obj);
+
+    return cpu->hyperv_features & BIT(feature);
+}
+
+static void x86_hv_feature_set(Object *obj, bool value, int feature)
+{
+    X86CPU *cpu = X86_CPU(obj);
+
+    if (value) {
+        cpu->hyperv_features |= BIT(feature);
+        cpu->hyperv_features_on |= BIT(feature);
+        cpu->hyperv_features_off &= ~BIT(feature);
+    } else {
+        cpu->hyperv_features &= ~BIT(feature);
+        cpu->hyperv_features_on &= ~BIT(feature);
+        cpu->hyperv_features_off |= BIT(feature);
+    }
+}
+
+static bool x86_hv_relaxed_get(Object *obj, Error **errp)
+{
+    return x86_hv_feature_get(obj, HYPERV_FEAT_RELAXED);
+}
+
+static void x86_hv_relaxed_set(Object *obj, bool value, Error **errp)
+{
+    x86_hv_feature_set(obj, value, HYPERV_FEAT_RELAXED);
+}
+
+static bool x86_hv_vapic_get(Object *obj, Error **errp)
+{
+    return x86_hv_feature_get(obj, HYPERV_FEAT_VAPIC);
+}
+
+static void x86_hv_vapic_set(Object *obj, bool value, Error **errp)
+{
+    x86_hv_feature_set(obj, value, HYPERV_FEAT_VAPIC);
+}
+
+static bool x86_hv_time_get(Object *obj, Error **errp)
+{
+    return x86_hv_feature_get(obj, HYPERV_FEAT_TIME);
+}
+
+static void x86_hv_time_set(Object *obj, bool value, Error **errp)
+{
+    x86_hv_feature_set(obj, value, HYPERV_FEAT_TIME);
+}
+
+static bool x86_hv_crash_get(Object *obj, Error **errp)
+{
+    return x86_hv_feature_get(obj, HYPERV_FEAT_CRASH);
+}
+
+static void x86_hv_crash_set(Object *obj, bool value, Error **errp)
+{
+    x86_hv_feature_set(obj, value, HYPERV_FEAT_CRASH);
+}
+
+static bool x86_hv_reset_get(Object *obj, Error **errp)
+{
+    return x86_hv_feature_get(obj, HYPERV_FEAT_RESET);
+}
+
+static void x86_hv_reset_set(Object *obj, bool value, Error **errp)
+{
+    x86_hv_feature_set(obj, value, HYPERV_FEAT_RESET);
+}
+
+static bool x86_hv_vpindex_get(Object *obj, Error **errp)
+{
+    return x86_hv_feature_get(obj, HYPERV_FEAT_VPINDEX);
+}
+
+static void x86_hv_vpindex_set(Object *obj, bool value, Error **errp)
+{
+    x86_hv_feature_set(obj, value, HYPERV_FEAT_VPINDEX);
+}
+
+static bool x86_hv_runtime_get(Object *obj, Error **errp)
+{
+    return x86_hv_feature_get(obj, HYPERV_FEAT_RUNTIME);
+}
+
+static void x86_hv_runtime_set(Object *obj, bool value, Error **errp)
+{
+    x86_hv_feature_set(obj, value, HYPERV_FEAT_RUNTIME);
+}
+
+static bool x86_hv_synic_get(Object *obj, Error **errp)
+{
+    return x86_hv_feature_get(obj, HYPERV_FEAT_SYNIC);
+}
+
+static void x86_hv_synic_set(Object *obj, bool value, Error **errp)
+{
+    x86_hv_feature_set(obj, value, HYPERV_FEAT_SYNIC);
+}
+
+static bool x86_hv_stimer_get(Object *obj, Error **errp)
+{
+    return x86_hv_feature_get(obj, HYPERV_FEAT_STIMER);
+}
+
+static void x86_hv_stimer_set(Object *obj, bool value, Error **errp)
+{
+    x86_hv_feature_set(obj, value, HYPERV_FEAT_STIMER);
+}
+
+static bool x86_hv_frequencies_get(Object *obj, Error **errp)
+{
+    return x86_hv_feature_get(obj, HYPERV_FEAT_FREQUENCIES);
+}
+
+static void x86_hv_frequencies_set(Object *obj, bool value, Error **errp)
+{
+    x86_hv_feature_set(obj, value, HYPERV_FEAT_FREQUENCIES);
+}
+
+static bool x86_hv_reenlightenment_get(Object *obj, Error **errp)
+{
+    return x86_hv_feature_get(obj, HYPERV_FEAT_REENLIGHTENMENT);
+}
+
+static void x86_hv_reenlightenment_set(Object *obj, bool value, Error **errp)
+{
+    x86_hv_feature_set(obj, value, HYPERV_FEAT_REENLIGHTENMENT);
+}
+
+static bool x86_hv_tlbflush_get(Object *obj, Error **errp)
+{
+    return x86_hv_feature_get(obj, HYPERV_FEAT_TLBFLUSH);
+}
+
+static void x86_hv_tlbflush_set(Object *obj, bool value, Error **errp)
+{
+    x86_hv_feature_set(obj, value, HYPERV_FEAT_TLBFLUSH);
+}
+
+static bool x86_hv_evmcs_get(Object *obj, Error **errp)
+{
+    return x86_hv_feature_get(obj, HYPERV_FEAT_EVMCS);
+}
+
+static void x86_hv_evmcs_set(Object *obj, bool value, Error **errp)
+{
+    x86_hv_feature_set(obj, value, HYPERV_FEAT_EVMCS);
+}
+
+static bool x86_hv_ipi_get(Object *obj, Error **errp)
+{
+    return x86_hv_feature_get(obj, HYPERV_FEAT_IPI);
+}
+
+static void x86_hv_ipi_set(Object *obj, bool value, Error **errp)
+{
+    x86_hv_feature_set(obj, value, HYPERV_FEAT_IPI);
+}
+
+static bool x86_hv_stimer_direct_get(Object *obj, Error **errp)
+{
+    return x86_hv_feature_get(obj, HYPERV_FEAT_STIMER_DIRECT);
+}
+
+static void x86_hv_stimer_direct_set(Object *obj, bool value, Error **errp)
+{
+    x86_hv_feature_set(obj, value, HYPERV_FEAT_STIMER_DIRECT);
+}
+
 /* Generic getter for "feature-words" and "filtered-features" properties */
 static void x86_cpu_get_feature_words(Object *obj, Visitor *v,
                                       const char *name, void *opaque,
@@ -7107,36 +7279,6 @@ static Property x86_cpu_properties[] = {
 
     DEFINE_PROP_UINT32("hv-spinlocks", X86CPU, hyperv_spinlock_attempts,
                        HYPERV_SPINLOCK_NEVER_NOTIFY),
-    DEFINE_PROP_BIT64("hv-relaxed", X86CPU, hyperv_features,
-                      HYPERV_FEAT_RELAXED, 0),
-    DEFINE_PROP_BIT64("hv-vapic", X86CPU, hyperv_features,
-                      HYPERV_FEAT_VAPIC, 0),
-    DEFINE_PROP_BIT64("hv-time", X86CPU, hyperv_features,
-                      HYPERV_FEAT_TIME, 0),
-    DEFINE_PROP_BIT64("hv-crash", X86CPU, hyperv_features,
-                      HYPERV_FEAT_CRASH, 0),
-    DEFINE_PROP_BIT64("hv-reset", X86CPU, hyperv_features,
-                      HYPERV_FEAT_RESET, 0),
-    DEFINE_PROP_BIT64("hv-vpindex", X86CPU, hyperv_features,
-                      HYPERV_FEAT_VPINDEX, 0),
-    DEFINE_PROP_BIT64("hv-runtime", X86CPU, hyperv_features,
-                      HYPERV_FEAT_RUNTIME, 0),
-    DEFINE_PROP_BIT64("hv-synic", X86CPU, hyperv_features,
-                      HYPERV_FEAT_SYNIC, 0),
-    DEFINE_PROP_BIT64("hv-stimer", X86CPU, hyperv_features,
-                      HYPERV_FEAT_STIMER, 0),
-    DEFINE_PROP_BIT64("hv-frequencies", X86CPU, hyperv_features,
-                      HYPERV_FEAT_FREQUENCIES, 0),
-    DEFINE_PROP_BIT64("hv-reenlightenment", X86CPU, hyperv_features,
-                      HYPERV_FEAT_REENLIGHTENMENT, 0),
-    DEFINE_PROP_BIT64("hv-tlbflush", X86CPU, hyperv_features,
-                      HYPERV_FEAT_TLBFLUSH, 0),
-    DEFINE_PROP_BIT64("hv-evmcs", X86CPU, hyperv_features,
-                      HYPERV_FEAT_EVMCS, 0),
-    DEFINE_PROP_BIT64("hv-ipi", X86CPU, hyperv_features,
-                      HYPERV_FEAT_IPI, 0),
-    DEFINE_PROP_BIT64("hv-stimer-direct", X86CPU, hyperv_features,
-                      HYPERV_FEAT_STIMER_DIRECT, 0),
     DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
                             hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
     DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
@@ -7283,6 +7425,41 @@ static void x86_cpu_common_class_init(ObjectClass *oc, void *data)
                               x86_cpu_get_crash_info_qom, NULL, NULL, NULL);
 #endif
 
+    object_class_property_add_bool(oc, "hv-relaxed",
+                                   x86_hv_relaxed_get, x86_hv_relaxed_set);
+    object_class_property_add_bool(oc, "hv-vapic",
+                                   x86_hv_vapic_get, x86_hv_vapic_set);
+    object_class_property_add_bool(oc, "hv-time",
+                                   x86_hv_time_get, x86_hv_time_set);
+    object_class_property_add_bool(oc, "hv-crash",
+                                   x86_hv_crash_get, x86_hv_crash_set);
+    object_class_property_add_bool(oc, "hv-reset",
+                                   x86_hv_reset_get, x86_hv_reset_set);
+    object_class_property_add_bool(oc, "hv-vpindex",
+                                   x86_hv_vpindex_get, x86_hv_vpindex_set);
+    object_class_property_add_bool(oc, "hv-runtime",
+                                   x86_hv_runtime_get, x86_hv_runtime_set);
+    object_class_property_add_bool(oc, "hv-synic",
+                                   x86_hv_synic_get, x86_hv_synic_set);
+    object_class_property_add_bool(oc, "hv-stimer",
+                                   x86_hv_stimer_get, x86_hv_stimer_set);
+    object_class_property_add_bool(oc, "hv-frequencies",
+                                   x86_hv_frequencies_get,
+                                   x86_hv_frequencies_set);
+    object_class_property_add_bool(oc, "hv-reenlightenment",
+                                   x86_hv_reenlightenment_get,
+                                   x86_hv_reenlightenment_set);
+    object_class_property_add_bool(oc, "hv-tlbflush",
+                                   x86_hv_tlbflush_get, x86_hv_tlbflush_set);
+    object_class_property_add_bool(oc, "hv-evmcs",
+                              x86_hv_evmcs_get,
+                              x86_hv_evmcs_set);
+    object_class_property_add_bool(oc, "hv-ipi",
+                                   x86_hv_ipi_get, x86_hv_ipi_set);
+    object_class_property_add_bool(oc, "hv-stimer-direct",
+                                   x86_hv_stimer_direct_get,
+                                   x86_hv_stimer_direct_set);
+
     for (w = 0; w < FEATURE_WORDS; w++) {
         int bitnr;
         for (bitnr = 0; bitnr < 64; bitnr++) {
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 7ea14822aab5..b4fbd46f0fc9 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1667,6 +1667,8 @@ struct X86CPU {
     char *hyperv_vendor;
     bool hyperv_synic_kvm_only;
     uint64_t hyperv_features;
+    uint64_t hyperv_features_on;
+    uint64_t hyperv_features_off;
     bool hyperv_passthrough;
     OnOffAuto hyperv_no_nonarch_cs;
     uint32_t hyperv_vendor_id[3];
-- 
2.29.2




* [PATCH v4 17/21] i386: support 'hv-passthrough, hv-feature=off' on the command line
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (15 preceding siblings ...)
  2021-02-10 16:40 ` [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-11 17:14   ` Igor Mammedov
  2021-02-10 16:40 ` [PATCH v4 18/21] i386: be more picky about implicit 'hv-evmcs' enablement Vitaly Kuznetsov
                   ` (4 subsequent siblings)
  21 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

Currently, we support 'hv-passthrough,hv-feature=on' enablement; this
is supposed to mean "hv-feature is mandatory, don't start without it". Add
support for 'hv-passthrough,hv-feature=off' meaning "enable everything
supported by the host except for hv-feature".

While at it, make 'hv-passthrough' parse semantics in line with other
options in qemu: when specified, it overrides what was previously set with
what's supported by the host. This can later be modified with 'hv-feature=on'/
'hv-feature=off'.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/cpu.c     | 28 +++++++++++++++++++++++++++-
 target/i386/kvm/kvm.c |  4 ++++
 2 files changed, 31 insertions(+), 1 deletion(-)
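
Note for readers: the ordering semantics described above can be sketched standalone. Setting passthrough wipes whatever was recorded before it, and options that follow it are layered on top. `HvOpts`, `set_passthrough()`, `set_feature()` and `effective()` are hypothetical names used only for this illustration, and `host_supported` is an assumed input rather than something queried from KVM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the parsed 'hv-*' options. */
typedef struct HvOpts {
    bool passthrough;
    uint64_t on;    /* explicitly enabled after the last reset */
    uint64_t off;   /* explicitly disabled after the last reset */
} HvOpts;

/* 'hv-passthrough' overrides everything that was set before it. */
static void set_passthrough(HvOpts *o, bool value)
{
    o->passthrough = value;
    if (value) {
        o->on = 0;
        o->off = 0;
    }
}

/* 'hv-feature=on|off' records an explicit choice. */
static void set_feature(HvOpts *o, int feature, bool value)
{
    uint64_t bit = UINT64_C(1) << feature;

    if (value) {
        o->on |= bit;
        o->off &= ~bit;
    } else {
        o->on &= ~bit;
        o->off |= bit;
    }
}

/*
 * Resulting feature set. A feature that is explicitly requested but
 * missing on the host is an error in the real code; this sketch does
 * not model that case.
 */
static uint64_t effective(const HvOpts *o, uint64_t host_supported)
{
    if (o->passthrough) {
        return host_supported & ~o->off;
    }
    return o->on;
}
```

For an option string shaped like 'hv-a=on,hv-passthrough,hv-b=off', the earlier 'hv-a=on' is forgotten when passthrough is set, and the result is everything the host offers minus 'hv-b'.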

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index e8a004c39d04..f8df2caed779 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -4725,6 +4725,29 @@ static void x86_hv_stimer_direct_set(Object *obj, bool value, Error **errp)
     x86_hv_feature_set(obj, value, HYPERV_FEAT_STIMER_DIRECT);
 }
 
+static bool x86_hv_passthrough_get(Object *obj, Error **errp)
+{
+    X86CPU *cpu = X86_CPU(obj);
+
+    return cpu->hyperv_passthrough;
+}
+
+static void x86_hv_passthrough_set(Object *obj, bool value, Error **errp)
+{
+    X86CPU *cpu = X86_CPU(obj);
+
+    cpu->hyperv_passthrough = value;
+
+    /* hv-passthrough overrides everything with what's supported by the host */
+    if (value) {
+        cpu->hyperv_features = 0;
+        cpu->hyperv_features_on = 0;
+        cpu->hyperv_features_off = 0;
+    }
+
+    return;
+}
+
 /* Generic getter for "feature-words" and "filtered-features" properties */
 static void x86_cpu_get_feature_words(Object *obj, Visitor *v,
                                       const char *name, void *opaque,
@@ -7281,7 +7304,6 @@ static Property x86_cpu_properties[] = {
                        HYPERV_SPINLOCK_NEVER_NOTIFY),
     DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
                             hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
-    DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
 
     DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),
     DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
@@ -7460,6 +7482,10 @@ static void x86_cpu_common_class_init(ObjectClass *oc, void *data)
                                    x86_hv_stimer_direct_get,
                                    x86_hv_stimer_direct_set);
 
+    object_class_property_add_bool(oc, "hv-passthrough",
+                                   x86_hv_passthrough_get,
+                                   x86_hv_passthrough_set);
+
     for (w = 0; w < FEATURE_WORDS; w++) {
         int bitnr;
         for (bitnr = 0; bitnr < 64; bitnr++) {
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 30013f0d7cee..fca088d4d3b5 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1153,6 +1153,10 @@ static int hv_cpuid_check_and_set(CPUState *cs, int feature, Error **errp)
         return 0;
     }
 
+    if (cpu->hyperv_passthrough && (cpu->hyperv_features_off & BIT(feature))) {
+        return 0;
+    }
+
     deps = kvm_hyperv_properties[feature].dependencies;
     while (deps) {
         dep_feat = ctz64(deps);
-- 
2.29.2




* [PATCH v4 18/21] i386: be more picky about implicit 'hv-evmcs' enablement
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (16 preceding siblings ...)
  2021-02-10 16:40 ` [PATCH v4 17/21] i386: support 'hv-passthrough, hv-feature=off' on the command line Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 19/21] i386: introduce kvm_hv_evmcs_available() Vitaly Kuznetsov
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

Enlightened VMCS is the only Hyper-V feature currently implemented in QEMU
that has a hardware dependency: it pairs with Intel VMX. It doesn't seem
right to enable this feature when VMX isn't enabled in the guest and the
feature wasn't explicitly requested on the command line. Currently, the only
such scenario is 'hv-passthrough', which enables 'hv-evmcs' whenever the host
supports it, regardless of guest VMX exposure. The upcoming 'hv-default'
should also avoid enabling 'hv-evmcs' without VMX.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
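Not part of the patch: a minimal standalone sketch of the decision described
above, for illustration only. The struct, the function name and the bit number
are made up for the example; only the logic (attempt 'hv-evmcs' when the guest
has VMX or the feature was explicitly requested, otherwise clear the bit)
follows the hunk below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BIT(nr) (1ULL << (nr))

/* Hypothetical bit number, chosen for the example only. */
#define EXAMPLE_FEAT_EVMCS 13

/* Simplified stand-in for the Hyper-V fields of X86CPU. */
struct example_cpu {
    uint64_t hyperv_features;    /* features that end up enabled */
    uint64_t hyperv_features_on; /* features explicitly set to 'on' */
};

/*
 * Return true when 'hv-evmcs' should go through the usual host support
 * check. Otherwise drop the feature silently and return false.
 */
static bool example_evmcs_wanted(struct example_cpu *cpu, bool guest_has_vmx)
{
    assert(cpu != NULL);

    if (guest_has_vmx || (cpu->hyperv_features_on & BIT(EXAMPLE_FEAT_EVMCS))) {
        return true;
    }

    /* Implicitly enabled (e.g. by a catch-all option) and no VMX: clear it. */
    cpu->hyperv_features &= ~BIT(EXAMPLE_FEAT_EVMCS);
    return false;
}
```

An explicit 'hv-evmcs=on' still reaches the host check even without VMX, so a
misconfiguration is reported instead of being silently ignored.
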
 target/i386/kvm/kvm.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index fca088d4d3b5..480908b2463a 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1315,8 +1315,17 @@ void kvm_hyperv_expand_features(X86CPU *cpu, Error **errp)
     if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_TLBFLUSH, errp)) {
         return;
     }
-    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_EVMCS, errp)) {
-        return;
+    /*
+     * 'hv-evmcs' is not enabled when it wasn't explicitly requested and guest
+     * CPU lacks VMX.
+     */
+    if (cpu_has_vmx(&cpu->env) ||
+        (cpu->hyperv_features_on & BIT(HYPERV_FEAT_EVMCS))) {
+        if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_EVMCS, errp)) {
+            return;
+        }
+    } else {
+        cpu->hyperv_features &= ~BIT(HYPERV_FEAT_EVMCS);
     }
     if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_IPI, errp)) {
         return;
-- 
2.29.2




* [PATCH v4 19/21] i386: introduce kvm_hv_evmcs_available()
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (17 preceding siblings ...)
  2021-02-10 16:40 ` [PATCH v4 18/21] i386: be more picky about implicit 'hv-evmcs' enablement Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-10 16:40 ` [PATCH v4 20/21] i386: provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

The Enlightened VMCS feature is hardware specific: it is only supported on
Intel CPUs. Introduce a simple kvm_hv_evmcs_available() helper; it will be
used to leave 'hv-evmcs' out of the default feature set when the 'hv-default'
CPU option is added.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm/kvm-stub.c | 5 +++++
 target/i386/kvm/kvm.c      | 8 ++++++++
 target/i386/kvm/kvm_i386.h | 1 +
 3 files changed, 14 insertions(+)

diff --git a/target/i386/kvm/kvm-stub.c b/target/i386/kvm/kvm-stub.c
index 7f175faa3abd..4e486f41a60a 100644
--- a/target/i386/kvm/kvm-stub.c
+++ b/target/i386/kvm/kvm-stub.c
@@ -44,3 +44,8 @@ void kvm_hyperv_expand_features(X86CPU *cpu, Error **errp)
 {
     return;
 }
+
+bool kvm_hv_evmcs_available(void)
+{
+    return false;
+}
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 480908b2463a..6c26b2091d4a 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -96,6 +96,7 @@ static bool has_msr_hv_crash;
 static bool has_msr_hv_reset;
 static bool has_msr_hv_vpindex;
 static bool hv_vpindex_settable;
+static bool hv_evmcs_available;
 static bool has_msr_hv_runtime;
 static bool has_msr_hv_synic;
 static bool has_msr_hv_stimer;
@@ -195,6 +196,11 @@ bool kvm_hv_vpindex_settable(void)
     return hv_vpindex_settable;
 }
 
+bool kvm_hv_evmcs_available(void)
+{
+    return hv_evmcs_available;
+}
+
 static int kvm_get_tsc(CPUState *cs)
 {
     X86CPU *cpu = X86_CPU(cs);
@@ -2235,6 +2241,8 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
     has_pit_state2 = kvm_check_extension(s, KVM_CAP_PIT_STATE2);
 
     hv_vpindex_settable = kvm_check_extension(s, KVM_CAP_HYPERV_VP_INDEX);
+    hv_evmcs_available =
+        kvm_check_extension(s, KVM_CAP_HYPERV_ENLIGHTENED_VMCS);
 
     has_exception_payload = kvm_check_extension(s, KVM_CAP_EXCEPTION_PAYLOAD);
     if (has_exception_payload) {
diff --git a/target/i386/kvm/kvm_i386.h b/target/i386/kvm/kvm_i386.h
index f1176491051d..0fa00511be27 100644
--- a/target/i386/kvm/kvm_i386.h
+++ b/target/i386/kvm/kvm_i386.h
@@ -48,6 +48,7 @@ bool kvm_has_waitpkg(void);
 
 bool kvm_hv_vpindex_settable(void);
 void kvm_hyperv_expand_features(X86CPU *cpu, Error **errp);
+bool kvm_hv_evmcs_available(void);
 
 uint64_t kvm_swizzle_msi_ext_dest_id(uint64_t address);
 
-- 
2.29.2




* [PATCH v4 20/21] i386: provide simple 'hv-default=on' option
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (18 preceding siblings ...)
  2021-02-10 16:40 ` [PATCH v4 19/21] i386: introduce kvm_hv_evmcs_available() Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-11 17:23   ` Igor Mammedov
  2021-02-10 16:40 ` [PATCH v4 21/21] qtest/hyperv: Introduce a simple hyper-v test Vitaly Kuznetsov
  2021-02-10 16:56 ` [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Daniel P. Berrangé
  21 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

Enabling Hyper-V emulation for a Windows VM is a tiring experience as it
requires listing all currently supported enlightenments ("hv-*" CPU
features) explicitly. We do have the 'hv-passthrough' mode, which enables
everything the host supports, but it can't be used in production as it
prevents migration.

Introduce a simple 'hv-default=on' CPU flag enabling all currently supported
Hyper-V enlightenments. Later, when new enlightenments get implemented, the
compat_props mechanism will be used to disable them for legacy machine types;
this will keep 'hv-default=on' configurations migratable.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
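Not part of the patch: a rough standalone sketch of the ordering semantics,
for illustration only. All names are invented for the example and the
per-feature setter is an assumption about how explicit 'hv-*' settings are
tracked. The mutual exclusion with 'hv-passthrough' is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BIT(nr) (1ULL << (nr))

/* Simplified stand-in for the Hyper-V fields of X86CPU. */
struct example_cpu {
    uint64_t features;         /* features that end up enabled */
    uint64_t features_on;      /* explicitly enabled on the command line */
    uint64_t features_off;     /* explicitly disabled on the command line */
    uint64_t default_features; /* set enabled by 'hv-default=on' */
    bool hv_default;
};

/* 'hv-default=on' discards whatever 'hv-*' options were parsed before it. */
static void example_set_default(struct example_cpu *cpu, bool value)
{
    assert(cpu != NULL);

    cpu->hv_default = value;
    if (value) {
        cpu->features = cpu->default_features;
        cpu->features_on = 0;
        cpu->features_off = 0;
    }
}

/* Assumed shape of a per-feature setter that records explicit choices. */
static void example_set_feature(struct example_cpu *cpu, int feature,
                                bool value)
{
    assert(cpu != NULL);

    if (value) {
        cpu->features |= BIT(feature);
        cpu->features_on |= BIT(feature);
        cpu->features_off &= ~BIT(feature);
    } else {
        cpu->features &= ~BIT(feature);
        cpu->features_on &= ~BIT(feature);
        cpu->features_off |= BIT(feature);
    }
}
```

With this model, 'hv-feature=on,hv-default' ends up with exactly the default
set, while 'hv-default,hv-feature=off' ends up with the default set minus one
feature, because options are applied in command line order.
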
 docs/hyperv.txt   | 16 ++++++++++++---
 target/i386/cpu.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++
 target/i386/cpu.h |  3 +++
 3 files changed, 68 insertions(+), 3 deletions(-)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index 5df00da54fc4..a54c066cab09 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -17,10 +17,20 @@ compatible hypervisor and use Hyper-V specific features.
 
 2. Setup
 =========
-No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
-QEMU, individual enlightenments can be enabled through CPU flags, e.g:
+All currently supported Hyper-V enlightenments can be enabled by specifying
+the 'hv-default=on' CPU flag:
 
-  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
+  qemu-system-x86_64 --enable-kvm --cpu host,hv-default ...
+
+Alternatively, enlightenments can be enabled individually through CPU flags,
+e.g.:
+
+  qemu-system-x86_64 --enable-kvm --cpu host,hv-relaxed,hv-vpindex,hv-time ...
+
+It is also possible to disable individual enlightenments from the default list;
+this can be useful for debugging:
+
+  qemu-system-x86_64 --enable-kvm --cpu host,hv-default=on,hv-evmcs=off ...
 
 Sometimes there are dependencies between enlightenments, QEMU is supposed to
 check that the supplied configuration is sane.
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index f8df2caed779..013aa60272d8 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -4736,6 +4736,12 @@ static void x86_hv_passthrough_set(Object *obj, bool value, Error **errp)
 {
     X86CPU *cpu = X86_CPU(obj);
 
+    if (cpu->hyperv_default) {
+        error_setg(errp,
+                   "'hv-default' and 'hv-passthrough' are mutually exclusive");
+        return;
+    }
+
     cpu->hyperv_passthrough = value;
 
     /* hv-passthrough overrides everything with what's supported by the host */
@@ -4748,6 +4754,33 @@ static void x86_hv_passthrough_set(Object *obj, bool value, Error **errp)
     return;
 }
 
+static bool x86_hv_default_get(Object *obj, Error **errp)
+{
+    X86CPU *cpu = X86_CPU(obj);
+
+    return cpu->hyperv_default;
+}
+
+static void x86_hv_default_set(Object *obj, bool value, Error **errp)
+{
+    X86CPU *cpu = X86_CPU(obj);
+
+    if (cpu->hyperv_passthrough) {
+        error_setg(errp,
+                   "'hv-default' and 'hv-passthrough' are mutually exclusive");
+        return;
+    }
+
+    cpu->hyperv_default = value;
+
+    /* hv-default overrides everything with the default set */
+    if (value) {
+        cpu->hyperv_features = cpu->hyperv_default_features;
+        cpu->hyperv_features_on = 0;
+        cpu->hyperv_features_off = 0;
+    }
+}
+
 /* Generic getter for "feature-words" and "filtered-features" properties */
 static void x86_cpu_get_feature_words(Object *obj, Visitor *v,
                                       const char *name, void *opaque,
@@ -7152,6 +7185,21 @@ static void x86_cpu_initfn(Object *obj)
     if (xcc->model) {
         x86_cpu_load_model(cpu, xcc->model);
     }
+
+    /* Hyper-V features enabled with 'hv-default=on' */
+    cpu->hyperv_default_features = BIT(HYPERV_FEAT_RELAXED) |
+        BIT(HYPERV_FEAT_VAPIC) | BIT(HYPERV_FEAT_TIME) |
+        BIT(HYPERV_FEAT_CRASH) | BIT(HYPERV_FEAT_RESET) |
+        BIT(HYPERV_FEAT_VPINDEX) | BIT(HYPERV_FEAT_RUNTIME) |
+        BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_STIMER) |
+        BIT(HYPERV_FEAT_FREQUENCIES) | BIT(HYPERV_FEAT_REENLIGHTENMENT) |
+        BIT(HYPERV_FEAT_TLBFLUSH) | BIT(HYPERV_FEAT_IPI) |
+        BIT(HYPERV_FEAT_STIMER_DIRECT);
+
+    /* Enlightened VMCS is only available on Intel/VMX */
+    if (kvm_hv_evmcs_available()) {
+        cpu->hyperv_default_features |= BIT(HYPERV_FEAT_EVMCS);
+    }
 }
 
 static int64_t x86_cpu_get_arch_id(CPUState *cs)
@@ -7486,6 +7534,10 @@ static void x86_cpu_common_class_init(ObjectClass *oc, void *data)
                                    x86_hv_passthrough_get,
                                    x86_hv_passthrough_set);
 
+    object_class_property_add_bool(oc, "hv-default",
+                                   x86_hv_default_get,
+                                   x86_hv_default_set);
+
     for (w = 0; w < FEATURE_WORDS; w++) {
         int bitnr;
         for (bitnr = 0; bitnr < 64; bitnr++) {
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index b4fbd46f0fc9..59350e70fb51 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1670,6 +1670,9 @@ struct X86CPU {
     uint64_t hyperv_features_on;
     uint64_t hyperv_features_off;
     bool hyperv_passthrough;
+    /* 'hv-default' enablement */
+    uint64_t hyperv_default_features;
+    bool hyperv_default;
     OnOffAuto hyperv_no_nonarch_cs;
     uint32_t hyperv_vendor_id[3];
     uint32_t hyperv_interface_id[4];
-- 
2.29.2




* [PATCH v4 21/21] qtest/hyperv: Introduce a simple hyper-v test
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (19 preceding siblings ...)
  2021-02-10 16:40 ` [PATCH v4 20/21] i386: provide simple 'hv-default=on' option Vitaly Kuznetsov
@ 2021-02-10 16:40 ` Vitaly Kuznetsov
  2021-02-10 16:56 ` [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Daniel P. Berrangé
  21 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-10 16:40 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost; +Cc: Paolo Bonzini, Marcelo Tosatti, Igor Mammedov

To begin with, just test 'hv-default', 'hv-passthrough' and a couple of
custom Hyper-V enlightenment configurations through QMP. Later, it would
be great to complement this by checking CPUID values from within the
guest.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 MAINTAINERS               |   1 +
 tests/qtest/hyperv-test.c | 312 ++++++++++++++++++++++++++++++++++++++
 tests/qtest/meson.build   |   3 +-
 3 files changed, 315 insertions(+), 1 deletion(-)
 create mode 100644 tests/qtest/hyperv-test.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 06635ba81a2c..0488e74473cc 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1505,6 +1505,7 @@ F: hw/isa/apm.c
 F: include/hw/isa/apm.h
 F: tests/test-x86-cpuid.c
 F: tests/qtest/test-x86-cpuid-compat.c
+F: tests/qtest/hyperv-test.c
 
 PC Chipset
 M: Michael S. Tsirkin <mst@redhat.com>
diff --git a/tests/qtest/hyperv-test.c b/tests/qtest/hyperv-test.c
new file mode 100644
index 000000000000..dd2b30aa71af
--- /dev/null
+++ b/tests/qtest/hyperv-test.c
@@ -0,0 +1,312 @@
+/*
+ * Hyper-V emulation CPU feature test cases
+ *
+ * Copyright (c) 2021 Red Hat Inc.
+ * Authors:
+ *  Vitaly Kuznetsov <vkuznets@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#include "qemu/osdep.h"
+#include <linux/kvm.h>
+#include <sys/ioctl.h>
+
+#include "qemu/bitops.h"
+#include "libqos/libqtest.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qjson.h"
+
+#define MACHINE_KVM "-machine pc-q35-5.2 -accel kvm "
+#define QUERY_HEAD  "{ 'execute': 'query-cpu-model-expansion', " \
+                    "  'arguments': { 'type': 'full', "
+#define QUERY_TAIL  "}}"
+
+static bool kvm_enabled(QTestState *qts)
+{
+    QDict *resp, *qdict;
+    bool enabled;
+
+    resp = qtest_qmp(qts, "{ 'execute': 'query-kvm' }");
+    g_assert(qdict_haskey(resp, "return"));
+    qdict = qdict_get_qdict(resp, "return");
+    g_assert(qdict_haskey(qdict, "enabled"));
+    enabled = qdict_get_bool(qdict, "enabled");
+    qobject_unref(resp);
+
+    return enabled;
+}
+
+static bool kvm_has_sys_hyperv_cpuid(void)
+{
+    int fd = open("/dev/kvm", O_RDWR);
+    int ret;
+
+    g_assert(fd >= 0);
+
+    ret = ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_SYS_HYPERV_CPUID);
+
+    close(fd);
+
+    return ret > 0;
+}
+
+static QDict *do_query_no_props(QTestState *qts, const char *cpu_type)
+{
+    return qtest_qmp(qts, QUERY_HEAD "'model': { 'name': %s }"
+                          QUERY_TAIL, cpu_type);
+}
+
+static bool resp_has_props(QDict *resp)
+{
+    QDict *qdict;
+
+    g_assert(resp);
+
+    if (!qdict_haskey(resp, "return")) {
+        return false;
+    }
+    qdict = qdict_get_qdict(resp, "return");
+
+    if (!qdict_haskey(qdict, "model")) {
+        return false;
+    }
+    qdict = qdict_get_qdict(qdict, "model");
+
+    return qdict_haskey(qdict, "props");
+}
+
+static QDict *resp_get_props(QDict *resp)
+{
+    QDict *qdict;
+
+    g_assert(resp);
+    g_assert(resp_has_props(resp));
+
+    qdict = qdict_get_qdict(resp, "return");
+    qdict = qdict_get_qdict(qdict, "model");
+    qdict = qdict_get_qdict(qdict, "props");
+
+    return qdict;
+}
+
+static bool resp_get_feature(QDict *resp, const char *feature)
+{
+    QDict *props;
+
+    g_assert(resp);
+    g_assert(resp_has_props(resp));
+    props = resp_get_props(resp);
+    g_assert(qdict_get(props, feature));
+    return qdict_get_bool(props, feature);
+}
+
+#define assert_has_feature(qts, cpu_type, feature)                     \
+({                                                                     \
+    QDict *_resp = do_query_no_props(qts, cpu_type);                   \
+    g_assert(_resp);                                                   \
+    g_assert(resp_has_props(_resp));                                   \
+    g_assert(qdict_get(resp_get_props(_resp), feature));               \
+    qobject_unref(_resp);                                              \
+})
+
+#define resp_assert_feature(resp, feature, expected_value)             \
+({                                                                     \
+    QDict *_props;                                                     \
+                                                                       \
+    g_assert(resp);                                                    \
+    g_assert(resp_has_props(resp));                                    \
+    _props = resp_get_props(resp);                                     \
+    g_assert(qdict_get(_props, feature));                              \
+    g_assert(qdict_get_bool(_props, feature) == (expected_value));     \
+})
+
+#define assert_feature(qts, cpu_type, feature, expected_value)         \
+({                                                                     \
+    QDict *_resp;                                                      \
+                                                                       \
+    _resp = do_query_no_props(qts, cpu_type);                          \
+    g_assert(_resp);                                                   \
+    resp_assert_feature(_resp, feature, expected_value);               \
+    qobject_unref(_resp);                                              \
+})
+
+#define assert_has_feature_enabled(qts, cpu_type, feature)             \
+    assert_feature(qts, cpu_type, feature, true)
+
+#define assert_has_feature_disabled(qts, cpu_type, feature)            \
+    assert_feature(qts, cpu_type, feature, false)
+
+static void test_assert_hyperv_all_but_evmcs(QTestState *qts)
+{
+    assert_has_feature_enabled(qts, "host", "hv-relaxed");
+    assert_has_feature_enabled(qts, "host", "hv-vapic");
+    assert_has_feature_enabled(qts, "host", "hv-vpindex");
+    assert_has_feature_enabled(qts, "host", "hv-runtime");
+    assert_has_feature_enabled(qts, "host", "hv-crash");
+    assert_has_feature_enabled(qts, "host", "hv-time");
+    assert_has_feature_enabled(qts, "host", "hv-synic");
+    assert_has_feature_enabled(qts, "host", "hv-stimer");
+    assert_has_feature_enabled(qts, "host", "hv-tlbflush");
+    assert_has_feature_enabled(qts, "host", "hv-ipi");
+    assert_has_feature_enabled(qts, "host", "hv-reset");
+    assert_has_feature_enabled(qts, "host", "hv-frequencies");
+    assert_has_feature_enabled(qts, "host", "hv-reenlightenment");
+    assert_has_feature_enabled(qts, "host", "hv-stimer-direct");
+}
+
+static void test_assert_hyperv_all(QTestState *qts)
+{
+    QDict *resp;
+
+    test_assert_hyperv_all_but_evmcs(qts);
+    resp = do_query_no_props(qts, "host");
+    if (resp_get_feature(resp, "vmx")) {
+        assert_has_feature_enabled(qts, "host", "hv-evmcs");
+    } else {
+        assert_has_feature_disabled(qts, "host", "hv-evmcs");
+    }
+    qobject_unref(resp);
+}
+
+static void test_query_cpu_hv_all_but_evmcs(const void *data)
+{
+    QTestState *qts;
+
+    qts = qtest_init(MACHINE_KVM "-cpu host,hv-relaxed,hv-vapic,hv-vpindex,"
+                     "hv-runtime,hv-crash,hv-time,hv-synic,hv-stimer,"
+                     "hv-tlbflush,hv-ipi,hv-reset,hv-frequencies,"
+                     "hv-reenlightenment,hv-stimer-direct");
+
+    test_assert_hyperv_all_but_evmcs(qts);
+
+    qtest_quit(qts);
+}
+
+static void test_query_cpu_hv_default(const void *data)
+{
+    QTestState *qts;
+
+    qts = qtest_init(MACHINE_KVM "-cpu host,hv-default");
+
+    test_assert_hyperv_all(qts);
+
+    qtest_quit(qts);
+}
+
+static void test_query_cpu_hv_default_minus(const void *data)
+{
+    QTestState *qts;
+
+    qts = qtest_init(MACHINE_KVM "-cpu host,hv-default,hv_ipi=off");
+
+    assert_has_feature_enabled(qts, "host", "hv-tlbflush");
+    assert_has_feature_disabled(qts, "host", "hv-ipi");
+
+    qtest_quit(qts);
+}
+
+static void test_query_cpu_hv_custom(const void *data)
+{
+    QTestState *qts;
+
+    qts = qtest_init(MACHINE_KVM "-cpu host,hv-vpindex");
+
+    assert_has_feature_enabled(qts, "host", "hv-vpindex");
+    assert_has_feature_disabled(qts, "host", "hv-synic");
+
+    qtest_quit(qts);
+}
+
+static void test_query_cpu_hv_passthrough(const void *data)
+{
+    QTestState *qts;
+
+    qts = qtest_init(MACHINE_KVM "-cpu host,hv-passthrough");
+    if (!kvm_enabled(qts)) {
+        qtest_quit(qts);
+        return;
+    }
+
+    test_assert_hyperv_all(qts);
+
+    qtest_quit(qts);
+}
+
+static void test_query_cpu_hv_passthrough_minus(const void *data)
+{
+    QTestState *qts;
+
+    qts = qtest_init(MACHINE_KVM "-cpu host,hv-passthrough,hv_tlbflush=off");
+    if (!kvm_enabled(qts)) {
+        qtest_quit(qts);
+        return;
+    }
+
+    assert_has_feature_enabled(qts, "host", "hv-vpindex");
+    assert_has_feature_disabled(qts, "host", "hv-tlbflush");
+
+    qtest_quit(qts);
+}
+
+static void test_query_cpu_hv_evmcs_novmx_default(const void *data)
+{
+    QTestState *qts;
+
+    qts = qtest_init(MACHINE_KVM "-cpu host,-vmx,hv-default");
+    if (!kvm_enabled(qts)) {
+        qtest_quit(qts);
+        return;
+    }
+
+    assert_has_feature_disabled(qts, "host", "vmx");
+    assert_has_feature_disabled(qts, "host", "hv-evmcs");
+
+    qtest_quit(qts);
+}
+
+static void test_query_cpu_hv_evmcs_novmx_passthrough(const void *data)
+{
+    QTestState *qts;
+
+    qts = qtest_init(MACHINE_KVM "-cpu host,-vmx,hv-passthrough");
+    if (!kvm_enabled(qts)) {
+        qtest_quit(qts);
+        return;
+    }
+
+    assert_has_feature_disabled(qts, "host", "vmx");
+    assert_has_feature_disabled(qts, "host", "hv-evmcs");
+
+    qtest_quit(qts);
+}
+
+int main(int argc, char **argv)
+{
+    const char *arch = qtest_get_arch();
+
+    g_test_init(&argc, &argv, NULL);
+
+    if (!strcmp(arch, "i386") || !strcmp(arch, "x86_64")) {
+        qtest_add_data_func("/hyperv/hv-all-but-evmcs",
+                            NULL, test_query_cpu_hv_all_but_evmcs);
+        qtest_add_data_func("/hyperv/hv-default",
+                            NULL, test_query_cpu_hv_default);
+        qtest_add_data_func("/hyperv/hv-default-minus",
+                            NULL, test_query_cpu_hv_default_minus);
+        qtest_add_data_func("/hyperv/hv-custom",
+                            NULL, test_query_cpu_hv_custom);
+        if (kvm_has_sys_hyperv_cpuid()) {
+            qtest_add_data_func("/hyperv/hv-passthrough",
+                                NULL, test_query_cpu_hv_passthrough);
+            qtest_add_data_func("/hyperv/hv-passthrough-minus",
+                                NULL, test_query_cpu_hv_passthrough_minus);
+            qtest_add_data_func("/hyperv/hv-evmcs-novmx-default", NULL,
+                                test_query_cpu_hv_evmcs_novmx_default);
+            qtest_add_data_func("/hyperv/hv-evmcs-novmx-passthrough", NULL,
+                                test_query_cpu_hv_evmcs_novmx_passthrough);
+        }
+    }
+
+    return g_test_run();
+}
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index c83bc211b6a6..13dd65a8bf3f 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -65,7 +65,8 @@ qtests_i386 = \
    'vmgenid-test',
    'migration-test',
    'test-x86-cpuid-compat',
-   'numa-test']
+   'numa-test',
+   'hyperv-test']
 
 dbus_daemon = find_program('dbus-daemon', required: false)
 if dbus_daemon.found() and config_host.has_key('GDBUS_CODEGEN')
-- 
2.29.2




* Re: [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option
  2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (20 preceding siblings ...)
  2021-02-10 16:40 ` [PATCH v4 21/21] qtest/hyperv: Introduce a simple hyper-v test Vitaly Kuznetsov
@ 2021-02-10 16:56 ` Daniel P. Berrangé
  2021-02-10 17:46   ` Eduardo Habkost
  21 siblings, 1 reply; 58+ messages in thread
From: Daniel P. Berrangé @ 2021-02-10 16:56 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Igor Mammedov, Marcelo Tosatti, qemu-devel,
	Eduardo Habkost

On Wed, Feb 10, 2021 at 05:40:12PM +0100, Vitaly Kuznetsov wrote:
> Changes since v3:
> - Make 'hv-default' override 'hv-*' options which were already set 
>   (e.g. 'hv-feature=on,hv-default' case) [Igor]. Make 'hv-passthrough'
>   behave the same way.
> - Add "i386: be more picky about implicit 'hv-evmcs' enablement" patch to avoid
>   enabling 'hv-evmcs' with hv-default/hv-passthrough when guest CPU lacks VMX.
> - Add "i386: support 'hv-passthrough,hv-feature=off' on the command line" patch
>   to make 'hv-passthrough' semantics match the newly introduced 'hv-default'.
> - Add "i386: track explicit 'hv-*' features enablement/disablement" patch to
>   support the above mentioned changes.
> - Expand qtest to check the above mentioned improvements.
> 
> Original description:
> 
> Upper layer tools like libvirt want to figure out which Hyper-V features are
> supported by the underlying stack (QEMU/KVM) but currently they are unable to
> do so. We have a nice 'hv_passthrough' CPU flag supported by QEMU but it has
> no effect on e.g. QMP's 
> 
> query-cpu-model-expansion type=full model={"name":"host","props":{"hv-passthrough":true}}
> 
> command as we parse Hyper-V features after creating KVM vCPUs and not at
> feature expansion time. To support the use-case we first need to make 
> KVM_GET_SUPPORTED_HV_CPUID ioctl a system-wide ioctl as the existing
> vCPU version can't be used that early. This is what KVM part does. With
> that done, we can make early Hyper-V feature expansion (this series).
> 
> In addition, provide a simple 'hv-default' option which enables (and
> requires from KVM) all currently supported Hyper-V enlightenments.
> Unlike 'hv-passthrough' mode, this is going to be migratable.

How is it going to be migratable if the semantics vary depending on
the host kernel KVM reporting features, because different kernels
will expose different features ?

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option
  2021-02-10 16:56 ` [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Daniel P. Berrangé
@ 2021-02-10 17:46   ` Eduardo Habkost
  2021-02-11  8:30     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 58+ messages in thread
From: Eduardo Habkost @ 2021-02-10 17:46 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Marcelo Tosatti, qemu-devel,
	Igor Mammedov

On Wed, Feb 10, 2021 at 04:56:06PM +0000, Daniel P. Berrangé wrote:
> On Wed, Feb 10, 2021 at 05:40:12PM +0100, Vitaly Kuznetsov wrote:
> > Changes since v3:
> > - Make 'hv-default' override 'hv-*' options which were already set 
> >   (e.g. 'hv-feature=on,hv-default' case) [Igor]. Make 'hv-passthrough'
> >   behave the same way.
> > - Add "i386: be more picky about implicit 'hv-evmcs' enablement" patch to avoid
> >   enabling 'hv-evmcs' with hv-default/hv-passthrough when guest CPU lacks VMX.
> > - Add "i386: support 'hv-passthrough,hv-feature=off' on the command line" patch
> >   to make 'hv-passthrough' semantics match the newly introduced 'hv-default'.
> > - Add "i386: track explicit 'hv-*' features enablement/disablement" patch to
> >   support the above mentioned changes.
> > - Expand qtest to check the above mentioned improvements.
> > 
> > Original description:
> > 
> > Upper layer tools like libvirt want to figure out which Hyper-V features are
> > supported by the underlying stack (QEMU/KVM) but currently they are unable to
> > do so. We have a nice 'hv_passthrough' CPU flag supported by QEMU but it has
> > no effect on e.g. QMP's 
> > 
> > query-cpu-model-expansion type=full model={"name":"host","props":{"hv-passthrough":true}}
> > 
> > command as we parse Hyper-V features after creating KVM vCPUs and not at
> > feature expansion time. To support the use-case we first need to make 
> > KVM_GET_SUPPORTED_HV_CPUID ioctl a system-wide ioctl as the existing
> > vCPU version can't be used that early. This is what KVM part does. With
> > that done, we can make early Hyper-V feature expansion (this series).
> > 
> > In addition, provide a simple 'hv-default' option which enables (and
> > requires from KVM) all currently supported Hyper-V enlightenments.
> > Unlike 'hv-passthrough' mode, this is going to be migratable.
> 
> How is it going to be migratable if the semantics vary depending on
> the host kernel KVM reporting features, because different kernels
> will expose different features ?

"all currently supported" in this context means "all features
supported when the machine type was added", not "all features
supported by the host kernel".

-- 
Eduardo




* Re: [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option
  2021-02-10 17:46   ` Eduardo Habkost
@ 2021-02-11  8:30     ` Vitaly Kuznetsov
  2021-02-11  9:14       ` Daniel P. Berrangé
  0 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-11  8:30 UTC (permalink / raw)
  To: Eduardo Habkost, Daniel P. Berrangé
  Cc: Paolo Bonzini, Marcelo Tosatti, qemu-devel, Igor Mammedov

Eduardo Habkost <ehabkost@redhat.com> writes:

> On Wed, Feb 10, 2021 at 04:56:06PM +0000, Daniel P. Berrangé wrote:
>> On Wed, Feb 10, 2021 at 05:40:12PM +0100, Vitaly Kuznetsov wrote:
>> > Changes since v3:
>> > - Make 'hv-default' override 'hv-*' options which were already set 
>> >   (e.g. 'hv-feature=on,hv-default' case) [Igor]. Make 'hv-passthrough'
>> >   behave the same way.
>> > - Add "i386: be more picky about implicit 'hv-evmcs' enablement" patch to avoid
>> >   enabling 'hv-evmcs' with hv-default/hv-passthrough when guest CPU lacks VMX.
>> > - Add "i386: support 'hv-passthrough,hv-feature=off' on the command line" patch
>> >   to make 'hv-passthrough' semantics match the newly introduced 'hv-default'.
>> > - Add "i386: track explicit 'hv-*' features enablement/disablement" patch to
>> >   support the above mentioned changes.
>> > - Expand qtest to check the above mentioned improvements.
>> > 
>> > Original description:
>> > 
>> > Upper layer tools like libvirt want to figure out which Hyper-V features are
>> > supported by the underlying stack (QEMU/KVM) but currently they are unable to
>> > do so. We have a nice 'hv_passthrough' CPU flag supported by QEMU but it has
>> > no effect on e.g. QMP's 
>> > 
>> > query-cpu-model-expansion type=full model={"name":"host","props":{"hv-passthrough":true}}
>> > 
>> > command as we parse Hyper-V features after creating KVM vCPUs and not at
>> > feature expansion time. To support the use-case we first need to make 
>> > KVM_GET_SUPPORTED_HV_CPUID ioctl a system-wide ioctl as the existing
>> > vCPU version can't be used that early. This is what KVM part does. With
>> > that done, we can make early Hyper-V feature expansion (this series).
>> > 
>> > In addition, provide a simple 'hv-default' option which enables (and
>> > requires from KVM) all currently supported Hyper-V enlightenments.
>> > Unlike 'hv-passthrough' mode, this is going to be migratable.
>> 
>> How is it going to be migratable if the semantics vary depending on
>> the host kernel KVM reporting features, because different kernels
>> will expose different features ?
>
> "all currently supported" in this context means "all features
> supported when the machine type was added", not "all features
> supported by the host kernel".

Yes, exactly.

'hv-passthrough' enables 'everything supported by the host' and this is
not migratable.

'hv-default' requires a certain set of features (depending on the
machine type) so the VM won't start if the host lacks something.
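
To make the distinction concrete, here is a minimal sketch (not code from this series). The `hv_mask` type and both function names are hypothetical, and the bit values in the usage note are illustrative rather than QEMU's HYPERV_FEAT_* numbering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one bit per Hyper-V enlightenment. */
typedef uint64_t hv_mask;

/*
 * 'hv-passthrough': the guest gets whatever this host offers, so the
 * result differs between hosts and cannot be relied on for migration.
 */
static hv_mask hv_resolve_passthrough(hv_mask host_supported)
{
    return host_supported;
}

/*
 * 'hv-default': the set is fixed by the machine type. If the host lacks
 * any bit of it, refuse to start instead of silently dropping features.
 * Returns true and stores the set in *effective on success.
 */
static bool hv_resolve_default(hv_mask machine_default, hv_mask host_supported,
                               hv_mask *effective)
{
    assert(effective != NULL);

    if ((machine_default & ~host_supported) != 0) {
        return false; /* host is missing a required enlightenment */
    }
    *effective = machine_default;
    return true;
}
```

With a machine-type default of 0x7, a host offering only 0x3 is rejected, while a host offering 0xF starts the guest with exactly 0x7, the same set any other capable host would give it.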

-- 
Vitaly




* Re: [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option
  2021-02-11  8:30     ` Vitaly Kuznetsov
@ 2021-02-11  9:14       ` Daniel P. Berrangé
  2021-02-11  9:34         ` Vitaly Kuznetsov
  0 siblings, 1 reply; 58+ messages in thread
From: Daniel P. Berrangé @ 2021-02-11  9:14 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Igor Mammedov, Marcelo Tosatti, Eduardo Habkost,
	qemu-devel

On Thu, Feb 11, 2021 at 09:30:53AM +0100, Vitaly Kuznetsov wrote:
> Eduardo Habkost <ehabkost@redhat.com> writes:
> 
> > On Wed, Feb 10, 2021 at 04:56:06PM +0000, Daniel P. Berrangé wrote:
> >> On Wed, Feb 10, 2021 at 05:40:12PM +0100, Vitaly Kuznetsov wrote:
> >> > Changes since v3:
> >> > - Make 'hv-default' override 'hv-*' options which were already set 
> >> >   (e.g. 'hv-feature=on,hv-default' case) [Igor]. Make 'hv-passthrough'
> >> >   behave the same way.
> >> > - Add "i386: be more picky about implicit 'hv-evmcs' enablement" patch to avoid
> >> >   enabling 'hv-evmcs' with hv-default/hv-passthrough when guest CPU lacks VMX.
> >> > - Add "i386: support 'hv-passthrough,hv-feature=off' on the command line" patch
> >> >   to make 'hv-passthrough' semantics match the newly introduced 'hv-default'.
> >> > - Add "i386: track explicit 'hv-*' features enablement/disablement" patch to
> >> >   support the above mentioned changes.
> >> > - Expand qtest to check the above mentioned improvements.
> >> > 
> >> > Original description:
> >> > 
> >> > Upper layer tools like libvirt want to figure out which Hyper-V features are
> >> > supported by the underlying stack (QEMU/KVM) but currently they are unable to
> >> > do so. We have a nice 'hv_passthrough' CPU flag supported by QEMU but it has
> >> > no effect on e.g. QMP's 
> >> > 
> >> > query-cpu-model-expansion type=full model={"name":"host","props":{"hv-passthrough":true}}
> >> > 
> >> > command as we parse Hyper-V features after creating KVM vCPUs and not at
> >> > feature expansion time. To support the use-case we first need to make 
> >> > KVM_GET_SUPPORTED_HV_CPUID ioctl a system-wide ioctl as the existing
> >> > vCPU version can't be used that early. This is what KVM part does. With
> >> > that done, we can make early Hyper-V feature expansion (this series).
> >> > 
> >> > In addition, provide a simple 'hv-default' option which enables (and
> >> > requires from KVM) all currently supported Hyper-V enlightenments.
> >> > Unlike 'hv-passthrough' mode, this is going to be migratable.
> >> 
> >> How is it going to be migratable if the semantics vary depending on
> >> the host kernel KVM reporting features, because different kernels
> >> will expose different features ?
> >
> > "all currently supported" in this context means "all features
> > supported when the machine type was added", not "all features
> > supported by the host kernel".
> 
> Yes, exactly.
> 
> 'hv-passthrough' enables 'everything supported by the host' and this is
> not migratable.
> 
> 'hv-default' requires a certain set of features (depending on the
> machine type) so the VM won't start if the host lacks something.

Ok, so I presume HV features will only be added to hv-default when we
know they are available in the oldest kernel we are targeting? Upstream
is more conservative in this respect than downstreams; the latter can
guarantee much more modern kernels.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option
  2021-02-11  9:14       ` Daniel P. Berrangé
@ 2021-02-11  9:34         ` Vitaly Kuznetsov
  2021-02-11 10:14           ` Daniel P. Berrangé
  0 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-11  9:34 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Paolo Bonzini, Igor Mammedov, Marcelo Tosatti, Eduardo Habkost,
	qemu-devel

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Thu, Feb 11, 2021 at 09:30:53AM +0100, Vitaly Kuznetsov wrote:
>> Eduardo Habkost <ehabkost@redhat.com> writes:
>> 
>> > On Wed, Feb 10, 2021 at 04:56:06PM +0000, Daniel P. Berrangé wrote:
>> >> On Wed, Feb 10, 2021 at 05:40:12PM +0100, Vitaly Kuznetsov wrote:
>> >> > Changes since v3:
>> >> > - Make 'hv-default' override 'hv-*' options which were already set 
>> >> >   (e.g. 'hv-feature=on,hv-default' case) [Igor]. Make 'hv-passthrough'
>> >> >   behave the same way.
>> >> > - Add "i386: be more picky about implicit 'hv-evmcs' enablement" patch to avoid
>> >> >   enabling 'hv-evmcs' with hv-default/hv-passthrough when guest CPU lacks VMX.
>> >> > - Add "i386: support 'hv-passthrough,hv-feature=off' on the command line" patch
>> >> >   to make 'hv-passthrough' semantics match the newly introduced 'hv-default'.
>> >> > - Add "i386: track explicit 'hv-*' features enablement/disablement" patch to
>> >> >   support the above mentioned changes.
>> >> > - Expand qtest to check the above mentioned improvements.
>> >> > 
>> >> > Original description:
>> >> > 
>> >> > Upper layer tools like libvirt want to figure out which Hyper-V features are
>> >> > supported by the underlying stack (QEMU/KVM) but currently they are unable to
>> >> > do so. We have a nice 'hv_passthrough' CPU flag supported by QEMU but it has
>> >> > no effect on e.g. QMP's 
>> >> > 
>> >> > query-cpu-model-expansion type=full model={"name":"host","props":{"hv-passthrough":true}}
>> >> > 
>> >> > command as we parse Hyper-V features after creating KVM vCPUs and not at
>> >> > feature expansion time. To support the use-case we first need to make 
>> >> > KVM_GET_SUPPORTED_HV_CPUID ioctl a system-wide ioctl as the existing
>> >> > vCPU version can't be used that early. This is what KVM part does. With
>> >> > that done, we can make early Hyper-V feature expansion (this series).
>> >> > 
>> >> > In addition, provide a simple 'hv-default' option which enables (and
>> >> > requires from KVM) all currently supported Hyper-V enlightenments.
>> >> > Unlike 'hv-passthrough' mode, this is going to be migratable.
>> >> 
>> >> How is it going to be migratable if the semantics vary depending on
>> >> the host kernel KVM reporting features, because different kernels
>> >> will expose different features ?
>> >
>> > "all currently supported" in this context means "all features
>> > supported when the machine type was added", not "all features
>> > supported by the host kernel".
>> 
>> Yes, exactly.
>> 
>> 'hv-passthrough' enables 'everything supported by the host' and this is
>> not migratable.
>> 
>> 'hv-default' requires a certain set of features (depending on the
>> machine type) so the VM won't start if the host lacks something.
>
> Ok, so I presume HV features will only be added to hv-default when we
> know they are available in the oldest kernel we are targetting ? Upsteam
> is more conservative in this respect than downstreams,  the latter can
> guarantee much more modern kernels.
>

Yes, it is kind of an open question when a feature gets 'promoted' to
'hv-default'. Currently, the latest feature we include is
'HYPERV_FEAT_STIMER_DIRECT' which dates back to Linux 5.0. It is also
possible to use something like

'hv-default,hv-stimer-direct=off,...'

when running on an older kernel (and this is still migratable).
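
A rough model of how the options are applied left to right, following the setter logic in patches 16 and 20. The `HvProps` struct, the `hv_prop_*` names and the `HV_BIT` macro are hypothetical stand-ins for the X86CPU fields and QOM setters, not the real ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HV_BIT(n) (UINT64_C(1) << (n))

/* Hypothetical stand-in for the hyperv_* fields of X86CPU. */
typedef struct HvProps {
    uint64_t features;         /* effective feature set */
    uint64_t features_on;      /* explicitly enabled on the command line */
    uint64_t features_off;     /* explicitly disabled on the command line */
    uint64_t default_features; /* set implied by 'hv-default' */
    bool hv_default;
} HvProps;

/* Models 'hv-<feature>=on|off': records the explicit choice as well. */
static void hv_prop_feature(HvProps *p, unsigned feature, bool value)
{
    assert(p != NULL && feature < 64);

    if (value) {
        p->features |= HV_BIT(feature);
        p->features_on |= HV_BIT(feature);
        p->features_off &= ~HV_BIT(feature);
    } else {
        p->features &= ~HV_BIT(feature);
        p->features_on &= ~HV_BIT(feature);
        p->features_off |= HV_BIT(feature);
    }
}

/* Models 'hv-default=on': overrides everything set before it. */
static void hv_prop_default(HvProps *p, bool value)
{
    assert(p != NULL);

    p->hv_default = value;
    if (value) {
        p->features = p->default_features;
        p->features_on = 0;
        p->features_off = 0;
    }
}
```

Calling hv_prop_default() and then hv_prop_feature(..., false) models 'hv-default,hv-stimer-direct=off': the result is the default set minus one bit, which is still a fixed set. In the opposite order the earlier per-feature setting is overridden by the default set.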

-- 
Vitaly




* Re: [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option
  2021-02-11  9:34         ` Vitaly Kuznetsov
@ 2021-02-11 10:14           ` Daniel P. Berrangé
  0 siblings, 0 replies; 58+ messages in thread
From: Daniel P. Berrangé @ 2021-02-11 10:14 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: qemu-devel, Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost,
	Igor Mammedov

On Thu, Feb 11, 2021 at 10:34:15AM +0100, Vitaly Kuznetsov wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
> > On Thu, Feb 11, 2021 at 09:30:53AM +0100, Vitaly Kuznetsov wrote:
> >> Eduardo Habkost <ehabkost@redhat.com> writes:
> >> 
> >> > On Wed, Feb 10, 2021 at 04:56:06PM +0000, Daniel P. Berrangé wrote:
> >> >> On Wed, Feb 10, 2021 at 05:40:12PM +0100, Vitaly Kuznetsov wrote:
> >> >> > Changes since v3:
> >> >> > - Make 'hv-default' override 'hv-*' options which were already set 
> >> >> >   (e.g. 'hv-feature=on,hv-default' case) [Igor]. Make 'hv-passthrough'
> >> >> >   behave the same way.
> >> >> > - Add "i386: be more picky about implicit 'hv-evmcs' enablement" patch to avoid
> >> >> >   enabling 'hv-evmcs' with hv-default/hv-passthrough when guest CPU lacks VMX.
> >> >> > - Add "i386: support 'hv-passthrough,hv-feature=off' on the command line" patch
> >> >> >   to make 'hv-passthrough' semantics match the newly introduced 'hv-default'.
> >> >> > - Add "i386: track explicit 'hv-*' features enablement/disablement" patch to
> >> >> >   support the above mentioned changes.
> >> >> > - Expand qtest to check the above mentioned improvements.
> >> >> > 
> >> >> > Original description:
> >> >> > 
> >> >> > Upper layer tools like libvirt want to figure out which Hyper-V features are
> >> >> > supported by the underlying stack (QEMU/KVM) but currently they are unable to
> >> >> > do so. We have a nice 'hv_passthrough' CPU flag supported by QEMU but it has
> >> >> > no effect on e.g. QMP's 
> >> >> > 
> >> >> > query-cpu-model-expansion type=full model={"name":"host","props":{"hv-passthrough":true}}
> >> >> > 
> >> >> > command as we parse Hyper-V features after creating KVM vCPUs and not at
> >> >> > feature expansion time. To support the use-case we first need to make 
> >> >> > KVM_GET_SUPPORTED_HV_CPUID ioctl a system-wide ioctl as the existing
> >> >> > vCPU version can't be used that early. This is what KVM part does. With
> >> >> > that done, we can make early Hyper-V feature expansion (this series).
> >> >> > 
> >> >> > In addition, provide a simple 'hv-default' option which enables (and
> >> >> > requires from KVM) all currently supported Hyper-V enlightenments.
> >> >> > Unlike 'hv-passthrough' mode, this is going to be migratable.
> >> >> 
> >> >> How is it going to be migratable if the semantics vary depending on
> >> >> the host kernel KVM reporting features, because different kernels
> >> >> will expose different features ?
> >> >
> >> > "all currently supported" in this context means "all features
> >> > supported when the machine type was added", not "all features
> >> > supported by the host kernel".
> >> 
> >> Yes, exactly.
> >> 
> >> 'hv-passthrough' enables 'everything supported by the host' and this is
> >> not migratable.
> >> 
> >> 'hv-default' requires a certain set of features (depending on the
> >> machine type) so the VM won't start if the host lacks something.
> >
> > Ok, so I presume HV features will only be added to hv-default when we
> > know they are available in the oldest kernel we are targetting ? Upsteam
> > is more conservative in this respect than downstreams,  the latter can
> > guarantee much more modern kernels.
> >
> 
> Yes, it is kind of an open question when a feature gets 'promoted' to
> 'hv-default'. Currently, the latest feature we include is
> 'HYPERV_FEAT_STIMER_DIRECT' which dates back to Linux 5.0. It is also
> possible to use something like

Upstream we have a clear set of targeted OS platforms:

  https://qemu.readthedocs.io/en/latest/system/build-platforms.html

This will inform what our minimum possible kernel version will be,
which should influence when we can promote something.

Downstreams of course have their own min kernel model, so they can
do things differently if they have their own machine types.

> 'hv-default,hv-stimer-direct=off,...'
> 
> when running on an older kernel (and this is still migratable).

At that point the mgmt app needs to know exactly what features the
host supports, and if they know that, there's no real need to use
hv-default in the first place. IOW, I think we need to strive to
ensure "hv-default" is always usable on supported platforms without
needing to know about turning off things.
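
One possible way to express that promotion rule, purely as a sketch: the `KernelVersion` type and both helpers are hypothetical, and no real platform kernel versions are assumed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical representation of a Linux kernel version. */
typedef struct KernelVersion {
    unsigned major;
    unsigned minor;
} KernelVersion;

/* Returns <0, 0 or >0 when a is older than, equal to or newer than b. */
static int kver_cmp(KernelVersion a, KernelVersion b)
{
    if (a.major != b.major) {
        return a.major < b.major ? -1 : 1;
    }
    if (a.minor != b.minor) {
        return a.minor < b.minor ? -1 : 1;
    }
    return 0;
}

/*
 * A feature may be promoted to 'hv-default' only if every targeted
 * platform ships a kernel at least as new as the one that introduced
 * the feature. An empty platform list imposes no constraint.
 */
static bool hv_can_promote(KernelVersion feature_since,
                           const KernelVersion *platform_min, size_t n)
{
    assert(platform_min != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (kver_cmp(platform_min[i], feature_since) < 0) {
            return false; /* this platform's kernel predates the feature */
        }
    }
    return true;
}
```

For a feature that needs Linux 5.0, a single targeted platform with an older minimum kernel is enough to block promotion, so that 'hv-default' stays usable everywhere without turning things off.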


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH v4 17/21] i386: support 'hv-passthrough, hv-feature=off' on the command line
  2021-02-10 16:40 ` [PATCH v4 17/21] i386: support 'hv-passthrough, hv-feature=off' on the command line Vitaly Kuznetsov
@ 2021-02-11 17:14   ` Igor Mammedov
  2021-02-12  8:49     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-02-11 17:14 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Marcelo Tosatti, qemu-devel, Eduardo Habkost

On Wed, 10 Feb 2021 17:40:29 +0100
Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> Currently, we support 'hv-passthrough,hv-feature=on' enablement, this
> is supposed to mean "hv-feature is mandatory, don't start without it". Add
> support for 'hv-passthrough,hv-feature=off' meaning "enable everything
> supported by the host except for hv-feature".
> 
> While on it, make 'hv-passthrough' parse semantics in-line with other
> options in qemu: when specified, it overrides what was previously set with
> what's supported by the host. This can later be modified with 'hv-feature=on'/
> 'hv-feature=off'.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  target/i386/cpu.c     | 28 +++++++++++++++++++++++++++-
>  target/i386/kvm/kvm.c |  4 ++++
>  2 files changed, 31 insertions(+), 1 deletion(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index e8a004c39d04..f8df2caed779 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -4725,6 +4725,29 @@ static void x86_hv_stimer_direct_set(Object *obj, bool value, Error **errp)
>      x86_hv_feature_set(obj, value, HYPERV_FEAT_STIMER_DIRECT);
>  }
>  
> +static bool x86_hv_passthrough_get(Object *obj, Error **errp)
> +{
> +    X86CPU *cpu = X86_CPU(obj);
> +
> +    return cpu->hyperv_passthrough;
> +}
> +
> +static void x86_hv_passthrough_set(Object *obj, bool value, Error **errp)
> +{
> +    X86CPU *cpu = X86_CPU(obj);
> +
> +    cpu->hyperv_passthrough = value;
> +
> +    /* hv-passthrough overrides everything with what's supported by the host */
> +    if (value) {
> +        cpu->hyperv_features = 0;
> +        cpu->hyperv_features_on = 0;
> +        cpu->hyperv_features_off = 0;

why do we have _on|_off fields?

> +    }
> +
> +    return;
> +}
> +
>  /* Generic getter for "feature-words" and "filtered-features" properties */
>  static void x86_cpu_get_feature_words(Object *obj, Visitor *v,
>                                        const char *name, void *opaque,
> @@ -7281,7 +7304,6 @@ static Property x86_cpu_properties[] = {
>                         HYPERV_SPINLOCK_NEVER_NOTIFY),
>      DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
>                              hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
> -    DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
>  
>      DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),
>      DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
> @@ -7460,6 +7482,10 @@ static void x86_cpu_common_class_init(ObjectClass *oc, void *data)
>                                     x86_hv_stimer_direct_get,
>                                     x86_hv_stimer_direct_set);
>  
> +    object_class_property_add_bool(oc, "hv-passthrough",
> +                                   x86_hv_passthrough_get,
> +                                   x86_hv_passthrough_set);
> +
>      for (w = 0; w < FEATURE_WORDS; w++) {
>          int bitnr;
>          for (bitnr = 0; bitnr < 64; bitnr++) {
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 30013f0d7cee..fca088d4d3b5 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -1153,6 +1153,10 @@ static int hv_cpuid_check_and_set(CPUState *cs, int feature, Error **errp)
>          return 0;
>      }
>  
> +    if (cpu->hyperv_passthrough && (cpu->hyperv_features_off & BIT(feature))) {
> +        return 0;
> +    }
> +
>      deps = kvm_hyperv_properties[feature].dependencies;
>      while (deps) {
>          dep_feat = ctz64(deps);

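For reference, the semantics described in the commit message above can be sketched as follows. This is an illustration under assumptions, not the patch's implementation: `hv_expand_passthrough` and its parameters are hypothetical names, with the two masks playing the role of hyperv_features_on/hyperv_features_off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * 'hv-passthrough' combined with explicit per-feature settings:
 *  - 'hv-feature=on' makes the feature mandatory: fail if the host
 *    does not support it;
 *  - 'hv-feature=off' removes it from whatever the host supports.
 * Returns true and stores the resulting set in *effective on success.
 */
static bool hv_expand_passthrough(uint64_t host_supported,
                                  uint64_t explicit_on,
                                  uint64_t explicit_off,
                                  uint64_t *effective)
{
    assert(effective != NULL);
    /* A feature cannot be both explicitly on and explicitly off. */
    assert((explicit_on & explicit_off) == 0);

    if ((explicit_on & ~host_supported) != 0) {
        return false; /* a mandatory feature is missing on this host */
    }
    *effective = host_supported & ~explicit_off;
    return true;
}
```

Keeping separate on/off masks is what lets the expansion tell "never mentioned" apart from "explicitly disabled" once the host's feature set becomes known.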



* Re: [PATCH v4 20/21] i386: provide simple 'hv-default=on' option
  2021-02-10 16:40 ` [PATCH v4 20/21] i386: provide simple 'hv-default=on' option Vitaly Kuznetsov
@ 2021-02-11 17:23   ` Igor Mammedov
  2021-02-12  8:52     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-02-11 17:23 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Marcelo Tosatti, qemu-devel, Eduardo Habkost

On Wed, 10 Feb 2021 17:40:32 +0100
Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> Enabling Hyper-V emulation for a Windows VM is a tiring experience as it
> requires listing all currently supported enlightenments ("hv-*" CPU
> features) explicitly. We do have 'hv-passthrough' mode enabling
> everything but it can't be used in production as it prevents migration.
> 
> Introduce a simple 'hv-default=on' CPU flag enabling all currently supported
> Hyper-V enlightenments. Later, when new enlightenments get implemented,
> compat_props mechanism will be used to disable them for legacy machine types,
> this will keep 'hv-default=on' configurations migratable.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  docs/hyperv.txt   | 16 ++++++++++++---
>  target/i386/cpu.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++
>  target/i386/cpu.h |  3 +++
>  3 files changed, 68 insertions(+), 3 deletions(-)
> 
> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
> index 5df00da54fc4..a54c066cab09 100644
> --- a/docs/hyperv.txt
> +++ b/docs/hyperv.txt
> @@ -17,10 +17,20 @@ compatible hypervisor and use Hyper-V specific features.
>  
>  2. Setup
>  =========
> -No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
> -QEMU, individual enlightenments can be enabled through CPU flags, e.g:
> +All currently supported Hyper-V enlightenments can be enabled by specifying
> +'hv-default=on' CPU flag:
>  
> -  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default ...
> +
> +Alternatively, it is possible to do fine-grained enablement through CPU flags,
> +e.g:
> +
> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-relaxed,hv-vpindex,hv-time ...
> +
> +It is also possible to disable individual enlightenments from the default list,
> +this can be used for debugging purposes:
> +
> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default=on,hv-evmcs=off ...
>  
>  Sometimes there are dependencies between enlightenments, QEMU is supposed to
>  check that the supplied configuration is sane.
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index f8df2caed779..013aa60272d8 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -4736,6 +4736,12 @@ static void x86_hv_passthrough_set(Object *obj, bool value, Error **errp)
>  {
>      X86CPU *cpu = X86_CPU(obj);
>  
> +    if (cpu->hyperv_default) {
> +        error_setg(errp,
> +                   "'hv-default' and 'hv-paththrough' are mutually exclusive");
> +        return;
> +    }
> +
>      cpu->hyperv_passthrough = value;
>  
>      /* hv-passthrough overrides everything with what's supported by the host */
> @@ -4748,6 +4754,33 @@ static void x86_hv_passthrough_set(Object *obj, bool value, Error **errp)
>      return;
>  }
>  
> +static bool x86_hv_default_get(Object *obj, Error **errp)
> +{
> +    X86CPU *cpu = X86_CPU(obj);
> +
> +    return cpu->hyperv_default;
> +}
> +
> +static void x86_hv_default_set(Object *obj, bool value, Error **errp)
> +{
> +    X86CPU *cpu = X86_CPU(obj);
> +
> +    if (cpu->hyperv_passthrough) {
> +        error_setg(errp,
> +                   "'hv-default' and 'hv-paththrough' are mutually exclusive");
this check will work only halfway, i.e. for hv-passthrough=on,hv-default=on|off
(where the 'off' value looks a bit weird),
but not the other way around: hv-default=on,hv-passthrough=on

were you thinking about the following error:
  "hv-default can't be used after hv-passthrough was enabled"

or, if it is meant to be symmetric, then putting this check in realizefn() will
do the job, as both properties have been processed by that time.

> +        return;
> +    }
> +
> +    cpu->hyperv_default = value;
> +
> +    /* hv-default overrides everything with the default set */
> +    if (value) {
> +        cpu->hyperv_features = cpu->hyperv_default_features;
> +        cpu->hyperv_features_on = 0;
> +        cpu->hyperv_features_off = 0;
> +    }
> +}
> +
>  /* Generic getter for "feature-words" and "filtered-features" properties */
>  static void x86_cpu_get_feature_words(Object *obj, Visitor *v,
>                                        const char *name, void *opaque,
> @@ -7152,6 +7185,21 @@ static void x86_cpu_initfn(Object *obj)
>      if (xcc->model) {
>          x86_cpu_load_model(cpu, xcc->model);
>      }
> +
> +    /* Hyper-V features enabled with 'hv-default=on' */
> +    cpu->hyperv_default_features = BIT(HYPERV_FEAT_RELAXED) |
> +        BIT(HYPERV_FEAT_VAPIC) | BIT(HYPERV_FEAT_TIME) |
> +        BIT(HYPERV_FEAT_CRASH) | BIT(HYPERV_FEAT_RESET) |
> +        BIT(HYPERV_FEAT_VPINDEX) | BIT(HYPERV_FEAT_RUNTIME) |
> +        BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_STIMER) |
> +        BIT(HYPERV_FEAT_FREQUENCIES) | BIT(HYPERV_FEAT_REENLIGHTENMENT) |
> +        BIT(HYPERV_FEAT_TLBFLUSH) | BIT(HYPERV_FEAT_IPI) |
> +        BIT(HYPERV_FEAT_STIMER_DIRECT);
> +
> +    /* Enlightened VMCS is only available on Intel/VMX */
> +    if (kvm_hv_evmcs_available()) {
> +        cpu->hyperv_default_features |= BIT(HYPERV_FEAT_EVMCS);
> +    }
>  }
>  
>  static int64_t x86_cpu_get_arch_id(CPUState *cs)
> @@ -7486,6 +7534,10 @@ static void x86_cpu_common_class_init(ObjectClass *oc, void *data)
>                                     x86_hv_passthrough_get,
>                                     x86_hv_passthrough_set);
>  
> +    object_class_property_add_bool(oc, "hv-default",
> +                              x86_hv_default_get,
> +                              x86_hv_default_set);
> +
>      for (w = 0; w < FEATURE_WORDS; w++) {
>          int bitnr;
>          for (bitnr = 0; bitnr < 64; bitnr++) {
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index b4fbd46f0fc9..59350e70fb51 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1670,6 +1670,9 @@ struct X86CPU {
>      uint64_t hyperv_features_on;
>      uint64_t hyperv_features_off;
>      bool hyperv_passthrough;
> +    /* 'hv-default' enablement */
> +    uint64_t hyperv_default_features;
> +    bool hyperv_default;
>      OnOffAuto hyperv_no_nonarch_cs;
>      uint32_t hyperv_vendor_id[3];
>      uint32_t hyperv_interface_id[4];

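The realizefn() suggestion from the review comment above could look roughly like this. It is a sketch under assumptions: `HvModes` and the `hv_modes_*` functions are hypothetical and only model the ordering problem, they are not QEMU's QOM API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two mode flags in X86CPU. */
typedef struct HvModes {
    bool hv_default;
    bool hv_passthrough;
} HvModes;

/* Setters only record the value; they do not look at the other flag. */
static void hv_modes_set_default(HvModes *m, bool value)
{
    assert(m != NULL);
    m->hv_default = value;
}

static void hv_modes_set_passthrough(HvModes *m, bool value)
{
    assert(m != NULL);
    m->hv_passthrough = value;
}

/*
 * Runs once, after all properties have been parsed, so the check does
 * not depend on the order of options on the command line.
 * Returns false and sets *errmsg (if non-NULL) on conflict.
 */
static bool hv_modes_realize(const HvModes *m, const char **errmsg)
{
    assert(m != NULL);

    if (m->hv_default && m->hv_passthrough) {
        if (errmsg != NULL) {
            *errmsg = "'hv-default' and 'hv-passthrough' are mutually exclusive";
        }
        return false;
    }
    return true;
}
```

Because the conflict is detected after parsing, 'hv-default=on,hv-passthrough=on' and 'hv-passthrough=on,hv-default=on' are rejected identically, and 'hv-default=off' after 'hv-passthrough=on' is not an error.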



* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-10 16:40 ` [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement Vitaly Kuznetsov
@ 2021-02-11 17:35   ` Igor Mammedov
  2021-02-12  8:45     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-02-11 17:35 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel, Eduardo Habkost

On Wed, 10 Feb 2021 17:40:28 +0100
Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> Sometimes we'd like to know which features were explicitly enabled and which
> were explicitly disabled on the command line. E.g. it seems logical to handle
> 'hv_passthrough,hv_feature=off' as "enable everything supported by the host
> except for hv_feature" but this doesn't seem to be possible with the current
> 'hyperv_features' bit array. Introduce 'hv_features_on'/'hv_features_off'
> add-ons and track explicit enablement/disablement there.
> 
> Note, it doesn't seem to be possible to fill 'hyperv_features' array during
> CPU creation time when 'hv-passthrough' is specified and we're running on
> an older kernel without KVM_CAP_SYS_HYPERV_CPUID support. To get the list
> of the supported Hyper-V features we need to actually create KVM VCPU and
> this happens much later.

it seems to me that we are going back to +-feat parsing, this time only for
hyperv.
I'm not sure I like having it back, especially considering we are going to
drop "-feat" priority for x86.

now, about this being impossible: see arm/kvm/virt, they create a 'sample' VCPU
at KVM init time to probe for some CPU features in advance. You can use a
similar approach to prepare the value for hyperv_features.

> 
> No functional change intended.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  target/i386/cpu.c | 237 ++++++++++++++++++++++++++++++++++++++++------
>  target/i386/cpu.h |   2 +
>  2 files changed, 209 insertions(+), 30 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index c4e8863c7ca0..e8a004c39d04 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -4553,6 +4553,178 @@ static void x86_cpuid_set_tsc_freq(Object *obj, Visitor *v, const char *name,
>      cpu->env.tsc_khz = cpu->env.user_tsc_khz = value / 1000;
>  }
>  
> +static bool x86_hv_feature_get(Object *obj, int feature)
> +{
> +    X86CPU *cpu = X86_CPU(obj);
> +
> +    return cpu->hyperv_features & BIT(feature);
> +}
> +
> +static void x86_hv_feature_set(Object *obj, bool value, int feature)
> +{
> +    X86CPU *cpu = X86_CPU(obj);
> +
> +    if (value) {
> +        cpu->hyperv_features |= BIT(feature);
> +        cpu->hyperv_features_on |= BIT(feature);
> +        cpu->hyperv_features_off &= ~BIT(feature);
> +    } else {
> +        cpu->hyperv_features &= ~BIT(feature);
> +        cpu->hyperv_features_on &= ~BIT(feature);
> +        cpu->hyperv_features_off |= BIT(feature);
> +    }
> +}
> +
> +static bool x86_hv_relaxed_get(Object *obj, Error **errp)
> +{
> +    return x86_hv_feature_get(obj, HYPERV_FEAT_RELAXED);
> +}
> +
> +static void x86_hv_relaxed_set(Object *obj, bool value, Error **errp)
> +{
> +    x86_hv_feature_set(obj, value, HYPERV_FEAT_RELAXED);
> +}
> +
> +static bool x86_hv_vapic_get(Object *obj, Error **errp)
> +{
> +    return x86_hv_feature_get(obj, HYPERV_FEAT_VAPIC);
> +}
> +
> +static void x86_hv_vapic_set(Object *obj, bool value, Error **errp)
> +{
> +    x86_hv_feature_set(obj, value, HYPERV_FEAT_VAPIC);
> +}
> +
> +static bool x86_hv_time_get(Object *obj, Error **errp)
> +{
> +    return x86_hv_feature_get(obj, HYPERV_FEAT_TIME);
> +}
> +
> +static void x86_hv_time_set(Object *obj, bool value, Error **errp)
> +{
> +    x86_hv_feature_set(obj, value, HYPERV_FEAT_TIME);
> +}
> +
> +static bool x86_hv_crash_get(Object *obj, Error **errp)
> +{
> +    return x86_hv_feature_get(obj, HYPERV_FEAT_CRASH);
> +}
> +
> +static void x86_hv_crash_set(Object *obj, bool value, Error **errp)
> +{
> +    x86_hv_feature_set(obj, value, HYPERV_FEAT_CRASH);
> +}
> +
> +static bool x86_hv_reset_get(Object *obj, Error **errp)
> +{
> +    return x86_hv_feature_get(obj, HYPERV_FEAT_RESET);
> +}
> +
> +static void x86_hv_reset_set(Object *obj, bool value, Error **errp)
> +{
> +    x86_hv_feature_set(obj, value, HYPERV_FEAT_RESET);
> +}
> +
> +static bool x86_hv_vpindex_get(Object *obj, Error **errp)
> +{
> +    return x86_hv_feature_get(obj, HYPERV_FEAT_VPINDEX);
> +}
> +
> +static void x86_hv_vpindex_set(Object *obj, bool value, Error **errp)
> +{
> +    x86_hv_feature_set(obj, value, HYPERV_FEAT_VPINDEX);
> +}
> +
> +static bool x86_hv_runtime_get(Object *obj, Error **errp)
> +{
> +    return x86_hv_feature_get(obj, HYPERV_FEAT_RUNTIME);
> +}
> +
> +static void x86_hv_runtime_set(Object *obj, bool value, Error **errp)
> +{
> +    x86_hv_feature_set(obj, value, HYPERV_FEAT_RUNTIME);
> +}
> +
> +static bool x86_hv_synic_get(Object *obj, Error **errp)
> +{
> +    return x86_hv_feature_get(obj, HYPERV_FEAT_SYNIC);
> +}
> +
> +static void x86_hv_synic_set(Object *obj, bool value, Error **errp)
> +{
> +    x86_hv_feature_set(obj, value, HYPERV_FEAT_SYNIC);
> +}
> +
> +static bool x86_hv_stimer_get(Object *obj, Error **errp)
> +{
> +    return x86_hv_feature_get(obj, HYPERV_FEAT_STIMER);
> +}
> +
> +static void x86_hv_stimer_set(Object *obj, bool value, Error **errp)
> +{
> +    x86_hv_feature_set(obj, value, HYPERV_FEAT_STIMER);
> +}
> +
> +static bool x86_hv_frequencies_get(Object *obj, Error **errp)
> +{
> +    return x86_hv_feature_get(obj, HYPERV_FEAT_FREQUENCIES);
> +}
> +
> +static void x86_hv_frequencies_set(Object *obj, bool value, Error **errp)
> +{
> +    x86_hv_feature_set(obj, value, HYPERV_FEAT_FREQUENCIES);
> +}
> +
> +static bool x86_hv_reenlightenment_get(Object *obj, Error **errp)
> +{
> +    return x86_hv_feature_get(obj, HYPERV_FEAT_REENLIGHTENMENT);
> +}
> +
> +static void x86_hv_reenlightenment_set(Object *obj, bool value, Error **errp)
> +{
> +    x86_hv_feature_set(obj, value, HYPERV_FEAT_REENLIGHTENMENT);
> +}
> +
> +static bool x86_hv_tlbflush_get(Object *obj, Error **errp)
> +{
> +    return x86_hv_feature_get(obj, HYPERV_FEAT_TLBFLUSH);
> +}
> +
> +static void x86_hv_tlbflush_set(Object *obj, bool value, Error **errp)
> +{
> +    x86_hv_feature_set(obj, value, HYPERV_FEAT_TLBFLUSH);
> +}
> +
> +static bool x86_hv_evmcs_get(Object *obj, Error **errp)
> +{
> +    return x86_hv_feature_get(obj, HYPERV_FEAT_EVMCS);
> +}
> +
> +static void x86_hv_evmcs_set(Object *obj, bool value, Error **errp)
> +{
> +    x86_hv_feature_set(obj, value, HYPERV_FEAT_EVMCS);
> +}
> +
> +static bool x86_hv_ipi_get(Object *obj, Error **errp)
> +{
> +    return x86_hv_feature_get(obj, HYPERV_FEAT_IPI);
> +}
> +
> +static void x86_hv_ipi_set(Object *obj, bool value, Error **errp)
> +{
> +    x86_hv_feature_set(obj, value, HYPERV_FEAT_IPI);
> +}
> +
> +static bool x86_hv_stimer_direct_get(Object *obj, Error **errp)
> +{
> +    return x86_hv_feature_get(obj, HYPERV_FEAT_STIMER_DIRECT);
> +}
> +
> +static void x86_hv_stimer_direct_set(Object *obj, bool value, Error **errp)
> +{
> +    x86_hv_feature_set(obj, value, HYPERV_FEAT_STIMER_DIRECT);
> +}
> +
>  /* Generic getter for "feature-words" and "filtered-features" properties */
>  static void x86_cpu_get_feature_words(Object *obj, Visitor *v,
>                                        const char *name, void *opaque,
> @@ -7107,36 +7279,6 @@ static Property x86_cpu_properties[] = {
>  
>      DEFINE_PROP_UINT32("hv-spinlocks", X86CPU, hyperv_spinlock_attempts,
>                         HYPERV_SPINLOCK_NEVER_NOTIFY),
> -    DEFINE_PROP_BIT64("hv-relaxed", X86CPU, hyperv_features,
> -                      HYPERV_FEAT_RELAXED, 0),
> -    DEFINE_PROP_BIT64("hv-vapic", X86CPU, hyperv_features,
> -                      HYPERV_FEAT_VAPIC, 0),
> -    DEFINE_PROP_BIT64("hv-time", X86CPU, hyperv_features,
> -                      HYPERV_FEAT_TIME, 0),
> -    DEFINE_PROP_BIT64("hv-crash", X86CPU, hyperv_features,
> -                      HYPERV_FEAT_CRASH, 0),
> -    DEFINE_PROP_BIT64("hv-reset", X86CPU, hyperv_features,
> -                      HYPERV_FEAT_RESET, 0),
> -    DEFINE_PROP_BIT64("hv-vpindex", X86CPU, hyperv_features,
> -                      HYPERV_FEAT_VPINDEX, 0),
> -    DEFINE_PROP_BIT64("hv-runtime", X86CPU, hyperv_features,
> -                      HYPERV_FEAT_RUNTIME, 0),
> -    DEFINE_PROP_BIT64("hv-synic", X86CPU, hyperv_features,
> -                      HYPERV_FEAT_SYNIC, 0),
> -    DEFINE_PROP_BIT64("hv-stimer", X86CPU, hyperv_features,
> -                      HYPERV_FEAT_STIMER, 0),
> -    DEFINE_PROP_BIT64("hv-frequencies", X86CPU, hyperv_features,
> -                      HYPERV_FEAT_FREQUENCIES, 0),
> -    DEFINE_PROP_BIT64("hv-reenlightenment", X86CPU, hyperv_features,
> -                      HYPERV_FEAT_REENLIGHTENMENT, 0),
> -    DEFINE_PROP_BIT64("hv-tlbflush", X86CPU, hyperv_features,
> -                      HYPERV_FEAT_TLBFLUSH, 0),
> -    DEFINE_PROP_BIT64("hv-evmcs", X86CPU, hyperv_features,
> -                      HYPERV_FEAT_EVMCS, 0),
> -    DEFINE_PROP_BIT64("hv-ipi", X86CPU, hyperv_features,
> -                      HYPERV_FEAT_IPI, 0),
> -    DEFINE_PROP_BIT64("hv-stimer-direct", X86CPU, hyperv_features,
> -                      HYPERV_FEAT_STIMER_DIRECT, 0),
>      DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
>                              hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
>      DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
> @@ -7283,6 +7425,41 @@ static void x86_cpu_common_class_init(ObjectClass *oc, void *data)
>                                x86_cpu_get_crash_info_qom, NULL, NULL, NULL);
>  #endif
>  
> +    object_class_property_add_bool(oc, "hv-relaxed",
> +                                   x86_hv_relaxed_get, x86_hv_relaxed_set);
> +    object_class_property_add_bool(oc, "hv-vapic",
> +                                   x86_hv_vapic_get, x86_hv_vapic_set);
> +    object_class_property_add_bool(oc, "hv-time",
> +                                   x86_hv_time_get, x86_hv_time_set);
> +    object_class_property_add_bool(oc, "hv-crash",
> +                                   x86_hv_crash_get, x86_hv_crash_set);
> +    object_class_property_add_bool(oc, "hv-reset",
> +                                   x86_hv_reset_get, x86_hv_reset_set);
> +    object_class_property_add_bool(oc, "hv-vpindex",
> +                                   x86_hv_vpindex_get, x86_hv_vpindex_set);
> +    object_class_property_add_bool(oc, "hv-runtime",
> +                                   x86_hv_runtime_get, x86_hv_runtime_set);
> +    object_class_property_add_bool(oc, "hv-synic",
> +                                   x86_hv_synic_get, x86_hv_synic_set);
> +    object_class_property_add_bool(oc, "hv-stimer",
> +                                   x86_hv_stimer_get, x86_hv_stimer_set);
> +    object_class_property_add_bool(oc, "hv-frequencies",
> +                                   x86_hv_frequencies_get,
> +                                   x86_hv_frequencies_set);
> +    object_class_property_add_bool(oc, "hv-reenlightenment",
> +                                   x86_hv_reenlightenment_get,
> +                                   x86_hv_reenlightenment_set);
> +    object_class_property_add_bool(oc, "hv-tlbflush",
> +                                   x86_hv_tlbflush_get, x86_hv_tlbflush_set);
> +    object_class_property_add_bool(oc, "hv-evmcs",
> +                              x86_hv_evmcs_get,
> +                              x86_hv_evmcs_set);
> +    object_class_property_add_bool(oc, "hv-ipi",
> +                                   x86_hv_ipi_get, x86_hv_ipi_set);
> +    object_class_property_add_bool(oc, "hv-stimer-direct",
> +                                   x86_hv_stimer_direct_get,
> +                                   x86_hv_stimer_direct_set);
> +
>      for (w = 0; w < FEATURE_WORDS; w++) {
>          int bitnr;
>          for (bitnr = 0; bitnr < 64; bitnr++) {
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 7ea14822aab5..b4fbd46f0fc9 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1667,6 +1667,8 @@ struct X86CPU {
>      char *hyperv_vendor;
>      bool hyperv_synic_kvm_only;
>      uint64_t hyperv_features;
> +    uint64_t hyperv_features_on;
> +    uint64_t hyperv_features_off;
>      bool hyperv_passthrough;
>      OnOffAuto hyperv_no_nonarch_cs;
>      uint32_t hyperv_vendor_id[3];




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-11 17:35   ` Igor Mammedov
@ 2021-02-12  8:45     ` Vitaly Kuznetsov
  2021-02-12 14:12       ` Igor Mammedov
  0 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-12  8:45 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel, Eduardo Habkost

Igor Mammedov <imammedo@redhat.com> writes:

> On Wed, 10 Feb 2021 17:40:28 +0100
> Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
>> Sometimes we'd like to know which features were explicitly enabled and which
>> were explicitly disabled on the command line. E.g. it seems logical to handle
>> 'hv_passthrough,hv_feature=off' as "enable everything supported by the host
>> except for hv_feature" but this doesn't seem to be possible with the current
>> 'hyperv_features' bit array. Introduce 'hv_features_on'/'hv_features_off'
>> add-ons and track explicit enablement/disablement there.
>> 
>> Note, it doesn't seem to be possible to fill 'hyperv_features' array during
>> CPU creation time when 'hv-passthrough' is specified and we're running on
>> an older kernel without KVM_CAP_SYS_HYPERV_CPUID support. To get the list
>> of the supported Hyper-V features we need to actually create KVM VCPU and
>> this happens much later.
>
> seems to me that we are returning back to +-feat parsing, this time only for
> hyperv.
> I'm not sure I like it back, especially considering we are going to
> drop "-feat" priority for x86.
>
> now about impossible, see arm/kvm/virt, they create a 'sample' VCPU at KVM
> init time to probe for some CPU features in advance. You can use similar
> approach to prepare value for hyperv_features.
>

KVM_CAP_SYS_HYPERV_CPUID is supported since 5.11 and eventually it'll
make it to all kernels we care about so I'd really like to avoid any
'sample' CPUs for the time being. On/off parsing looks like a much
lesser evil.

-- 
Vitaly




* Re: [PATCH v4 17/21] i386: support 'hv-passthrough, hv-feature=off' on the command line
  2021-02-11 17:14   ` Igor Mammedov
@ 2021-02-12  8:49     ` Vitaly Kuznetsov
  2021-02-12  9:29       ` David Edmondson
  2021-02-12 13:52       ` Igor Mammedov
  0 siblings, 2 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-12  8:49 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: Paolo Bonzini, Marcelo Tosatti, qemu-devel, Eduardo Habkost

Igor Mammedov <imammedo@redhat.com> writes:

> On Wed, 10 Feb 2021 17:40:29 +0100
> Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
>> Currently, we support 'hv-passthrough,hv-feature=on' enablement, this
>> is supposed to mean "hv-feature is mandatory, don't start without it". Add
>> support for 'hv-passthrough,hv-feature=off' meaning "enable everything
>> supported by the host except for hv-feature".
>> 
>> While on it, make 'hv-passthrough' parse semantics in-line with other
>> options in qemu: when specified, it overrides what was previously set with
>> what's supported by the host. This can later be modified with 'hv-feature=on'/
>> 'hv-feature=off'.
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  target/i386/cpu.c     | 28 +++++++++++++++++++++++++++-
>>  target/i386/kvm/kvm.c |  4 ++++
>>  2 files changed, 31 insertions(+), 1 deletion(-)
>> 
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index e8a004c39d04..f8df2caed779 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -4725,6 +4725,29 @@ static void x86_hv_stimer_direct_set(Object *obj, bool value, Error **errp)
>>      x86_hv_feature_set(obj, value, HYPERV_FEAT_STIMER_DIRECT);
>>  }
>>  
>> +static bool x86_hv_passthrough_get(Object *obj, Error **errp)
>> +{
>> +    X86CPU *cpu = X86_CPU(obj);
>> +
>> +    return cpu->hyperv_passthrough;
>> +}
>> +
>> +static void x86_hv_passthrough_set(Object *obj, bool value, Error **errp)
>> +{
>> +    X86CPU *cpu = X86_CPU(obj);
>> +
>> +    cpu->hyperv_passthrough = value;
>> +
>> +    /* hv-passthrough overrides everything with what's supported by the host */
>> +    if (value) {
>> +        cpu->hyperv_features = 0;
>> +        cpu->hyperv_features_on = 0;
>> +        cpu->hyperv_features_off = 0;
>
> why do we have _on|_off fields?
>

You mean 'why do we have them at all' or 'why do we reset them here'?
For the former, we need to distinguish between
'hv-passthrough,hv-feature=off' and just 'hv-passthrough';
'hv-passthrough,hv-evmcs=on' and just 'hv-passthrough'. For the latter,
I'd like to implement the semantics you've asked for:
'hv-feature,hv-passthrough' == 'hv-passthrough'
(though I still see it as a gotcha for an unprepared user)
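
To make the ordering rules concrete, here is a minimal standalone sketch in plain C. It is not the QEMU code: 'struct hv_state', HV_BIT() and the three helpers are hypothetical stand-ins, and the way hv_expand() resolves the final set is an assumption based on the description above (explicit 'on' is mandatory, explicit 'off' is masked out of the host set).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HV_BIT(n) (UINT64_C(1) << (n))

/* Hypothetical stand-in for the Hyper-V fields of X86CPU. */
struct hv_state {
    uint64_t features;      /* currently requested feature bits */
    uint64_t features_on;   /* explicitly set to 'on' since the last reset */
    uint64_t features_off;  /* explicitly set to 'off' since the last reset */
    bool passthrough;
};

/* Models 'hv-feature=on|off': record the bit and the explicit request. */
static void hv_feature_set(struct hv_state *s, unsigned int feature, bool value)
{
    assert(feature < 64);
    if (value) {
        s->features |= HV_BIT(feature);
        s->features_on |= HV_BIT(feature);
        s->features_off &= ~HV_BIT(feature);
    } else {
        s->features &= ~HV_BIT(feature);
        s->features_on &= ~HV_BIT(feature);
        s->features_off |= HV_BIT(feature);
    }
}

/* Models 'hv-passthrough': when enabled, forget everything set before it. */
static void hv_passthrough_set(struct hv_state *s, bool value)
{
    s->passthrough = value;
    if (value) {
        s->features = 0;
        s->features_on = 0;
        s->features_off = 0;
    }
}

/*
 * Resolve the final feature set against what the host supports.
 * Returns false when a mandatory feature is missing on the host.
 */
static bool hv_expand(const struct hv_state *s, uint64_t host, uint64_t *out)
{
    uint64_t required = s->passthrough ? s->features_on : s->features;

    if (required & ~host) {
        return false;
    }
    *out = s->passthrough ? (host & ~s->features_off) : s->features;
    return true;
}
```

In this model 'hv-passthrough,hv-feature=off' drops the feature from the host set, while 'hv-feature=off,hv-passthrough' ends up identical to plain 'hv-passthrough' because the setter wipes the earlier request.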


>> +    }
>> +
>> +    return;
>> +}
>> +
>>  /* Generic getter for "feature-words" and "filtered-features" properties */
>>  static void x86_cpu_get_feature_words(Object *obj, Visitor *v,
>>                                        const char *name, void *opaque,
>> @@ -7281,7 +7304,6 @@ static Property x86_cpu_properties[] = {
>>                         HYPERV_SPINLOCK_NEVER_NOTIFY),
>>      DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
>>                              hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
>> -    DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
>>  
>>      DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),
>>      DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
>> @@ -7460,6 +7482,10 @@ static void x86_cpu_common_class_init(ObjectClass *oc, void *data)
>>                                     x86_hv_stimer_direct_get,
>>                                     x86_hv_stimer_direct_set);
>>  
>> +    object_class_property_add_bool(oc, "hv-passthrough",
>> +                                   x86_hv_passthrough_get,
>> +                                   x86_hv_passthrough_set);
>> +
>>      for (w = 0; w < FEATURE_WORDS; w++) {
>>          int bitnr;
>>          for (bitnr = 0; bitnr < 64; bitnr++) {
>> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
>> index 30013f0d7cee..fca088d4d3b5 100644
>> --- a/target/i386/kvm/kvm.c
>> +++ b/target/i386/kvm/kvm.c
>> @@ -1153,6 +1153,10 @@ static int hv_cpuid_check_and_set(CPUState *cs, int feature, Error **errp)
>>          return 0;
>>      }
>>  
>> +    if (cpu->hyperv_passthrough && (cpu->hyperv_features_off & BIT(feature))) {
>> +        return 0;
>> +    }
>> +
>>      deps = kvm_hyperv_properties[feature].dependencies;
>>      while (deps) {
>>          dep_feat = ctz64(deps);
>

-- 
Vitaly




* Re: [PATCH v4 20/21] i386: provide simple 'hv-default=on' option
  2021-02-11 17:23   ` Igor Mammedov
@ 2021-02-12  8:52     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-12  8:52 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: Paolo Bonzini, Marcelo Tosatti, qemu-devel, Eduardo Habkost

Igor Mammedov <imammedo@redhat.com> writes:

> On Wed, 10 Feb 2021 17:40:32 +0100
> Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
>> Enabling Hyper-V emulation for a Windows VM is a tiring experience as it
>> requires listing all currently supported enlightenments ("hv-*" CPU
>> features) explicitly. We do have 'hv-passthrough' mode enabling
>> everything but it can't be used in production as it prevents migration.
>> 
>> Introduce a simple 'hv-default=on' CPU flag enabling all currently supported
>> Hyper-V enlightenments. Later, when new enlightenments get implemented,
>> compat_props mechanism will be used to disable them for legacy machine types,
>> this will keep 'hv-default=on' configurations migratable.
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  docs/hyperv.txt   | 16 ++++++++++++---
>>  target/i386/cpu.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++
>>  target/i386/cpu.h |  3 +++
>>  3 files changed, 68 insertions(+), 3 deletions(-)
>> 
>> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
>> index 5df00da54fc4..a54c066cab09 100644
>> --- a/docs/hyperv.txt
>> +++ b/docs/hyperv.txt
>> @@ -17,10 +17,20 @@ compatible hypervisor and use Hyper-V specific features.
>>  
>>  2. Setup
>>  =========
>> -No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
>> -QEMU, individual enlightenments can be enabled through CPU flags, e.g:
>> +All currently supported Hyper-V enlightenments can be enabled by specifying
>> +'hv-default=on' CPU flag:
>>  
>> -  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
>> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default ...
>> +
>> +Alternatively, it is possible to do fine-grained enablement through CPU flags,
>> +e.g:
>> +
>> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-relaxed,hv-vpindex,hv-time ...
>> +
>> +It is also possible to disable individual enlightenments from the default list,
>> +this can be used for debugging purposes:
>> +
>> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default=on,hv-evmcs=off ...
>>  
>>  Sometimes there are dependencies between enlightenments, QEMU is supposed to
>>  check that the supplied configuration is sane.
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index f8df2caed779..013aa60272d8 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -4736,6 +4736,12 @@ static void x86_hv_passthrough_set(Object *obj, bool value, Error **errp)
>>  {
>>      X86CPU *cpu = X86_CPU(obj);
>>  
>> +    if (cpu->hyperv_default) {

^^^ here ^^^

>> +        error_setg(errp,
>> +                   "'hv-default' and 'hv-passthrough' are mutually exclusive");
>> +        return;
>> +    }
>> +
>>      cpu->hyperv_passthrough = value;
>>  
>>      /* hv-passthrough overrides everything with what's supported by the host */
>> @@ -4748,6 +4754,33 @@ static void x86_hv_passthrough_set(Object *obj, bool value, Error **errp)
>>      return;
>>  }
>>  
>> +static bool x86_hv_default_get(Object *obj, Error **errp)
>> +{
>> +    X86CPU *cpu = X86_CPU(obj);
>> +
>> +    return cpu->hyperv_default;
>> +}
>> +
>> +static void x86_hv_default_set(Object *obj, bool value, Error **errp)
>> +{
>> +    X86CPU *cpu = X86_CPU(obj);
>> +
>> +    if (cpu->hyperv_passthrough) {
>> +        error_setg(errp,
>> +                   "'hv-default' and 'hv-passthrough' are mutually exclusive");
> this check will work only half way, i.e.: hv-passthrough=on,hv-default=on|off
> (where off value looks a bit weird)
> but not the other way around: hv-default=on,hv-passthrough=on

The check in x86_hv_passthrough_set() covers the opposite scenario.

>
> were you thinking about the following error:
>   "hv-default can't be used after hv-passthrough was enabled"
>
> or if it is symmetric, then putting this check in realizefn() will do the job
> as both properties are processed by that time.

I can move the check there but I think the two checks I add here cover
what we need (and we don't need to care what to set 'hyperv_features' to
in the interim).
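
As a rough illustration of why the two setter-side checks are symmetric, here is a standalone sketch in plain C. The names ('struct hv_modes', modes_set_passthrough(), modes_set_default()) are hypothetical and errors are reduced to a returned string instead of Error **errp; only the order of the checks mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two mode flags in X86CPU. */
struct hv_modes {
    bool passthrough;
    bool hv_default;
};

static const char *const hv_excl_msg =
    "'hv-default' and 'hv-passthrough' are mutually exclusive";

/* Returns NULL on success, an error string otherwise. */
static const char *modes_set_passthrough(struct hv_modes *m, bool value)
{
    assert(m != NULL);
    if (m->hv_default) {
        return hv_excl_msg;   /* covers 'hv-default=on,hv-passthrough=...' */
    }
    m->passthrough = value;
    return NULL;
}

/* Returns NULL on success, an error string otherwise. */
static const char *modes_set_default(struct hv_modes *m, bool value)
{
    assert(m != NULL);
    if (m->passthrough) {
        return hv_excl_msg;   /* covers 'hv-passthrough=on,hv-default=...' */
    }
    m->hv_default = value;
    return NULL;
}
```

Note that in this shape the second option is rejected regardless of its value, so 'hv-passthrough=on,hv-default=off' fails as well; a single check at realize time would only see the final values and could let that case through.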

>
>> +        return;
>> +    }
>> +
>> +    cpu->hyperv_default = value;
>> +
>> +    /* hv-default overrides everything with the default set */
>> +    if (value) {
>> +        cpu->hyperv_features = cpu->hyperv_default_features;
>> +        cpu->hyperv_features_on = 0;
>> +        cpu->hyperv_features_off = 0;
>> +    }
>> +}
>> +
>>  /* Generic getter for "feature-words" and "filtered-features" properties */
>>  static void x86_cpu_get_feature_words(Object *obj, Visitor *v,
>>                                        const char *name, void *opaque,
>> @@ -7152,6 +7185,21 @@ static void x86_cpu_initfn(Object *obj)
>>      if (xcc->model) {
>>          x86_cpu_load_model(cpu, xcc->model);
>>      }
>> +
>> +    /* Hyper-V features enabled with 'hv-default=on' */
>> +    cpu->hyperv_default_features = BIT(HYPERV_FEAT_RELAXED) |
>> +        BIT(HYPERV_FEAT_VAPIC) | BIT(HYPERV_FEAT_TIME) |
>> +        BIT(HYPERV_FEAT_CRASH) | BIT(HYPERV_FEAT_RESET) |
>> +        BIT(HYPERV_FEAT_VPINDEX) | BIT(HYPERV_FEAT_RUNTIME) |
>> +        BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_STIMER) |
>> +        BIT(HYPERV_FEAT_FREQUENCIES) | BIT(HYPERV_FEAT_REENLIGHTENMENT) |
>> +        BIT(HYPERV_FEAT_TLBFLUSH) | BIT(HYPERV_FEAT_IPI) |
>> +        BIT(HYPERV_FEAT_STIMER_DIRECT);
>> +
>> +    /* Enlightened VMCS is only available on Intel/VMX */
>> +    if (kvm_hv_evmcs_available()) {
>> +        cpu->hyperv_default_features |= BIT(HYPERV_FEAT_EVMCS);
>> +    }
>>  }
>>  
>>  static int64_t x86_cpu_get_arch_id(CPUState *cs)
>> @@ -7486,6 +7534,10 @@ static void x86_cpu_common_class_init(ObjectClass *oc, void *data)
>>                                     x86_hv_passthrough_get,
>>                                     x86_hv_passthrough_set);
>>  
>> +    object_class_property_add_bool(oc, "hv-default",
>> +                              x86_hv_default_get,
>> +                              x86_hv_default_set);
>> +
>>      for (w = 0; w < FEATURE_WORDS; w++) {
>>          int bitnr;
>>          for (bitnr = 0; bitnr < 64; bitnr++) {
>> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
>> index b4fbd46f0fc9..59350e70fb51 100644
>> --- a/target/i386/cpu.h
>> +++ b/target/i386/cpu.h
>> @@ -1670,6 +1670,9 @@ struct X86CPU {
>>      uint64_t hyperv_features_on;
>>      uint64_t hyperv_features_off;
>>      bool hyperv_passthrough;
>> +    /* 'hv-default' enablement */
>> +    uint64_t hyperv_default_features;
>> +    bool hyperv_default;
>>      OnOffAuto hyperv_no_nonarch_cs;
>>      uint32_t hyperv_vendor_id[3];
>>      uint32_t hyperv_interface_id[4];
>

-- 
Vitaly




* Re: [PATCH v4 17/21] i386: support 'hv-passthrough, hv-feature=off' on the command line
  2021-02-12  8:49     ` Vitaly Kuznetsov
@ 2021-02-12  9:29       ` David Edmondson
  2021-02-12 13:52       ` Igor Mammedov
  1 sibling, 0 replies; 58+ messages in thread
From: David Edmondson @ 2021-02-12  9:29 UTC (permalink / raw)
  To: Vitaly Kuznetsov, Igor Mammedov
  Cc: Paolo Bonzini, Marcelo Tosatti, qemu-devel, Eduardo Habkost

On Friday, 2021-02-12 at 09:49:46 +01, Vitaly Kuznetsov wrote:

> Igor Mammedov <imammedo@redhat.com> writes:
>
>> On Wed, 10 Feb 2021 17:40:29 +0100
>> Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>>
>>> Currently, we support 'hv-passthrough,hv-feature=on' enablement, this
>>> is supposed to mean "hv-feature is mandatory, don't start without it". Add
>>> support for 'hv-passthrough,hv-feature=off' meaning "enable everything
>>> supported by the host except for hv-feature".
>>> 
>>> While on it, make 'hv-passthrough' parse semantics in-line with other
>>> options in qemu: when specified, it overrides what was previously set with
>>> what's supported by the host. This can later be modified with 'hv-feature=on'/
>>> 'hv-feature=off'.
>>> 
>>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>>> ---
>>>  target/i386/cpu.c     | 28 +++++++++++++++++++++++++++-
>>>  target/i386/kvm/kvm.c |  4 ++++
>>>  2 files changed, 31 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index e8a004c39d04..f8df2caed779 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -4725,6 +4725,29 @@ static void x86_hv_stimer_direct_set(Object *obj, bool value, Error **errp)
>>>      x86_hv_feature_set(obj, value, HYPERV_FEAT_STIMER_DIRECT);
>>>  }
>>>  
>>> +static bool x86_hv_passthrough_get(Object *obj, Error **errp)
>>> +{
>>> +    X86CPU *cpu = X86_CPU(obj);
>>> +
>>> +    return cpu->hyperv_passthrough;
>>> +}
>>> +
>>> +static void x86_hv_passthrough_set(Object *obj, bool value, Error **errp)
>>> +{
>>> +    X86CPU *cpu = X86_CPU(obj);
>>> +
>>> +    cpu->hyperv_passthrough = value;
>>> +
>>> +    /* hv-passthrough overrides everything with what's supported by the host */
>>> +    if (value) {
>>> +        cpu->hyperv_features = 0;
>>> +        cpu->hyperv_features_on = 0;
>>> +        cpu->hyperv_features_off = 0;
>>
>> why do we have _on|_off fields?
>>
>
> You mean 'why do we have them at all' or 'why do we reset them here'?
> For the former, we need to distinguish between
> 'hv-passthrough,hv-feature=off' and just 'hv-passthrough';
> 'hv-passthrough,hv-evmcs=on' and just 'hv-passthrough'. For the latter,
> I'd like to implement the semantics you've asked for:
> 'hv-feature,hv-passthrough' == 'hv-passthrough'
> (though I still see it as a gotcha for an unprepared user)

Either approach will confuse *someone*, I think.

This way at least behaves better if someone/something is composing the
feature strings via concatenation.

>>> +    }
>>> +
>>> +    return;
>>> +}
>>> +
>>>  /* Generic getter for "feature-words" and "filtered-features" properties */
>>>  static void x86_cpu_get_feature_words(Object *obj, Visitor *v,
>>>                                        const char *name, void *opaque,
>>> @@ -7281,7 +7304,6 @@ static Property x86_cpu_properties[] = {
>>>                         HYPERV_SPINLOCK_NEVER_NOTIFY),
>>>      DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
>>>                              hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
>>> -    DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
>>>  
>>>      DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),
>>>      DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
>>> @@ -7460,6 +7482,10 @@ static void x86_cpu_common_class_init(ObjectClass *oc, void *data)
>>>                                     x86_hv_stimer_direct_get,
>>>                                     x86_hv_stimer_direct_set);
>>>  
>>> +    object_class_property_add_bool(oc, "hv-passthrough",
>>> +                                   x86_hv_passthrough_get,
>>> +                                   x86_hv_passthrough_set);
>>> +
>>>      for (w = 0; w < FEATURE_WORDS; w++) {
>>>          int bitnr;
>>>          for (bitnr = 0; bitnr < 64; bitnr++) {
>>> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
>>> index 30013f0d7cee..fca088d4d3b5 100644
>>> --- a/target/i386/kvm/kvm.c
>>> +++ b/target/i386/kvm/kvm.c
>>> @@ -1153,6 +1153,10 @@ static int hv_cpuid_check_and_set(CPUState *cs, int feature, Error **errp)
>>>          return 0;
>>>      }
>>>  
>>> +    if (cpu->hyperv_passthrough && (cpu->hyperv_features_off & BIT(feature))) {
>>> +        return 0;
>>> +    }
>>> +
>>>      deps = kvm_hyperv_properties[feature].dependencies;
>>>      while (deps) {
>>>          dep_feat = ctz64(deps);
>>
>
> -- 
> Vitaly

dme.
-- 
J'aurais toujours faim de toi.



* Re: [PATCH v4 17/21] i386: support 'hv-passthrough, hv-feature=off' on the command line
  2021-02-12  8:49     ` Vitaly Kuznetsov
  2021-02-12  9:29       ` David Edmondson
@ 2021-02-12 13:52       ` Igor Mammedov
  1 sibling, 0 replies; 58+ messages in thread
From: Igor Mammedov @ 2021-02-12 13:52 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Marcelo Tosatti, qemu-devel, Eduardo Habkost

On Fri, 12 Feb 2021 09:49:46 +0100
Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> Igor Mammedov <imammedo@redhat.com> writes:
> 
> > On Wed, 10 Feb 2021 17:40:29 +0100
> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >  
> >> Currently, we support 'hv-passthrough,hv-feature=on' enablement, this
> >> is supposed to mean "hv-feature is mandatory, don't start without it". Add
> >> support for 'hv-passthrough,hv-feature=off' meaning "enable everything
> >> supported by the host except for hv-feature".
> >> 
> >> While on it, make 'hv-passthrough' parse semantics in-line with other
> >> options in qemu: when specified, it overrides what was previously set with
> >> what's supported by the host. This can later be modified with 'hv-feature=on'/
> >> 'hv-feature=off'.
> >> 
> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> >> ---
> >>  target/i386/cpu.c     | 28 +++++++++++++++++++++++++++-
> >>  target/i386/kvm/kvm.c |  4 ++++
> >>  2 files changed, 31 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> >> index e8a004c39d04..f8df2caed779 100644
> >> --- a/target/i386/cpu.c
> >> +++ b/target/i386/cpu.c
> >> @@ -4725,6 +4725,29 @@ static void x86_hv_stimer_direct_set(Object *obj, bool value, Error **errp)
> >>      x86_hv_feature_set(obj, value, HYPERV_FEAT_STIMER_DIRECT);
> >>  }
> >>  
> >> +static bool x86_hv_passthrough_get(Object *obj, Error **errp)
> >> +{
> >> +    X86CPU *cpu = X86_CPU(obj);
> >> +
> >> +    return cpu->hyperv_passthrough;
> >> +}
> >> +
> >> +static void x86_hv_passthrough_set(Object *obj, bool value, Error **errp)
> >> +{
> >> +    X86CPU *cpu = X86_CPU(obj);
> >> +
> >> +    cpu->hyperv_passthrough = value;
> >> +
> >> +    /* hv-passthrough overrides everything with what's supported by the host */
> >> +    if (value) {
> >> +        cpu->hyperv_features = 0;
> >> +        cpu->hyperv_features_on = 0;
> >> +        cpu->hyperv_features_off = 0;  
> >
> > why do we have _on|_off fields?
> >  
> 
> You mean 'why do we have them at all' or 'why do we reset them here'?
> For the former, we need to distinguish between
> 'hv-passthrough,hv-feature=off' and just 'hv-passthrough';
> 'hv-passthrough,hv-evmcs=on' and just 'hv-passthrough'. For the latter,

that's what I was asking about; eventually I found it buried in the kvm code

> I'd like to implement the semantics you've asked for:
> 'hv-feature,hv-passthrough' == 'hv-passthrough'
essentially you wrote your own hv-foo parser in kvm_hyperv_expand_features(),
which is a bit complicated for my taste.
With a scratch CPU you can simplify it and make it easier to read.

> (though I still see it as a gotcha for an unprepared user)
this way it at least works the same way as any other property.

 
> >> +    }
> >> +
> >> +    return;
> >> +}
> >> +
> >>  /* Generic getter for "feature-words" and "filtered-features" properties */
> >>  static void x86_cpu_get_feature_words(Object *obj, Visitor *v,
> >>                                        const char *name, void *opaque,
> >> @@ -7281,7 +7304,6 @@ static Property x86_cpu_properties[] = {
> >>                         HYPERV_SPINLOCK_NEVER_NOTIFY),
> >>      DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
> >>                              hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
> >> -    DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
> >>  
> >>      DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),
> >>      DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
> >> @@ -7460,6 +7482,10 @@ static void x86_cpu_common_class_init(ObjectClass *oc, void *data)
> >>                                     x86_hv_stimer_direct_get,
> >>                                     x86_hv_stimer_direct_set);
> >>  
> >> +    object_class_property_add_bool(oc, "hv-passthrough",
> >> +                                   x86_hv_passthrough_get,
> >> +                                   x86_hv_passthrough_set);
> >> +
> >>      for (w = 0; w < FEATURE_WORDS; w++) {
> >>          int bitnr;
> >>          for (bitnr = 0; bitnr < 64; bitnr++) {
> >> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> >> index 30013f0d7cee..fca088d4d3b5 100644
> >> --- a/target/i386/kvm/kvm.c
> >> +++ b/target/i386/kvm/kvm.c
> >> @@ -1153,6 +1153,10 @@ static int hv_cpuid_check_and_set(CPUState *cs, int feature, Error **errp)
> >>          return 0;
> >>      }
> >>  
> >> +    if (cpu->hyperv_passthrough && (cpu->hyperv_features_off & BIT(feature))) {
> >> +        return 0;
> >> +    }
> >> +
> >>      deps = kvm_hyperv_properties[feature].dependencies;
> >>      while (deps) {
> >>          dep_feat = ctz64(deps);  
> >  
> 




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-12  8:45     ` Vitaly Kuznetsov
@ 2021-02-12 14:12       ` Igor Mammedov
  2021-02-12 15:19         ` Vitaly Kuznetsov
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-02-12 14:12 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel, Eduardo Habkost

On Fri, 12 Feb 2021 09:45:52 +0100
Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> Igor Mammedov <imammedo@redhat.com> writes:
> 
> > On Wed, 10 Feb 2021 17:40:28 +0100
> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >  
> >> Sometimes we'd like to know which features were explicitly enabled and which
> >> were explicitly disabled on the command line. E.g. it seems logical to handle
> >> 'hv_passthrough,hv_feature=off' as "enable everything supported by the host
> >> except for hv_feature" but this doesn't seem to be possible with the current
> >> 'hyperv_features' bit array. Introduce 'hv_features_on'/'hv_features_off'
> >> add-ons and track explicit enablement/disablement there.
> >> 
> >> Note, it doesn't seem to be possible to fill 'hyperv_features' array during
> >> CPU creation time when 'hv-passthrough' is specified and we're running on
> >> an older kernel without KVM_CAP_SYS_HYPERV_CPUID support. To get the list
> >> of the supported Hyper-V features we need to actually create KVM VCPU and
> >> this happens much later.  
> >
> > seems to me that we are returning back to +-feat parsing, this time only for
> > hyperv.
> > I'm not sure I like it back, especially considering we are going to
> > drop "-feat" priority for x86.
> >
> > now about impossible, see arm/kvm/virt, they create a 'sample' VCPU at KVM
> > init time to probe for some CPU features in advance. You can use similar
> > approach to prepare value for hyperv_features.
> >  
> 
> KVM_CAP_SYS_HYPERV_CPUID is supported since 5.11 and eventually it'll
> make it to all kernels we care about so I'd really like to avoid any
> 'sample' CPUs for the time being. On/off parsing looks like a much
> lesser evil.
When the minimum kernel version supported by QEMU gets there, you can remove
the scratch CPU in QEMU (if hyperv remains its sole user).

Writing your own property parser like in this series is possible too,
but it adds extra fields to track state and hard-to-follow logic.
On top of that, it adds a lot of churn by switching hv_ features to dynamic
properties, which is not necessary if the scratch CPU approach is used.

Please try reusing the scratch CPU approach, see
  kvm_arm_get_host_cpu_features()
for an example. You will very likely end up with a simpler series,
compared to reinventing the wheel.

In prototype form it would look like:
  * kvm_init:
    hv_passthrough_cached = scratch_cpu->hyperv_features

  * property parsing time:
     x86_hv_passthrough_set()
       cpu->hyperv_features = hv_passthrough_cached

    all other features are handled by generic property parsing,
    you don't have to do any special handling for them.

  * cpu_realize()
     hv_expand() to check for dependencies, conflicts and
     availability of features.




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-12 14:12       ` Igor Mammedov
@ 2021-02-12 15:19         ` Vitaly Kuznetsov
  2021-02-12 15:26           ` Vitaly Kuznetsov
  2021-02-12 16:01           ` Igor Mammedov
  0 siblings, 2 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-12 15:19 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel, Eduardo Habkost

Igor Mammedov <imammedo@redhat.com> writes:

> On Fri, 12 Feb 2021 09:45:52 +0100
> Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
>> Igor Mammedov <imammedo@redhat.com> writes:
>> 
>> > On Wed, 10 Feb 2021 17:40:28 +0100
>> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>> >  
>> >> Sometimes we'd like to know which features were explicitly enabled and which
>> >> were explicitly disabled on the command line. E.g. it seems logical to handle
>> >> 'hv_passthrough,hv_feature=off' as "enable everything supported by the host
>> >> except for hv_feature" but this doesn't seem to be possible with the current
>> >> 'hyperv_features' bit array. Introduce 'hv_features_on'/'hv_features_off'
>> >> add-ons and track explicit enablement/disablement there.
>> >> 
>> >> Note, it doesn't seem to be possible to fill 'hyperv_features' array during
>> >> CPU creation time when 'hv-passthrough' is specified and we're running on
>> >> an older kernel without KVM_CAP_SYS_HYPERV_CPUID support. To get the list
>> >> of the supported Hyper-V features we need to actually create KVM VCPU and
>> >> this happens much later.  
>> >
>> > seems to me that we are returning back to +-feat parsing, this time only for
>> > hyperv.
>> > I'm not sure I like it back, especially considering we are going to
>> > drop "-feat" priority for x86.
>> >
>> > now about impossible, see arm/kvm/virt, they create a 'sample' VCPU at KVM
>> > init time to probe for some CPU features in advance. You can use similar
>> > approach to prepare value for hyperv_features.
>> >  
>> 
>> KVM_CAP_SYS_HYPERV_CPUID is supported since 5.11 and eventually it'll
>> make it to all kernels we care about so I'd really like to avoid any
>> 'sample' CPUs for the time being. On/off parsing looks like a much
>> lesser evil.
> When minimum supported by QEMU kernel version gets there, you can remove
> scratch CPU in QEMU (if hyperv will remain its sole user).
>
> writing your own property parser like in this series, is possible too
> but it adds extra fields to track state and hard to follow logic.
> On top it adds a lot of churn by switching hv_ features to dynamic
> properties, which is not necessary if scratch CPU approach is used.
>
> Please try reusing scratch CPU approach, see
>   kvm_arm_get_host_cpu_features()
> for an example. You will very likely end up with simpler series,
> compared to reinventing wheel.

Even if I do that (and I seriously doubt it's going to be easier than
just adding two 'u64's, kvm_arm_get_host_cpu_features() alone is 200
lines long) this is not going to give us what we need to distinguish
between

'hv-passthrough,hv-evmcs'

and 

'hv-passthrough'

when 'hv-evmcs' *is* supported by the host. When guest CPU lacks VMX we
don't want to enable it unless it was requested explicitly (former but
not the latter).

Moreover, instead of just adding two 'u64's we're now doing an ioctl
which can fail, be subject to limits,... Creating and destroying a CPU
is also slow. Sorry, I hardly see how this is better, maybe just from
'code purity' point of view.

-- 
Vitaly




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-12 15:19         ` Vitaly Kuznetsov
@ 2021-02-12 15:26           ` Vitaly Kuznetsov
  2021-02-12 16:05             ` Igor Mammedov
  2021-02-12 16:01           ` Igor Mammedov
  1 sibling, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-12 15:26 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel, Eduardo Habkost

Vitaly Kuznetsov <vkuznets@redhat.com> writes:

> Igor Mammedov <imammedo@redhat.com> writes:
>
>>
>> Please try reusing scratch CPU approach, see
>>   kvm_arm_get_host_cpu_features()
>> for an example. You will very likely end up with simpler series,
>> compared to reinventing wheel.
>
> Even if I do that (and I serioulsy doubt it's going to be easier than
> just adding two 'u64's, kvm_arm_get_host_cpu_features() alone is 200
> lines long) this is not going to give us what we need to distinguish
> between
>
> 'hv-passthrough,hv-evmcs'
>
> and 
>
> 'hv-passthrough'
>
> when 'hv-evmcs' *is* supported by the host. When guest CPU lacks VMX we
> don't want to enable it unless it was requested explicitly (former but
> not the later).

... and if for whatever reason we decide that this is also bad/not
needed, I can just drop patches 16-18 from the series (leaving
'hv-passthrough,hv-feature=off' problem to better times).

-- 
Vitaly




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-12 15:19         ` Vitaly Kuznetsov
  2021-02-12 15:26           ` Vitaly Kuznetsov
@ 2021-02-12 16:01           ` Igor Mammedov
  2021-02-15  8:53             ` Vitaly Kuznetsov
  1 sibling, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-02-12 16:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel, Eduardo Habkost

On Fri, 12 Feb 2021 16:19:24 +0100
Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> Igor Mammedov <imammedo@redhat.com> writes:
> 
> > On Fri, 12 Feb 2021 09:45:52 +0100
> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >  
> >> Igor Mammedov <imammedo@redhat.com> writes:
> >>   
> >> > On Wed, 10 Feb 2021 17:40:28 +0100
> >> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >> >    
> >> >> Sometimes we'd like to know which features were explicitly enabled and which
> >> >> were explicitly disabled on the command line. E.g. it seems logical to handle
> >> >> 'hv_passthrough,hv_feature=off' as "enable everything supported by the host
> >> >> except for hv_feature" but this doesn't seem to be possible with the current
> >> >> 'hyperv_features' bit array. Introduce 'hv_features_on'/'hv_features_off'
> >> >> add-ons and track explicit enablement/disablement there.
> >> >> 
> >> >> Note, it doesn't seem to be possible to fill 'hyperv_features' array during
> >> >> CPU creation time when 'hv-passthrough' is specified and we're running on
> >> >> an older kernel without KVM_CAP_SYS_HYPERV_CPUID support. To get the list
> >> >> of the supported Hyper-V features we need to actually create KVM VCPU and
> >> >> this happens much later.    
> >> >
> >> > seems to me that we are returning back to +-feat parsing, this time only for
> >> > hyperv.
> >> > I'm not sure I like it back, especially considering we are going to
> >> > drop "-feat" priority for x86.
> >> >
> >> > now about impossible, see arm/kvm/virt, they create a 'sample' VCPU at KVM
> >> > init time to probe for some CPU features in advance. You can use similar
> >> > approach to prepare value for hyperv_features.
> >> >    
> >> 
> >> KVM_CAP_SYS_HYPERV_CPUID is supported since 5.11 and eventually it'll
> >> make it to all kernels we care about so I'd really like to avoid any
> >> 'sample' CPUs for the time being. On/off parsing looks like a much
> >> lesser evil.  
> > When minimum supported by QEMU kernel version gets there, you can remove
> > scratch CPU in QEMU (if hyperv will remain its sole user).
> >
> > writing your own property parser like in this series, is possible too
> > but it adds extra fields to track state and hard to follow logic.
> > On top it adds a lot of churn by switching hv_ features to dynamic
> > properties, which is not necessary if scratch CPU approach is used.
> >
> > Please try reusing scratch CPU approach, see
> >   kvm_arm_get_host_cpu_features()
> > for an example. You will very likely end up with simpler series,
> > compared to reinventing wheel.  
> 
> Even if I do that (and I serioulsy doubt it's going to be easier than
> just adding two 'u64's, kvm_arm_get_host_cpu_features() alone is 200
It does a lot more than what you need; kvm_arm_create_scratch_host_vcpu(),
which it uses, will do the job, and even that could be made smaller
for the hv use case.

> lines long) this is not going to give us what we need to distinguish
> between
> 
> 'hv-passthrough,hv-evmcs'
> 
> and 
> 
> 'hv-passthrough'
> 
> when 'hv-evmcs' *is* supported by the host. When guest CPU lacks VMX we
> don't want to enable it unless it was requested explicitly (former but
> not the later).
Could you elaborate on that, i.e. why do we need to distinguish, and why
would we need evmcs without VMX if the user asked for it (will it be usable)?

> Moreover, instead of just adding two 'u64's we're now doing an ioctl
> which can fail, be subject to limits,... Creating and destroying a CPU
> is also slow. Sorry, I hardly see how this is better, maybe just from
> 'code purity' point of view.
readable and easy to maintain code is not a thing to neglect.





* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-12 15:26           ` Vitaly Kuznetsov
@ 2021-02-12 16:05             ` Igor Mammedov
  2021-02-15  8:56               ` Vitaly Kuznetsov
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-02-12 16:05 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel, Eduardo Habkost

On Fri, 12 Feb 2021 16:26:03 +0100
Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> Vitaly Kuznetsov <vkuznets@redhat.com> writes:
> 
> > Igor Mammedov <imammedo@redhat.com> writes:
> >  
> >>
> >> Please try reusing scratch CPU approach, see
> >>   kvm_arm_get_host_cpu_features()
> >> for an example. You will very likely end up with simpler series,
> >> compared to reinventing wheel.  
> >
> > Even if I do that (and I serioulsy doubt it's going to be easier than
> > just adding two 'u64's, kvm_arm_get_host_cpu_features() alone is 200
> > lines long) this is not going to give us what we need to distinguish
> > between
> >
> > 'hv-passthrough,hv-evmcs'
> >
> > and 
> >
> > 'hv-passthrough'
> >
> > when 'hv-evmcs' *is* supported by the host. When guest CPU lacks VMX we
> > don't want to enable it unless it was requested explicitly (former but
> > not the later).  
> 
> ... and if for whatever reason we decide that this is also bad/not
> needed, I can just drop patches 16-18 from the series (leaving
> 'hv-passthrough,hv-feature=off' problem to better times).
That's also an option;
we would need to make sure that hv-passthrough is mutually exclusive
with ''all'' other hv- properties to avoid the above combination being
ever (mis)used.




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-12 16:01           ` Igor Mammedov
@ 2021-02-15  8:53             ` Vitaly Kuznetsov
  2021-02-15 10:48               ` Andrew Jones
  2021-02-15 17:01               ` Igor Mammedov
  0 siblings, 2 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-15  8:53 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel, Eduardo Habkost

Igor Mammedov <imammedo@redhat.com> writes:

>> >
>> > Please try reusing scratch CPU approach, see
>> >   kvm_arm_get_host_cpu_features()
>> > for an example. You will very likely end up with simpler series,
>> > compared to reinventing wheel.  
>> 
>> Even if I do that (and I serioulsy doubt it's going to be easier than
>> just adding two 'u64's, kvm_arm_get_host_cpu_features() alone is 200
> it does a lot more then what you need, kvm_arm_create_scratch_host_vcpu()
> which it uses will do the job and even that could be made smaller
> for hv usecase.
>
>> lines long) this is not going to give us what we need to distinguish
>> between
>> 
>> 'hv-passthrough,hv-evmcs'
>> 
>> and 
>> 
>> 'hv-passthrough'
>> 
>> when 'hv-evmcs' *is* supported by the host. When guest CPU lacks VMX we
>> don't want to enable it unless it was requested explicitly (former but
>> not the later).
> could you elaborate more on it, i.e. why do we need to distinguish and why
> do we need evmcs without VMX if user asked for it (will it be usable)
>

We need to distinguish because that would be sane.

Enlightened VMCS is an extension to VMX, it can't be used without
it. Genuine Hyper-V doesn't have a knob for enabling and disabling it,
it comes with nesting (-ExposeVirtualizationExtensions $true). When we
create a default set of Hyper-V enlightenments (either 'hv-default' or
'hv-passthrough') we should be as close as possible to genuine Hyper-V
to not create unsupported Frankensteins which can break with any Windows
update (because nobody tested these configurations). That being said, if
guest CPU lacks VMX it is counter-productive to expose EVMCS. However,
there is a problem with explicit enablement: what should

'hv-passthrough,hv-evmcs' option do? Just silently drop EVMCS? Doesn't
sound sane to me.

>> Moreover, instead of just adding two 'u64's we're now doing an ioctl
>> which can fail, be subject to limits,... Creating and destroying a CPU
>> is also slow. Sorry, I hardly see how this is better, maybe just from
>> 'code purity' point of view.
> readable and easy to maintain code is not a thing to neglect.

Of course, but the 'scratch CPU' idea is not a good design decision, it is an
ugly hack we should get rid of in ARM land, not try bringing it to other
architectures. Generally, KVM should allow querying all its capabilities
without the need to create a vCPU or, if not possible, we should create
'real' QEMU VCPUs and use one/all of them to query capabilities, avoiding
'scratch' ones because:
- Creating and destroying a vCPU makes VM startup slower, much
slower. E.g. for a single-CPU VM you're doubling the time required to
create vCPUs!
- vCPUs in KVM are quite memory consuming. Just 'struct kvm_vcpu_arch'
was something like 12kb last time I looked at it. 

I have no clue why scratch vCPUs were implemented on ARM, however, I'd
very much want us to avoid doing the same on x86. We do have use-cases
where startup time and consumed memory are important. There is a point in
limiting ioctls for security reasons (e.g. if I'm creating a single vCPU
VM I may want to limit userspace process to one and only one
KVM_CREATE_VCPU call).

Now to the code you complain about. The 'hard to read and maintain' code
is literally this:

+static void x86_hv_feature_set(Object *obj, bool value, int feature)
+{
+    X86CPU *cpu = X86_CPU(obj);
+
+    if (value) {
+        cpu->hyperv_features |= BIT(feature);
+        cpu->hyperv_features_on |= BIT(feature);
+        cpu->hyperv_features_off &= ~BIT(feature);
+    } else {
+        cpu->hyperv_features &= ~BIT(feature);
+        cpu->hyperv_features_on &= ~BIT(feature);
+        cpu->hyperv_features_off |= BIT(feature);
+    }
+}

I can add as many comments here as needed, however, I don't see what
requires additional explanation. We just want to know two things:
- What's the 'effective' setting of the control
- Was it explicitly enabled or disabled on the command line.
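
To illustrate the two questions above, here is a minimal standalone sketch.
The setter mirrors the one quoted from the patch; SketchCPU, HV_BIT and
sketch_hv_passthrough_apply() are hypothetical names, and the merge rule in
the latter is only an assumption about how the three masks could be
combined, not necessarily what the series does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HV_BIT(n) (UINT64_C(1) << (n))

typedef struct {
    uint64_t hyperv_features;       /* effective setting */
    uint64_t hyperv_features_on;    /* explicitly enabled on the command line */
    uint64_t hyperv_features_off;   /* explicitly disabled on the command line */
} SketchCPU;

/* Same bookkeeping as x86_hv_feature_set() above, on a standalone struct. */
static void sketch_hv_feature_set(SketchCPU *cpu, bool value, int feature)
{
    assert(cpu);
    if (value) {
        cpu->hyperv_features |= HV_BIT(feature);
        cpu->hyperv_features_on |= HV_BIT(feature);
        cpu->hyperv_features_off &= ~HV_BIT(feature);
    } else {
        cpu->hyperv_features &= ~HV_BIT(feature);
        cpu->hyperv_features_on &= ~HV_BIT(feature);
        cpu->hyperv_features_off |= HV_BIT(feature);
    }
}

/*
 * Assumed merge rule for 'hv-passthrough': take everything the host offers,
 * minus what was explicitly disabled, plus what was explicitly enabled.
 */
static void sketch_hv_passthrough_apply(SketchCPU *cpu, uint64_t host_supported)
{
    assert(cpu);
    cpu->hyperv_features =
        (host_supported & ~cpu->hyperv_features_off) | cpu->hyperv_features_on;
}
```

With this bookkeeping, 'hv-passthrough,hv-feature=off' ends up as "everything
the host supports except hv-feature", independent of when the host set
becomes known.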

Custom parsers are not new in QEMU and they're not going anywhere I
believe. There are options with simple enablement and there are some with
additional considerations. Trying to make CPU objects somewhat 'special'
by forcing all options to be of type-1 (and thus crippling user
experience) is not the way to go IMHO. I'd very much like us to go in
another direction, make our option parser better so my very simple
use-case is covered 'out-of-the-box'.

-- 
Vitaly




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-12 16:05             ` Igor Mammedov
@ 2021-02-15  8:56               ` Vitaly Kuznetsov
  2021-02-15 15:55                 ` Igor Mammedov
  0 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-15  8:56 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel, Eduardo Habkost

Igor Mammedov <imammedo@redhat.com> writes:

> On Fri, 12 Feb 2021 16:26:03 +0100
> Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
>> Vitaly Kuznetsov <vkuznets@redhat.com> writes:
>> 
>> > Igor Mammedov <imammedo@redhat.com> writes:
>> >  
>> >>
>> >> Please try reusing scratch CPU approach, see
>> >>   kvm_arm_get_host_cpu_features()
>> >> for an example. You will very likely end up with simpler series,
>> >> compared to reinventing wheel.  
>> >
>> > Even if I do that (and I serioulsy doubt it's going to be easier than
>> > just adding two 'u64's, kvm_arm_get_host_cpu_features() alone is 200
>> > lines long) this is not going to give us what we need to distinguish
>> > between
>> >
>> > 'hv-passthrough,hv-evmcs'
>> >
>> > and 
>> >
>> > 'hv-passthrough'
>> >
>> > when 'hv-evmcs' *is* supported by the host. When guest CPU lacks VMX we
>> > don't want to enable it unless it was requested explicitly (former but
>> > not the later).  
>> 
>> ... and if for whatever reason we decide that this is also bad/not
>> needed, I can just drop patches 16-18 from the series (leaving
>> 'hv-passthrough,hv-feature=off' problem to better times).
> that's also an option,
> we would need to make sure that hv-passthrough is mutually exclusive
> with ''all'' other hv- properties to avoid above combination being
> ever (mis)used.

That's an option to finally get these patches merged, not a good option
for end users. 

'hv-passthrough,hv-feature' works today and it's useful. Should we drop
it?

'hv-passthrough/hv-default' and 'hv-passthrough/hv-default,hv-evmcs'
should give us sane results.

'hv-passthrough,hv-feature=off' is convenient.

Why drop all of this? To save 9 (nine) lines of code in the parser?

-- 
Vitaly




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-15  8:53             ` Vitaly Kuznetsov
@ 2021-02-15 10:48               ` Andrew Jones
  2021-02-15 17:01               ` Igor Mammedov
  1 sibling, 0 replies; 58+ messages in thread
From: Andrew Jones @ 2021-02-15 10:48 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Igor Mammedov, Marcelo Tosatti, qemu-devel, Eduardo Habkost,
	Paolo Bonzini

On Mon, Feb 15, 2021 at 09:53:50AM +0100, Vitaly Kuznetsov wrote:
> I have no clue why scratch vCPUs were implemented on ARM, however, I'd

We don't have an ioctl like KVM_GET_SUPPORTED_CPUID, which operates on
the KVM fd. Perhaps we should.

Thanks,
drew




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-15  8:56               ` Vitaly Kuznetsov
@ 2021-02-15 15:55                 ` Igor Mammedov
  2021-02-15 17:05                   ` Igor Mammedov
  2021-02-15 18:12                   ` Vitaly Kuznetsov
  0 siblings, 2 replies; 58+ messages in thread
From: Igor Mammedov @ 2021-02-15 15:55 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel, Eduardo Habkost

On Mon, 15 Feb 2021 09:56:19 +0100
Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> Igor Mammedov <imammedo@redhat.com> writes:
> 
> > On Fri, 12 Feb 2021 16:26:03 +0100
> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >  
> >> Vitaly Kuznetsov <vkuznets@redhat.com> writes:
> >>   
> >> > Igor Mammedov <imammedo@redhat.com> writes:
> >> >    
> >> >>
> >> >> Please try reusing scratch CPU approach, see
> >> >>   kvm_arm_get_host_cpu_features()
> >> >> for an example. You will very likely end up with simpler series,
> >> >> compared to reinventing wheel.    
> >> >
> >> > Even if I do that (and I serioulsy doubt it's going to be easier than
> >> > just adding two 'u64's, kvm_arm_get_host_cpu_features() alone is 200
> >> > lines long) this is not going to give us what we need to distinguish
> >> > between
> >> >
> >> > 'hv-passthrough,hv-evmcs'
> >> >
> >> > and 
> >> >
> >> > 'hv-passthrough'
> >> >
> >> > when 'hv-evmcs' *is* supported by the host. When guest CPU lacks VMX we
> >> > don't want to enable it unless it was requested explicitly (former but
> >> > not the later).    
> >> 
> >> ... and if for whatever reason we decide that this is also bad/not
> >> needed, I can just drop patches 16-18 from the series (leaving
> >> 'hv-passthrough,hv-feature=off' problem to better times).  
> > that's also an option,
> > we would need to make sure that hv-passthrough is mutually exclusive
> > with ''all'' other hv- properties to avoid above combination being
> > ever (mis)used.  
> 
> That's an option to finally get these patches merged, not a good option
> for end users. 
> 
> 'hv-passthrough,hv-feature' works today and it's useful. Should we drop
> it?
Well,
try the suggested idea of using a scratch CPU and it might get merged sooner.
(It's not like I'm suggesting that you rewrite half of QEMU, just some of the
patches, which would most likely simplify the series from my point of view
and make it easier to maintain.)

> 
> 'hv-passthrough/hv-default' and 'hv-passthrough/hv-default,hv-evmcs'
> should give us sane results.
> 
> 'hv-passthrough,hv-feature=off' is convenient.
> 
> Why droppping this all? To save 9 (nine) lines of code in the parser? 
It's doing what generic property parsing is capable of, provided you
fish out the hv-passthrough value in advance like arm/virt does (I think ppc
does a similar thing also), so I consider it unnecessary code duplication/
complication and a maintenance burden.

If it were a hotfix during hard freeze, maybe I'd agree (with a promise to
rework it later into something more palatable), but it's not; for the patches in
the state they are in now, I'm not confident enough to ACK them.




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-15  8:53             ` Vitaly Kuznetsov
  2021-02-15 10:48               ` Andrew Jones
@ 2021-02-15 17:01               ` Igor Mammedov
  2021-02-15 18:11                 ` Vitaly Kuznetsov
  1 sibling, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-02-15 17:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel, Eduardo Habkost

On Mon, 15 Feb 2021 09:53:50 +0100
Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> Igor Mammedov <imammedo@redhat.com> writes:
> 
> >> >
> >> > Please try reusing scratch CPU approach, see
> >> >   kvm_arm_get_host_cpu_features()
> >> > for an example. You will very likely end up with simpler series,
> >> > compared to reinventing wheel.    
> >> 
> >> Even if I do that (and I serioulsy doubt it's going to be easier than
> >> just adding two 'u64's, kvm_arm_get_host_cpu_features() alone is 200  
> > it does a lot more then what you need, kvm_arm_create_scratch_host_vcpu()
> > which it uses will do the job and even that could be made smaller
> > for hv usecase.
> >  
> >> lines long) this is not going to give us what we need to distinguish
> >> between
> >> 
> >> 'hv-passthrough,hv-evmcs'
> >> 
> >> and 
> >> 
> >> 'hv-passthrough'
> >> 
> >> when 'hv-evmcs' *is* supported by the host. When guest CPU lacks VMX we
> >> don't want to enable it unless it was requested explicitly (former but
> >> not the later).  
> > could you elaborate more on it, i.e. why do we need to distinguish and why
> > do we need evmcs without VMX if user asked for it (will it be usable)
> >  
> 
> We need to distinguish because that would be sane.
> 
> Enlightened VMCS is an extension to VMX, it can't be used without
> it. Genuine Hyper-V doesn't have a knob for enabling and disabling it,
...
> That bein said, if
> guest CPU lacks VMX it is counter-productive to expose EVMCS. However,
> there is a problem with explicit enablement: what should
> 
> 'hv-passthrough,hv-evmcs' option do? Just silently drop EVMCS? Doesn't
> sound sane to me.
Based on the above, I'd error out if the user asks for an unsupported option,
i.e. no VMX -> no hv-evmcs - if explicitly asked -> error out

if later on we find usecase for VMX=off + hv-evmcs=on,
we will be able to drop error without affecting existing users,
but not other way around.

> >> Moreover, instead of just adding two 'u64's we're now doing an ioctl
> >> which can fail, be subject to limits,... Creating and destroying a CPU
> >> is also slow. Sorry, I hardly see how this is better, maybe just from
> >> 'code purity' point of view.  
> > readable and easy to maintain code is not a thing to neglect.  
> 
> Of couse, but 'scratch CPU' idea is not a good design decision, it is an
> ugly hack we should get rid of in ARM land, not try bringing it to other
> architectures. Generally, KVM should allow to query all its capabilities
> without the need to create a vCPU or, if not possible, we should create
> 'real' QEMU VCPUs and use one/all of the to query capabilities, avoiding
> 'scratch' because:
> - Creating and destroying a vCPU makes VM startup slower, much
> slower. E.g. for a single-CPU VM you're doubling the time required to
> create vCPUs!
> - vCPUs in KVM are quite memory consuming. Just 'struct kvm_vcpu_arch'
> was something like 12kb last time I looked at it. 
> 
> I have no clue why scratch vCPUs were implemented on ARM, however, I'd
> very much want us to avoid doing the same on x86. We do have use-cases
> where startup time and consumed memory is important. There is a point in
> limiting ioctls for security reasons (e.g. if I'm creating a single vCPU
> VM I may want to limit userspace process to one and only one
> KVM_CREATE_VCPU call).
it should be possible to reuse scratch VCPU (kvm file descriptor) as
the first CPU of VM, if there is a will/need, without creating unnecessary overhead.
I don't like the scratch CPU either, but from my pov it's a lesser evil than
spawning a custom parser every time someone feels like it.


> Now to the code you complain about. The 'hard to read and maintain' code
> is literaly this:
> 
> +static void x86_hv_feature_set(Object *obj, bool value, int feature)
> +{
> +    X86CPU *cpu = X86_CPU(obj);
> +
> +    if (value) {
> +        cpu->hyperv_features |= BIT(feature);
> +        cpu->hyperv_features_on |= BIT(feature);
> +        cpu->hyperv_features_off &= ~BIT(feature);
> +    } else {
> +        cpu->hyperv_features &= ~BIT(feature);
> +        cpu->hyperv_features_on &= ~BIT(feature);
> +        cpu->hyperv_features_off |= BIT(feature);
> +    }
> +}
It's not just that code but the rest that uses the above variables to
get the final hyperv_features feature set. There are a lot of invariants
that are hidden in the hv-specific code that you put in the hyperv kvm
specific part.

btw why can't we get the supported hyperv_features in passthrough mode
at the time we initialize KVM (without a vCPU)?

> I can add as many comments here as needed, however, I don't see what
> requires additional explanaition. We just want to know two things:
> - What's the 'effective' setting of the control
> - Was it explicitly enabled or disabled on the command line.
> 
> Custom parsers are not new in QEMU and they're not going anywhere I
> believe. There are options with simple enablent and there are some with
> additional considerations. Trying to make CPU objects somewhat 'special'
> by forcing all options to be of type-1 (and thus crippling user
> experience) is not the way to go IMHO. I'd very much like us to go in
> another direction, make our option parser better so my very simple
> use-case is covered 'out-of-the-box'.
A lot of effort is being spent on getting rid of the custom parsers that
QEMU has accumulated over the years. There were probably good reasons to add
them back then, and now someone else has to spend time cleaning them up.

The hyperv case is not special in that regard (at least I'm not convinced
at this point). Try the alternative(s) first; if that doesn't work out, then
a custom parser might be necessary.




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-15 15:55                 ` Igor Mammedov
@ 2021-02-15 17:05                   ` Igor Mammedov
  2021-02-15 18:12                   ` Vitaly Kuznetsov
  1 sibling, 0 replies; 58+ messages in thread
From: Igor Mammedov @ 2021-02-15 17:05 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel, Eduardo Habkost

On Mon, 15 Feb 2021 16:55:02 +0100
Igor Mammedov <imammedo@redhat.com> wrote:

> On Mon, 15 Feb 2021 09:56:19 +0100
> Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> 
> > Igor Mammedov <imammedo@redhat.com> writes:
> >   
> > > On Fri, 12 Feb 2021 16:26:03 +0100
> > > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> > >    
> > >> Vitaly Kuznetsov <vkuznets@redhat.com> writes:
> > >>     
> > >> > Igor Mammedov <imammedo@redhat.com> writes:
> > >> >      
[...]
> >(I think ppc  does similar hing also)

well scratch that off, I can't find PPC part anymore. Maybe
I've confused that with something else.

[...]




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-15 17:01               ` Igor Mammedov
@ 2021-02-15 18:11                 ` Vitaly Kuznetsov
  2021-02-22 10:20                   ` Vitaly Kuznetsov
  0 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-15 18:11 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel, Eduardo Habkost

Igor Mammedov <imammedo@redhat.com> writes:

>> 
>> We need to distinguish because that would be sane.
>> 
>> Enlightened VMCS is an extension to VMX, it can't be used without
>> it. Genuine Hyper-V doesn't have a knob for enabling and disabling it,
> ...
>> That bein said, if
>> guest CPU lacks VMX it is counter-productive to expose EVMCS. However,
>> there is a problem with explicit enablement: what should
>> 
>> 'hv-passthrough,hv-evmcs' option do? Just silently drop EVMCS? Doesn't
>> sound sane to me.
> based on above I'd error out is user asks for unsupported option
> i.e. no VMX -> no hv-evmcs - if explicitly asked -> error out

That's what I keep telling you but you don't seem to listen. 'Scratch
CPU' can't possibly help with this use-case because when you parse 

'hv-passthrough,hv-evmcs,vmx=off' you

1) "hv-passthrough" -> set EVMCS bit to '1' as it is supported by the
host.

2) 'hv-evmcs' -> keep EVMCS bit '1'

3) 'vmx=off' -> you have no idea where EVMCS bit came from.

We have to remember which options were acquired from the host and which
were set explicitly by the user. Ok, you can replace
'hyperv_features_on' with 'evmcs_was_explicitly_requested' but how is it
better?
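
A small sketch of why the provenance matters, under the assumption that
explicit requests are kept in a separate mask. The enum, the bit index and
sketch_resolve_evmcs() are made up for this example; the policy (silently
drop an inherited hv-evmcs, error out on an explicit one when the guest CPU
lacks VMX) is the one discussed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HV_BIT(n) (UINT64_C(1) << (n))
enum { SKETCH_HV_EVMCS = 0 };   /* hypothetical bit index */

typedef enum {
    SKETCH_EVMCS_DISABLED,  /* not exposed to the guest */
    SKETCH_EVMCS_ENABLED,   /* exposed to the guest */
    SKETCH_EVMCS_ERROR,     /* explicit request that cannot be honored */
} SketchEvmcsResult;

/*
 * features:      effective bits after all options were parsed
 * explicit_on:   bits the user enabled by name (must be a subset of features)
 * guest_has_vmx: whether the guest CPU model exposes VMX
 */
static SketchEvmcsResult sketch_resolve_evmcs(uint64_t features,
                                              uint64_t explicit_on,
                                              bool guest_has_vmx)
{
    bool wanted = (features & HV_BIT(SKETCH_HV_EVMCS)) != 0;
    bool asked = (explicit_on & HV_BIT(SKETCH_HV_EVMCS)) != 0;

    assert((explicit_on & ~features) == 0);

    if (!wanted) {
        return SKETCH_EVMCS_DISABLED;
    }
    if (guest_has_vmx) {
        return SKETCH_EVMCS_ENABLED;
    }
    /* No VMX: drop an inherited bit quietly, reject an explicit request. */
    return asked ? SKETCH_EVMCS_ERROR : SKETCH_EVMCS_DISABLED;
}
```

Without the explicit_on argument the first two cases ('hv-passthrough' and
'hv-passthrough,hv-evmcs', both with vmx=off) would arrive here with the same
features value and could not be told apart.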
 
>
> if later on we find usecase for VMX=off + hv-evmcs=on,
> we will be able to drop error without affecting existing users,
> but not other way around.
>
>> >> Moreover, instead of just adding two 'u64's we're now doing an ioctl
>> >> which can fail, be subject to limits,... Creating and destroying a CPU
>> >> is also slow. Sorry, I hardly see how this is better, maybe just from
>> >> 'code purity' point of view.  
>> > readable and easy to maintain code is not a thing to neglect.  
>> 
>> Of couse, but 'scratch CPU' idea is not a good design decision, it is an
>> ugly hack we should get rid of in ARM land, not try bringing it to other
>> architectures. Generally, KVM should allow to query all its capabilities
>> without the need to create a vCPU or, if not possible, we should create
>> 'real' QEMU VCPUs and use one/all of the to query capabilities, avoiding
>> 'scratch' because:
>> - Creating and destroying a vCPU makes VM startup slower, much
>> slower. E.g. for a single-CPU VM you're doubling the time required to
>> create vCPUs!
>> - vCPUs in KVM are quite memory consuming. Just 'struct kvm_vcpu_arch'
>> was something like 12kb last time I looked at it. 
>> 
>> I have no clue why scratch vCPUs were implemented on ARM, however, I'd
>> very much want us to avoid doing the same on x86. We do have use-cases
>> where startup time and consumed memory is important. There is a point in
>> limiting ioctls for security reasons (e.g. if I'm creating a single vCPU
>> VM I may want to limit userspace process to one and only one
>> KVM_CREATE_VCPU call).
> it should be possible to reuse scratch VCPU (kvm file descriptor) as
> the first CPU of VM, if there is a will/need, without creating unnecessary overhead.
> I don't like scratch CPU either but from my pov it's a lesser evil to
> spawning custom parser every time someone feels like it.

I respectfully disagree.

>
>
>> Now to the code you complain about. The 'hard to read and maintain' code
>> is literally this:
>> 
>> +static void x86_hv_feature_set(Object *obj, bool value, int feature)
>> +{
>> +    X86CPU *cpu = X86_CPU(obj);
>> +
>> +    if (value) {
>> +        cpu->hyperv_features |= BIT(feature);
>> +        cpu->hyperv_features_on |= BIT(feature);
>> +        cpu->hyperv_features_off &= ~BIT(feature);
>> +    } else {
>> +        cpu->hyperv_features &= ~BIT(feature);
>> +        cpu->hyperv_features_on &= ~BIT(feature);
>> +        cpu->hyperv_features_off |= BIT(feature);
>> +    }
>> +}
> It's not just that code but the rest that uses above variables to
> get final hyperv_features feature set. There is a lot of invariants
> that are hidden in hv specific code that you put in hyperv kvm
> specific part.

Could you give an example please?

>
> btw why can't we get supported hyperv_features in passthrough mode
> during time we initialize KVM (without a vCPU)?

I think I already explained that: KVM_GET_SUPPORTED_HV_CPUID works on
KVM fd from 5.11, it requires a vCPU prior to that.

>
>> I can add as many comments here as needed, however, I don't see what
>> requires additional explanation. We just want to know two things:
>> - What's the 'effective' setting of the control
>> - Was it explicitly enabled or disabled on the command line.
>> 
>> Custom parsers are not new in QEMU and they're not going anywhere I
>> believe. There are options with simple enablement and there are some with
>> additional considerations. Trying to make CPU objects somewhat 'special'
>> by forcing all options to be of type-1 (and thus crippling user
>> experience) is not the way to go IMHO. I'd very much like us to go in
>> another direction, make our option parser better so my very simple
>> use-case is covered 'out-of-the-box'.
> there is a lot of effort spent on getting rid of custom parsers that
> QEMU accumulated over years. Probably there were good reasons to add
> them back then, and now someone else has to spend time to clean them up.
>
> hyperv case is not any special in that regard (at least I'm not convinced
> at this point). Try alternative(s) first, if that doesn't work out, then
> custom parser might be necessary.

Please explain to me how 

'hv-passthrough,hv-evmcs,-vmx' is going to throw an error
and
'hv-passthrough,-vmx' is not.

-- 
Vitaly




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-15 15:55                 ` Igor Mammedov
  2021-02-15 17:05                   ` Igor Mammedov
@ 2021-02-15 18:12                   ` Vitaly Kuznetsov
  1 sibling, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-15 18:12 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel, Eduardo Habkost

Igor Mammedov <imammedo@redhat.com> writes:

> On Mon, 15 Feb 2021 09:56:19 +0100
> Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
>> Igor Mammedov <imammedo@redhat.com> writes:
>> 
>> > On Fri, 12 Feb 2021 16:26:03 +0100
>> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>> >  
>> >> Vitaly Kuznetsov <vkuznets@redhat.com> writes:
>> >>   
>> >> > Igor Mammedov <imammedo@redhat.com> writes:
>> >> >    
>> >> >>
>> >> >> Please try reusing scratch CPU approach, see
>> >> >>   kvm_arm_get_host_cpu_features()
>> >> >> for an example. You will very likely end up with simpler series,
>> >> >> compared to reinventing wheel.    
>> >> >
>> >> > Even if I do that (and I seriously doubt it's going to be easier than
>> >> > just adding two 'u64's, kvm_arm_get_host_cpu_features() alone is 200
>> >> > lines long) this is not going to give us what we need to distinguish
>> >> > between
>> >> >
>> >> > 'hv-passthrough,hv-evmcs'
>> >> >
>> >> > and 
>> >> >
>> >> > 'hv-passthrough'
>> >> >
>> >> > when 'hv-evmcs' *is* supported by the host. When guest CPU lacks VMX we
>> >> > don't want to enable it unless it was requested explicitly (former but
>> >> > not the latter).
>> >> 
>> >> ... and if for whatever reason we decide that this is also bad/not
>> >> needed, I can just drop patches 16-18 from the series (leaving
>> >> 'hv-passthrough,hv-feature=off' problem to better times).  
>> > that's also an option,
>> > we would need to make sure that hv-passthrough is mutually exclusive
>> > with ''all'' other hv- properties to avoid above combination being
>> > ever (mis)used.  
>> 
>> That's an option to finally get these patches merged, not a good option
>> for end users. 
>> 
>> 'hv-passthrough,hv-feature' works today and it's useful. Should we drop
>> it?
> well,
> try suggested idea about using scratch CPU and it might get merged sooner.
> (it's not like I'm suggesting you to rewrite half of QEMU, just some of
> patches, which most likely would simplify series from my point of view
> and would be easier to maintain)
>

I don't see anything in the series which will go away if I implement
this idea but as I hate it dearly I'm likely not going to.

-- 
Vitaly




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-15 18:11                 ` Vitaly Kuznetsov
@ 2021-02-22 10:20                   ` Vitaly Kuznetsov
  2021-02-23 15:19                     ` Igor Mammedov
  0 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-22 10:20 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel, Eduardo Habkost

Vitaly Kuznetsov <vkuznets@redhat.com> writes:

> Igor Mammedov <imammedo@redhat.com> writes:
>
>>> 
>>> We need to distinguish because that would be sane.
>>> 
>>> Enlightened VMCS is an extension to VMX, it can't be used without
>>> it. Genuine Hyper-V doesn't have a knob for enabling and disabling it,
>> ...
>>> That being said, if
>>> guest CPU lacks VMX it is counter-productive to expose EVMCS. However,
>>> there is a problem with explicit enablement: what should
>>> 
>>> 'hv-passthrough,hv-evmcs' option do? Just silently drop EVMCS? Doesn't
>>> sound sane to me.
>> based on above I'd error out if user asks for unsupported option
>> i.e. no VMX -> no hv-evmcs - if explicitly asked -> error out
>
> That's what I keep telling you but you don't seem to listen. 'Scratch
> CPU' can't possibly help with this use-case because when you parse 
>
> 'hv-passthrough,hv-evmcs,vmx=off' you
>
> 1) "hv-passthrough" -> set EVMCS bit to '1' as it is supported by the
> host.
>
> 2) 'hv-evmcs' -> keep EVMCS bit '1'
>
> 3) 'vmx=off' -> you have no idea where EVMCS bit came from.
>
> We have to remember which options were acquired from the host and which
> were set explicitly by the user. 

Igor,

could you please comment on the above? In case my line of thought is
correct, and it is impossible to distinguish between e.g.

'hv-passthrough,hv-evmcs,-vmx'
and
'hv-passthrough,-vmx'

without a custom parser (written just exactly the way I did in this
version, for example) regardless of when 'hv-passthrough' is
expanded. E.g. we have the exact same problem with
'hv-default,hv-evmcs,-vmx'. In that case I see no point in discussing
'scratch CPUs' idea at this point because it is not going to change
anything at all ('hv_features_on' will stay, custom parsers will stay).

Am I missing something?

-- 
Vitaly




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-22 10:20                   ` Vitaly Kuznetsov
@ 2021-02-23 15:19                     ` Igor Mammedov
  2021-02-23 15:46                       ` Vitaly Kuznetsov
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-02-23 15:19 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel, Eduardo Habkost

On Mon, 22 Feb 2021 11:20:34 +0100
Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> Vitaly Kuznetsov <vkuznets@redhat.com> writes:
> 
> > Igor Mammedov <imammedo@redhat.com> writes:
> >  
> >>> 
> >>> We need to distinguish because that would be sane.
> >>> 
> >>> Enlightened VMCS is an extension to VMX, it can't be used without
> >>> it. Genuine Hyper-V doesn't have a knob for enabling and disabling it,  
> >> ...  
> >>> That being said, if
> >>> guest CPU lacks VMX it is counter-productive to expose EVMCS. However,
> >>> there is a problem with explicit enablement: what should
> >>> 
> >>> 'hv-passthrough,hv-evmcs' option do? Just silently drop EVMCS? Doesn't
> >>> sound sane to me.  
> >> based on above I'd error out if user asks for unsupported option
> >> i.e. no VMX -> no hv-evmcs - if explicitly asked -> error out  
> >
> > That's what I keep telling you but you don't seem to listen. 'Scratch
> > CPU' can't possibly help with this use-case because when you parse 
> >
> > 'hv-passthrough,hv-evmcs,vmx=off' you
> >
> > 1) "hv-passthrough" -> set EVMCS bit to '1' as it is supported by the
> > host.
> >
> > 2) 'hv-evmcs' -> keep EVMCS bit '1'
> >
> > 3) 'vmx=off' -> you have no idea where EVMCS bit came from.
> >
> > We have to remember which options were acquired from the host and which
> > were set explicitly by the user.   
> 
> Igor,
> 
> could you please comment on the above? In case my line of thought is
> correct, and it is impossible to distinguish between e.g.
> 
> 'hv-passthrough,hv-evmcs,-vmx'
> and
> 'hv-passthrough,-vmx'
> 
> without a custom parser (written just exactly the way I did in this
> version, for example) regardless of when 'hv-passthrough' is
> expanded. E.g. we have the exact same problem with
> 'hv-default,hv-evmcs,-vmx'. In that case I see no point in discussing

right, if we need to distinguish between explicit and implicit hv-evmcs set by
hv-passthrough, a custom parser is probably the way to go.

However, do we actually need to do it?
I'd treat 'hv-passthrough,-vmx' the same way as 'hv-passthrough,hv-evmcs,-vmx'
and it applies not only to hv-evmcs but to other features hv-passthrough might set
(i.e. if whatever was [un]set by hv-passthrough in combination with other
features results in invalid config, QEMU shall error out instead of magically
altering host provided hv-passthrough value).

something like:
  'hv-passthrough,-vmx' when hv-passthrough makes hv-evmcs bit set
should result in
  error_setg(errp,"'vmx' feature can't be disabled when hv-evmcs is enabled,"
                 " either enable 'vmx' or disable 'hv-evmcs' along with disabling 'vmx'");

Making the host's feature set *magically* mutable, depending on other
user-provided features, is a bit confusing. One would never know what
hv-passthrough actually means, or whether enabling/disabling a 'random'
feature changes it.

It's cleaner to do just what user asked (whether implicitly or explicitly) and error out
in case it ends up in nonsense configuration.
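
A rough sketch of that check, as standalone C with hypothetical names
(not existing QEMU code): it looks only at the final feature set and
ignores where the EVMCS bit came from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit position, for illustration only. */
#define HV_EVMCS_BIT (UINT64_C(1) << 0)

/*
 * Validate the final configuration without altering it.
 * Returns NULL when the configuration is valid, otherwise an
 * error message describing the conflict.
 */
static const char *hv_check_config(uint64_t hv_features, bool guest_has_vmx)
{
    if ((hv_features & HV_EVMCS_BIT) && !guest_has_vmx) {
        return "'vmx' can't be disabled when 'hv-evmcs' is enabled";
    }
    return NULL;
}
```

Under this scheme 'hv-passthrough,-vmx' and 'hv-passthrough,hv-evmcs,-vmx'
both fail on a host that offers hv-evmcs, and neither command line changes
what hv-passthrough expands to.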

> 'scratch CPUs' idea at this point because it is not going to change
> anything at all ('hv_features_on' will stay, custom parsers will stay).
> 
> Am I missing something?
> 




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-23 15:19                     ` Igor Mammedov
@ 2021-02-23 15:46                       ` Vitaly Kuznetsov
  2021-02-23 17:48                         ` Igor Mammedov
  0 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-23 15:46 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel, Eduardo Habkost

Igor Mammedov <imammedo@redhat.com> writes:

> On Mon, 22 Feb 2021 11:20:34 +0100
> Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
>> Vitaly Kuznetsov <vkuznets@redhat.com> writes:
>> 
>> > Igor Mammedov <imammedo@redhat.com> writes:
>> >  
>> >>> 
>> >>> We need to distinguish because that would be sane.
>> >>> 
>> >>> Enlightened VMCS is an extension to VMX, it can't be used without
>> >>> it. Genuine Hyper-V doesn't have a knob for enabling and disabling it,  
>> >> ...  
>> >>> That being said, if
>> >>> guest CPU lacks VMX it is counter-productive to expose EVMCS. However,
>> >>> there is a problem with explicit enablement: what should
>> >>> 
>> >>> 'hv-passthrough,hv-evmcs' option do? Just silently drop EVMCS? Doesn't
>> >>> sound sane to me.  
>> >> based on above I'd error out if user asks for unsupported option
>> >> i.e. no VMX -> no hv-evmcs - if explicitly asked -> error out  
>> >
>> > That's what I keep telling you but you don't seem to listen. 'Scratch
>> > CPU' can't possibly help with this use-case because when you parse 
>> >
>> > 'hv-passthrough,hv-evmcs,vmx=off' you
>> >
>> > 1) "hv-passthrough" -> set EVMCS bit to '1' as it is supported by the
>> > host.
>> >
>> > 2) 'hv-evmcs' -> keep EVMCS bit '1'
>> >
>> > 3) 'vmx=off' -> you have no idea where EVMCS bit came from.
>> >
>> > We have to remember which options were acquired from the host and which
>> > were set explicitly by the user.   
>> 
>> Igor,
>> 
>> could you please comment on the above? In case my line of thought is
>> correct, and it is impossible to distinguish between e.g.
>> 
>> 'hv-passthrough,hv-evmcs,-vmx'
>> and
>> 'hv-passthrough,-vmx'
>> 
>> without a custom parser (written just exactly the way I did in this
>> version, for example) regardless of when 'hv-passthrough' is
>> expanded. E.g. we have the exact same problem with
>> 'hv-default,hv-evmcs,-vmx'. In that case I see no point in discussing
>
> right, if we need to distinguish between explicit and implicit hv-evmcs set by
> hv-passthrough, a custom parser is probably the way to go.
>
> However, do we actually need to do it?

I think we really need that. See below ...

> I'd treat 'hv-passthrough,-vmx' the same way as 'hv-passthrough,hv-evmcs,-vmx'
> and it applies not only hv-evmcs but other features hv-passthrough might set
> (i.e. if whatever was [un]set by hv-passthrough in combination with other
> features results in invalid config, QEMU shall error out instead of magically
> altering host provided hv-passthrough value).
>
> something like:
>   'hv-passthrough,-vmx' when hv-passthrough makes hv-evmcs bit set
> should result in
>   error_setg(errp,"'vmx' feature can't be disabled when hv-evmcs is enabled,"
>                  " either enable 'vmx' or disable 'hv-evmcs' along with disabling 'vmx'"
>
> making host's features set, *magically* mutable, depending on other user provided features
> is a bit confusing. One would never know what hv-passthrough actually means, and if
> enabling/disabling 'random' feature changes it.
>
> It's cleaner to do just what user asked (whether implicitly or explicitly) and error out
> in case it ends up in nonsense configuration.
>

I don't seem to agree this is a sane behavior, especially if you replace
'hv-passthrough' with 'hv-default' above. Removing 'vmx' from CPU for
Windows guests is common if you'd want to avoid nested configuration:
even without any Hyper-V guests created, Windows itself is a Hyper-V
partition.

So a sane user will do:

'-cpu host,hv-default,vmx=off' 

and on Intel he will get an error, and on AMD he won't. 

So what you're suggesting actually defeats the whole purpose of
'hv-default' as upper-layer tools (think libvirt) will need to know that
Intel configurations for Windows guests are somewhat different. They'll
need to know what 'hv-evmcs' is. We're back to where we've started.

If we are to follow this approach let's just throw away 'hv-evmcs' from
'hv-default' set, it's going to be much cleaner. But again, I don't
really believe it's the right way to go.

-- 
Vitaly




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-23 15:46                       ` Vitaly Kuznetsov
@ 2021-02-23 17:48                         ` Igor Mammedov
  2021-02-23 18:08                           ` Vitaly Kuznetsov
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-02-23 17:48 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel, Eduardo Habkost

On Tue, 23 Feb 2021 16:46:50 +0100
Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> Igor Mammedov <imammedo@redhat.com> writes:
> 
> > On Mon, 22 Feb 2021 11:20:34 +0100
> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >  
> >> Vitaly Kuznetsov <vkuznets@redhat.com> writes:
> >>   
> >> > Igor Mammedov <imammedo@redhat.com> writes:
> >> >    
> >> >>> 
> >> >>> We need to distinguish because that would be sane.
> >> >>> 
> >> >>> Enlightened VMCS is an extension to VMX, it can't be used without
> >> >>> it. Genuine Hyper-V doesn't have a knob for enabling and disabling it,    
> >> >> ...    
> >> >>> That being said, if
> >> >>> guest CPU lacks VMX it is counter-productive to expose EVMCS. However,
> >> >>> there is a problem with explicit enablement: what should
> >> >>> 
> >> >>> 'hv-passthrough,hv-evmcs' option do? Just silently drop EVMCS? Doesn't
> >> >>> sound sane to me.    
> >> >> based on above I'd error out if user asks for unsupported option
> >> >> i.e. no VMX -> no hv-evmcs - if explicitly asked -> error out    
> >> >
> >> > That's what I keep telling you but you don't seem to listen. 'Scratch
> >> > CPU' can't possibly help with this use-case because when you parse 
> >> >
> >> > 'hv-passthrough,hv-evmcs,vmx=off' you
> >> >
> >> > 1) "hv-passthrough" -> set EVMCS bit to '1' as it is supported by the
> >> > host.
> >> >
> >> > 2) 'hv-evmcs' -> keep EVMCS bit '1'
> >> >
> >> > 3) 'vmx=off' -> you have no idea where EVMCS bit came from.
> >> >
> >> > We have to remember which options were acquired from the host and which
> >> > were set explicitly by the user.     
> >> 
> >> Igor,
> >> 
> >> could you please comment on the above? In case my line of thought is
> >> correct, and it is impossible to distinguish between e.g.
> >> 
> >> 'hv-passthrough,hv-evmcs,-vmx'
> >> and
> >> 'hv-passthrough,-vmx'
> >> 
> >> without a custom parser (written just exactly the way I did in this
> >> version, for example) regardless of when 'hv-passthrough' is
> >> expanded. E.g. we have the exact same problem with
> >> 'hv-default,hv-evmcs,-vmx'. In that case I see no point in discussing
> >
> > right, if we need to distinguish between explicit and implicit hv-evmcs set by
> > hv-passthrough custom parser probably the way to go.
> >
> > However, do we actually need to do it?
> 
> I think we really need that. See below ...
> 
> > I'd treat 'hv-passthrough,-vmx' the same way as 'hv-passthrough,hv-evmcs,-vmx'
> > and it applies not only hv-evmcs but other features hv-passthrough might set
> > (i.e. if whatever was [un]set by hv-passthrough in combination with other
> > features results in invalid config, QEMU shall error out instead of magically
> > altering host provided hv-passthrough value).
> >
> > something like:
> >   'hv-passthrough,-vmx' when hv-passthrough makes hv-evmcs bit set
> > should result in
> >   error_setg(errp,"'vmx' feature can't be disabled when hv-evmcs is enabled,"
> >                  " either enable 'vmx' or disable 'hv-evmcs' along with disabling 'vmx'"
> >
> > making host's features set, *magically* mutable, depending on other user provided features
> > is a bit confusing. One would never know what hv-passthrough actually means, and if
> > enabling/disabling 'random' feature changes it.
> >
> > It's cleaner to do just what user asked (whether implicitly or explicitly) and error out
> > in case it ends up in nonsense configuration.
> >  
> 
> I don't seem to agree this is a sane behavior, especially if you replace
> 'hv-passthrough' with 'hv-default' above. Removing 'vmx' from CPU for
> Windows guests is common if you'd want to avoid nested configuration:
> even without any Hyper-V guests created, Windows itself is a Hyper-V
> partition.
> 
> So a sane user will do:
> 
> '-cpu host,hv-default,vmx=off' 
> 
> and on Intel he will get an error, and on AMD he won't. 
> 
> So what you're suggesting actually defeats the whole purpose of
> 'hv-default' as upper-layer tools (think libvirt) will need to know that
I'd assume it would be hard for libvirt to use 'hv-default' from a migration
point of view. It's semi-opaque (one can find out what features it sets
indirectly by inspecting individual hv_foo features, and mgmt will need to
know about them). If it mutates when other features are [un]set, upper
layers might need to enumerate all these permutations to know which hosts
are compatible or compare host feature sets every time before attempting
migration.

> Intel configurations for Windows guests are somewhat different. They'll
> need to know what 'hv-evmcs' is. We're back to where we've started.

we were talking about hv-passthrough, and if the host advertises hv-evmcs
QEMU should complain if the user disabled features it depends on
(not silently fix up the configuration error).
But the same applies to hv-default.

> If we are to follow this approach let's just throw away 'hv-evmcs' from
> 'hv-default' set, it's going to be much cleaner. But again, I don't
> really believe it's the right way to go.

if the desired behavior on an Intel host for the above config is to start
without error, then indeed the defaults should not set 'hv-evmcs' if it
results in an invalid feature set.




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-23 17:48                         ` Igor Mammedov
@ 2021-02-23 18:08                           ` Vitaly Kuznetsov
  2021-02-24 16:06                             ` Igor Mammedov
  0 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-23 18:08 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel, Eduardo Habkost

Igor Mammedov <imammedo@redhat.com> writes:

> On Tue, 23 Feb 2021 16:46:50 +0100
> Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
>> Igor Mammedov <imammedo@redhat.com> writes:
>> 
>> > On Mon, 22 Feb 2021 11:20:34 +0100
>> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>> >  
>> >> Vitaly Kuznetsov <vkuznets@redhat.com> writes:
>> >>   
>> >> > Igor Mammedov <imammedo@redhat.com> writes:
>> >> >    
>> >> >>> 
>> >> >>> We need to distinguish because that would be sane.
>> >> >>> 
>> >> >>> Enlightened VMCS is an extension to VMX, it can't be used without
>> >> >>> it. Genuine Hyper-V doesn't have a knob for enabling and disabling it,    
>> >> >> ...    
>> >> >>> That being said, if
>> >> >>> guest CPU lacks VMX it is counter-productive to expose EVMCS. However,
>> >> >>> there is a problem with explicit enablement: what should
>> >> >>> 
>> >> >>> 'hv-passthrough,hv-evmcs' option do? Just silently drop EVMCS? Doesn't
>> >> >>> sound sane to me.    
>> >> >> based on above I'd error out if user asks for unsupported option
>> >> >> i.e. no VMX -> no hv-evmcs - if explicitly asked -> error out    
>> >> >
>> >> > That's what I keep telling you but you don't seem to listen. 'Scratch
>> >> > CPU' can't possibly help with this use-case because when you parse 
>> >> >
>> >> > 'hv-passthrough,hv-evmcs,vmx=off' you
>> >> >
>> >> > 1) "hv-passthrough" -> set EVMCS bit to '1' as it is supported by the
>> >> > host.
>> >> >
>> >> > 2) 'hv-evmcs' -> keep EVMCS bit '1'
>> >> >
>> >> > 3) 'vmx=off' -> you have no idea where EVMCS bit came from.
>> >> >
>> >> > We have to remember which options were acquired from the host and which
>> >> > were set explicitly by the user.     
>> >> 
>> >> Igor,
>> >> 
>> >> could you please comment on the above? In case my line of thought is
>> >> correct, and it is impossible to distinguish between e.g.
>> >> 
>> >> 'hv-passthrough,hv-evmcs,-vmx'
>> >> and
>> >> 'hv-passthrough,-vmx'
>> >> 
>> >> without a custom parser (written just exactly the way I did in this
>> >> version, for example) regardless of when 'hv-passthrough' is
>> >> expanded. E.g. we have the exact same problem with
>> >> 'hv-default,hv-evmcs,-vmx'. In that case I see no point in discussing
>> >
>> > right, if we need to distinguish between explicit and implicit hv-evmcs set by
>> > hv-passthrough custom parser probably the way to go.
>> >
>> > However, do we actually need to do it?
>> 
>> I think we really need that. See below ...
>> 
>> > I'd treat 'hv-passthrough,-vmx' the same way as 'hv-passthrough,hv-evmcs,-vmx'
>> > and it applies not only hv-evmcs but other features hv-passthrough might set
>> > (i.e. if whatever was [un]set by hv-passthrough in combination with other
>> > features results in invalid config, QEMU shall error out instead of magically
>> > altering host provided hv-passthrough value).
>> >
>> > something like:
>> >   'hv-passthrough,-vmx' when hv-passthrough makes hv-evmcs bit set
>> > should result in
>> >   error_setg(errp,"'vmx' feature can't be disabled when hv-evmcs is enabled,"
>> >                  " either enable 'vmx' or disable 'hv-evmcs' along with disabling 'vmx'"
>> >
>> > making host's features set, *magically* mutable, depending on other user provided features
>> > is a bit confusing. One would never know what hv-passthrough actually means, and if
>> > enabling/disabling 'random' feature changes it.
>> >
>> > It's cleaner to do just what user asked (whether implicitly or explicitly) and error out
>> > in case it ends up in nonsense configuration.
>> >  
>> 
>> I don't seem to agree this is a sane behavior, especially if you replace
>> 'hv-passthrough' with 'hv-default' above. Removing 'vmx' from CPU for
>> Windows guests is common if you'd want to avoid nested configuration:
>> even without any Hyper-V guests created, Windows itself is a Hyper-V
>> partition.
>> 
>> So a sane user will do:
>> 
>> '-cpu host,hv-default,vmx=off' 
>> 
>> and on Intel he will get an error, and on AMD he won't. 
>> 
>> So what you're suggesting actually defeats the whole purpose of
>> 'hv-default' as upper-layer tools (think libvirt) will need to know that
> I'd assume it would be hard for libvirt to use 'hv-default' from migration
> point of view. It's semi opaque (one can find out what features it sets
> indirectly inspecting individual hv_foo features, and mgmt will need to
> know about them). If it will mutate when other features [un]set, upper
> layers might need to enumerate all these permutations to know which hosts
> are compatible or compare host feature sets every time before attempting
> migration.
>

That's exactly the opposite of the goal here, which is: make it
possible for upper layers to not know anything about Hyper-V
enlightenments besides 'hv-default'. Migration should work just fine: if
the rest of the guest configuration matches, then 'hv-default' will create
the exact same things (e.g. if 'vmx' was disabled on the source it has
to be disabled on the destination, it can't be different).


>> Intel configurations for Windows guests are somewhat different. They'll
>> need to know what 'hv-evmcs' is. We're back to where we've started.
>
> we were talking about hv-passthrough, and if host advertises hv-evmcs
> QEMU should complain if user disabled features it depends on (
> not silently fixing up configuration error).
> But the same applies to hv-default.

Let's forget about hv-passthrough completely for a while as this series
is kind of unrelated to it.

In the previous submission I was setting 'hv-evmcs' in 'hv-default' based
on host availability of the feature only. That is: set on Intel, unset on
AMD. We have to at least preserve that because it would be insane to
crash on

-cpu host,hv-default 

on AMD because AMD doesn't (and never will!) support hv-evmcs, right?

>
>> If we are to follow this approach let's just throw away 'hv-evmcs' from
>> 'hv-default' set, it's going to be much cleaner. But again, I don't
>> really believe it's the right way to go.
>
> if desired behavior, on Intel host for above config, to start without error
> then indeed defaults should not set 'hv-evmcs' if it results in invalid
> feature set.

This is problematic as it is still sane for everyone to enable it as it
gives a performance advantage. If we just for a second forget about custom
parsers and all that -- which is just an implementation detail, why can't
we tie the 'hv-evmcs' bit in 'hv-default' to 'vmx' 1:1?

Again, the end goal is: make it possible for upper layers to not know
anything about Hyper-V enlightenments other than 'hv-default'. Technically,
it is possible to make it work.
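
A minimal sketch of what '1:1' could look like, in standalone C with
made-up names and bit values (an assumption for illustration, not the
code in this series): the EVMCS bit in the 'hv-default' set simply
follows the guest CPU's 'vmx' setting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only. */
#define HV_EVMCS_BIT  (UINT64_C(1) << 0)
#define HV_OTHER_BITS (UINT64_C(0x6)) /* stand-in for the rest of the set */

/*
 * Expand 'hv-default' into a feature mask. EVMCS is included if and
 * only if the guest CPU has VMX, so callers never name 'hv-evmcs'.
 */
static uint64_t hv_default_expand(bool guest_has_vmx)
{
    uint64_t mask = HV_OTHER_BITS;

    if (guest_has_vmx) {
        mask |= HV_EVMCS_BIT;
    }
    return mask;
}
```

A guest CPU on AMD never has 'vmx', so the same expansion yields no
EVMCS there, and '-cpu host,hv-default,vmx=off' on Intel ends up with
the same set instead of an error.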

-- 
Vitaly




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-23 18:08                           ` Vitaly Kuznetsov
@ 2021-02-24 16:06                             ` Igor Mammedov
  2021-02-24 17:00                               ` Vitaly Kuznetsov
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-02-24 16:06 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel, Eduardo Habkost

On Tue, 23 Feb 2021 19:08:42 +0100
Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> Igor Mammedov <imammedo@redhat.com> writes:
> 
> > On Tue, 23 Feb 2021 16:46:50 +0100
> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >  
> >> Igor Mammedov <imammedo@redhat.com> writes:
> >>   
> >> > On Mon, 22 Feb 2021 11:20:34 +0100
> >> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >> >    
> >> >> Vitaly Kuznetsov <vkuznets@redhat.com> writes:
> >> >>     
> >> >> > Igor Mammedov <imammedo@redhat.com> writes:
> >> >> >      
> >> >> >>> 
> >> >> >>> We need to distinguish because that would be sane.
> >> >> >>> 
> >> >> >>> Enlightened VMCS is an extension to VMX, it can't be used without
> >> >> >>> it. Genuine Hyper-V doesn't have a knob for enabling and disabling it,      
> >> >> >> ...      
> >> >> >>> That being said, if
> >> >> >>> guest CPU lacks VMX it is counter-productive to expose EVMCS. However,
> >> >> >>> there is a problem with explicit enablement: what should
> >> >> >>> 
> >> >> >>> 'hv-passthrough,hv-evmcs' option do? Just silently drop EVMCS? Doesn't
> >> >> >>> sound sane to me.      
> >> >> >> based on above I'd error out if user asks for unsupported option
> >> >> >> i.e. no VMX -> no hv-evmcs - if explicitly asked -> error out      
> >> >> >
> >> >> > That's what I keep telling you but you don't seem to listen. 'Scratch
> >> >> > CPU' can't possibly help with this use-case because when you parse 
> >> >> >
> >> >> > 'hv-passthrough,hv-evmcs,vmx=off' you
> >> >> >
> >> >> > 1) "hv-passthrough" -> set EVMCS bit to '1' as it is supported by the
> >> >> > host.
> >> >> >
> >> >> > 2) 'hv-evmcs' -> keep EVMCS bit '1'
> >> >> >
> >> >> > 3) 'vmx=off' -> you have no idea where EVMCS bit came from.
> >> >> >
> >> >> > We have to remember which options were acquired from the host and which
> >> >> > were set explicitly by the user.       
> >> >> 
> >> >> Igor,
> >> >> 
> >> >> could you please comment on the above? In case my line of thought is
> >> >> correct, and it is impossible to distinguish between e.g.
> >> >> 
> >> >> 'hv-passthrough,hv-evmcs,-vmx'
> >> >> and
> >> >> 'hv-passthrough,-vmx'
> >> >> 
> >> >> without a custom parser (written just exactly the way I did in this
> >> >> version, for example) regardless of when 'hv-passthrough' is
> >> >> expanded. E.g. we have the exact same problem with
> >> >> 'hv-default,hv-evmcs,-vmx'. In that case I see no point in discussing
> >> >
> >> > right, if we need to distinguish between explicit and implicit hv-evmcs set by
> >> > hv-passthrough custom parser probably the way to go.
> >> >
> >> > However do we need actually need to do it?    
> >> 
> >> I think we really need that. See below ...
> >>   
> >> > I'd treat 'hv-passthrough,-vmx' the same way as 'hv-passthrough,hv-evmcs,-vmx'
> >> > and it applies not only hv-evmcs but other features hv-passthrough might set
> >> > (i.e. if whatever was [un]set by hv-passthrough in combination with other
> >> > features results in invalid config, QEMU shall error out instead of magically
> >> > altering host provided hv-passthrough value).
> >> >
> >> > something like:
> >> >   'hv-passthrough,-vmx' when hv-passthrough makes hv-evmcs bit set
> >> > should result in
> >> >   error_setg(errp,"'vmx' feature can't be disabled when hv-evmcs is enabled,"
> >> >                  " either enable 'vmx' or disable 'hv-evmcs' along with disabling 'vmx'"
> >> >
> >> > making host's features set, *magically* mutable, depending on other user provided features
> >> > is a bit confusing. One would never know what hv-passthrough actually means, and if
> >> > enabling/disabling 'random' feature changes it.
> >> >
> >> > It's cleaner to do just what user asked (whether implicitly or explicitly) and error out
> >> > in case it ends up in nonsense configuration.
> >> >    
> >> 
> >> I don't seem to agree this is a sane behavior, especially if you replace
> >> 'hv-passthrough' with 'hv-default' above. Removing 'vmx' from CPU for
> >> Windows guests is common if you'd want to avoid nested configuration:
> >> even without any Hyper-V guests created, Windows itself is a Hyper-V
> >> partition.
> >> 
> >> So a sane user will do:
> >> 
> >> '-cpu host,hv-default,vmx=off' 
> >> 
> >> and on Intel he will get an error, and on AMD he won't. 
> >> 
> >> So what you're suggesting actually defeats the whole purpose of
> >> 'hv-default' as upper-layer tools (think libvirt) will need to know that  
> > I'd assume it would be hard for libvirt to use 'hv-default' from migration
> > point of view. It's semi opaque (one can find out what features it sets
> > indirectly inspecting individual hv_foo features, and mgmt will need to
> > know about them). If it will mutate when other features [un]set, upper
> > layers might need to enumerate all these permutations to know which hosts
> > are compatible or compare host feature sets every time before attempting
> > migration.
> 
> That's exactly the opposite of what's the goal here which is: make it
> possible for upper layers to not know anything about Hyper-V
> enlightenments besides 'hv-default'. Migration should work just fine, if
> the rest of guest configuration matches -- then 'hv-default' will create
> the exact same things (e.g. if 'vmx' was disabled on the source it has
            ^^^^^
I'm not convinced of that yet (not with the current impl.; more on that at the end of this reply)

> to be enabled on the destination, it can't be different)
> 
> 
> >> Intel configurations for Windows guests are somewhat different. They'll
> >> need to know what 'hv-evmcs' is. We're back to where we've started.  
> >
> > we were talking about hv-passthrough, and if host advertises hv-evmcs
> > QEMU should complain if user disabled features it depends on (
> > not silently fixing up configuration error).
> > But the same applies to hv-default.  
> 
> Let's forget about hv-passthrough completely for a while as this series
> is kind of unrelated to it.

It adds a lot of unrelated code (not just a couple of lines).
I've played with the scratch CPU idea; here is a demo of it:
https://github.com/imammedo/qemu/commit/a4b107d5368ebf72d45082bc8310a6b88a4ba6fb
I didn't rework the caps/cpuid querying parts (just hacked around them),
and even without that it saves us ~200 LOC (not a small part of which comes
with this series).
I also split the horrible hv_cpuid_check_and_set() into separate 'set' and 'check' stages.
Granted, it was sort of pre-existing ugly code; some of your
refactoring made it a bit better, but it's still far from readable.
 
> In the previous submission I was setting 'hv-default' based on host
> availability of the feature only. That is: set on Intel, unset on
> AMD. We have to at least preserve that because it would be insane to
> crash on
> 
> -cpu host,hv-default 
> 
> on AMD because AMD doesn't (and never will!) support hv-evmcs, right?

If QEMU prevents cross arch migration, i.e. it's not supported,
then I guess we can make hv-default differ depending on an AMD or Intel host.
If not, then we might need to be conservative, i.e. exclude hv-evmcs from the defaults.

> >> If we are to follow this approach let's just throw away 'hv-evmcs' from
> >> 'hv-default' set, it's going to be much cleaner. But again, I don't
> >> really believe it's the right way to go.  
> >
> > if desired behavior, on Intel host for above config, to start without error
> > then indeed defaults should not set 'hv-evmcs' if it results in invalid
> > feature set.  
> 
> This is problematic as it is still sane for everyone to enable it as it
> gives performance advantage. If we just for a second forget about custom
  "
    > >> So a sane user will do:
    > >> 
    > >> '-cpu host,hv-default,vmx=off'
  "
  it's not easy picking defaults.

> parsers and all that -- which is just an implementation detail, why can't
> we tie 'hv-evmcs' bit in 'hv-default' to 'vmx' 1:1?
Migration-wise I don't see issues wrt vmx=off turning off hv-evmcs,
however ...

We have been replacing user input fixups with hard errors that ask the
user to fix the CLI, and removing custom parsers in favor of generic ones.

In the vmx=off case we would be fixing up what 'hv-default' explicitly set.
The same applies to other hv-foo features set by hv-default.

E.g. 'hv-default,hv-dep1=off' turns off a feature that another hv feature
in the hv-default set depends on, and it will error out;
the same goes for enabling a feature that has dependencies.
Why should we treat the hv-evmcs/vmx pair any differently?

Granted, exiting with an error is not the best UX, but at least it tells the user
what's wrong with the CLI and how to fix it. It also lets us keep the QEMU code
manageable and its behavior consistent.
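
A minimal C sketch of the "apply what was asked, then check" approach
argued for above. The bit values and the names FEAT_VMX, FEAT_HV_EVMCS,
apply_flag() and check_feature_deps() are hypothetical illustrations, not
QEMU's actual code. It assumes options are folded into a plain bitmask in
command-line order, so 'hv-passthrough,-vmx' and
'hv-passthrough,hv-evmcs,-vmx' end in the same state and get the same error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define FEAT_VMX      (UINT64_C(1) << 0)
#define FEAT_HV_EVMCS (UINT64_C(1) << 1)

/* Apply one 'feature=on/off' option to the accumulated feature mask. */
static uint64_t apply_flag(uint64_t features, uint64_t bit, bool on)
{
    assert(bit != 0);
    return on ? (features | bit) : (features & ~bit);
}

/*
 * Validate the final mask. Returns NULL when the configuration is
 * consistent, or a message describing the broken dependency.
 */
static const char *check_feature_deps(uint64_t features)
{
    if ((features & FEAT_HV_EVMCS) && !(features & FEAT_VMX)) {
        return "'hv-evmcs' requires 'vmx': enable 'vmx' or disable 'hv-evmcs'";
    }
    return NULL;
}
```

With only a bitmask there is no record of where the EVMCS bit came from,
so the check cannot tell an explicit request from an implicit one.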

> Again, the end goal is: make it possible for upper layers to not know
> anything about Hyper-V enlightenments other than 'hv-default'.
I'm still doubtful about the feasibility of this goal when migration is considered.
It sure would work if the hosts are identical hw/sw-wise.
In a mixed setup, every feature included in 'hv-default' except hv-evmcs
will error out if the host doesn't support it, which should prevent
incompatible migration, so that's also fine.

But hv-evmcs will silently go away if the host doesn't support it,
which is an issue when migrating to/from a host that supports it.

Maybe, to help mgmt figure out host compatibility,
  1. it should know about hv-evmcs to query its status
  2. or the default value set by 'hv-default' should be exposed to mgmt
     so it could compare the whole feature set in one go without being
     aware of individual features.

Additionally, on the QEMU side, for such conditional features we could
theoretically add a subsection to the migration stream when the feature
is enabled; that way we can at least prevent a 'successful' migration
when the destination value doesn't match. But this might already be
over-engineering on my part.


> Technically, it is possible to make it work.







* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-24 16:06                             ` Igor Mammedov
@ 2021-02-24 17:00                               ` Vitaly Kuznetsov
  2021-03-01 15:32                                 ` Igor Mammedov
  0 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-02-24 17:00 UTC (permalink / raw)
  To: Igor Mammedov, Eduardo Habkost
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, qemu-devel

Igor Mammedov <imammedo@redhat.com> writes:

> On Tue, 23 Feb 2021 19:08:42 +0100
> Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
>> Igor Mammedov <imammedo@redhat.com> writes:
>> 
>> > On Tue, 23 Feb 2021 16:46:50 +0100
>> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>> >  
>> >> Igor Mammedov <imammedo@redhat.com> writes:
>> >>   
>> >> > On Mon, 22 Feb 2021 11:20:34 +0100
>> >> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>> >> >    
>> >> >> Vitaly Kuznetsov <vkuznets@redhat.com> writes:
>> >> >>     
>> >> >> > Igor Mammedov <imammedo@redhat.com> writes:
>> >> >> >      
>> >> >> >>> 
>> >> >> >>> We need to distinguish because that would be sane.
>> >> >> >>> 
>> >> >> >>> Enlightened VMCS is an extension to VMX, it can't be used without
>> >> >> >>> it. Genuine Hyper-V doesn't have a knob for enabling and disabling it,      
>> >> >> >> ...      
>> >> >> >>> That being said, if
>> >> >> >>> guest CPU lacks VMX it is counter-productive to expose EVMCS. However,
>> >> >> >>> there is a problem with explicit enablement: what should
>> >> >> >>> 
>> >> >> >>> 'hv-passthrough,hv-evmcs' option do? Just silently drop EVMCS? Doesn't
>> >> >> >>> sound sane to me.      
>> >> >> >> based on above I'd error out if user asks for unsupported option
>> >> >> >> i.e. no VMX -> no hv-evmcs - if explicitly asked -> error out      
>> >> >> >
>> >> >> > That's what I keep telling you but you don't seem to listen. 'Scratch
>> >> >> > CPU' can't possibly help with this use-case because when you parse 
>> >> >> >
>> >> >> > 'hv-passthrough,hv-evmcs,vmx=off' you
>> >> >> >
>> >> >> > 1) "hv-passthrough" -> set EVMCS bit to '1' as it is supported by the
>> >> >> > host.
>> >> >> >
>> >> >> > 2) 'hv-evmcs' -> keep EVMCS bit '1'
>> >> >> >
>> >> >> > 3) 'vmx=off' -> you have no idea where EVMCS bit came from.
>> >> >> >
>> >> >> > We have to remember which options were acquired from the host and which
>> >> >> > were set explicitly by the user.       
>> >> >> 
>> >> >> Igor,
>> >> >> 
>> >> >> could you please comment on the above? In case my line of thought is
>> >> >> correct, and it is impossible to distinguish between e.g.
>> >> >> 
>> >> >> 'hv-passthrough,hv-evmcs,-vmx'
>> >> >> and
>> >> >> 'hv-passthrough,-vmx'
>> >> >> 
>> >> >> without a custom parser (written just exactly the way I did in this
>> >> >> version, for example) regardless of when 'hv-passthrough' is
>> >> >> expanded. E.g. we have the exact same problem with
>> >> >> 'hv-default,hv-evmcs,-vmx'. In that case I see no point in discussing
>> >> >
>> >> > right, if we need to distinguish between explicit and implicit hv-evmcs set by
>> >> > hv-passthrough custom parser probably the way to go.
>> >> >
>> >> > However do we need actually need to do it?    
>> >> 
>> >> I think we really need that. See below ...
>> >>   
>> >> > I'd treat 'hv-passthrough,-vmx' the same way as 'hv-passthrough,hv-evmcs,-vmx'
>> >> > and it applies not only hv-evmcs but other features hv-passthrough might set
>> >> > (i.e. if whatever was [un]set by hv-passthrough in combination with other
>> >> > features results in invalid config, QEMU shall error out instead of magically
>> >> > altering host provided hv-passthrough value).
>> >> >
>> >> > something like:
>> >> >   'hv-passthrough,-vmx' when hv-passthrough makes hv-evmcs bit set
>> >> > should result in
>> >> >   error_setg(errp,"'vmx' feature can't be disabled when hv-evmcs is enabled,"
>> >> >                  " either enable 'vmx' or disable 'hv-evmcs' along with disabling 'vmx'"
>> >> >
>> >> > making host's features set, *magically* mutable, depending on other user provided features
>> >> > is a bit confusing. One would never know what hv-passthrough actually means, and if
>> >> > enabling/disabling 'random' feature changes it.
>> >> >
>> >> > It's cleaner to do just what user asked (whether implicitly or explicitly) and error out
>> >> > in case it ends up in nonsense configuration.
>> >> >    
>> >> 
>> >> I don't seem to agree this is a sane behavior, especially if you replace
>> >> 'hv-passthrough' with 'hv-default' above. Removing 'vmx' from CPU for
>> >> Windows guests is common if you'd want to avoid nested configuration:
>> >> even without any Hyper-V guests created, Windows itself is a Hyper-V
>> >> partition.
>> >> 
>> >> So a sane user will do:
>> >> 
>> >> '-cpu host,hv-default,vmx=off' 
>> >> 
>> >> and on Intel he will get an error, and on AMD he won't. 
>> >> 
>> >> So what you're suggesting actually defeats the whole purpose of
>> >> 'hv-default' as upper-layer tools (think libvirt) will need to know that  
>> > I'd assume it would be hard for libvirt to use 'hv-default' from migration
>> > point of view. It's semi opaque (one can find out what features it sets
>> > indirectly inspecting individual hv_foo features, and mgmt will need to
>> > know about them). If it will mutate when other features [un]set, upper
>> > layers might need to enumerate all these permutations to know which hosts
>> > are compatible or compare host feature sets every time before attempting
>> > migration.
>> 
>> That's exactly the opposite of what's the goal here which is: make it
>> possible for upper layers to not know anything about Hyper-V
>> enlightenments besides 'hv-default'. Migration should work just fine, if
>> the rest of guest configuration matches -- then 'hv-default' will create
>> the exact same things (e.g. if 'vmx' was disabled on the source it has
>             ^^^^^
> I'm not convinced in that yet (not with current impl. more on that at the end of reply)
>
>> to be enabled on the destination, it can't be different)
>> 
>> 
>> >> Intel configurations for Windows guests are somewhat different. They'll
>> >> need to know what 'hv-evmcs' is. We're back to where we've started.  
>> >
>> > we were talking about hv-passthrough, and if host advertises hv-evmcs
>> > QEMU should complain if user disabled features it depends on (
>> > not silently fixing up configuration error).
>> > But the same applies to hv-default.  
>> 
>> Let's forget about hv-passthrough completely for a while as this series
>> is kind of unrelated to it.
>
> It adds a lot for unrelated code (not just couple of lines),
> I've played with scratch CPU idea, here is demo of it
> https://github.com/imammedo/qemu/commit/a4b107d5368ebf72d45082bc8310a6b88a4ba6fb
> I didn't rework caps/cpuid querying parts (just hacked around it),
> and even without that it saves us ~200LOC (not a small part of which comes
> with this series).

All your savings come from throwing away custom parsers -- which are not
needed at all if we don't distinguish between

'hv-default,hv-evmcs' and 'hv-default'

If that distinction is not needed, don't count these patches in. Or, if it is needed,
please explain how your scratch CPU makes things different. I guess
it doesn't, so we can discuss this outside of this series.

> I also split horrible hv_cpuid_check_and_set into separate 'set' and 'check' stages.
> Granted it was sort-of pre-existing ugly code, some of your
> re-factoring made it a bit better but it's still far from readable.
>  

hv_cpuid_check_and_set() is already there, I'm not at all opposed to
making this code even better but I don't see it as a must for this
particular feature (hv-default).

>> In the previous submission I was setting 'hv-default' based on host
>> availability of the feature only. That is: set on Intel, unset on
>> AMD. We have to at least preserve that because it would be insane to
>> crash on
>> 
>> -cpu host,hv-default 
>> 
>> on AMD because AMD doesn't (and never will!) support hv-evmcs, right?
>
> If QEMU prevents cross arch migration i.e. it's not supported,
> then I guess we can make hv-default different depending on AMD or Intel host.
> If not then we might need to be conservative i.e. exclude hv-evmcs from defaults.
>

Forget about cross vendor. I want to tie it 1:1 to VMX feature on the
guest CPU -- which happens to be only available on Intel. It is
absolutely impossible to migrate a VMX-enabled guest to a VMX-disabled
destination, with or without EVMCS.


>> >> If we are to follow this approach let's just throw away 'hv-evmcs' from
>> >> 'hv-default' set, it's going to be much cleaner. But again, I don't
>> >> really believe it's the right way to go.  
>> >
>> > if desired behavior, on Intel host for above config, to start without error
>> > then indeed defaults should not set 'hv-evmcs' if it results in invalid
>> > feature set.  
>> 
>> This is problematic as it is still sane for everyone to enable it as it
>> gives performance advantage. If we just for a second forget about custom
>   "
>     > >> So a sane user will do:
>     > >> 
>     > >> '-cpu host,hv-default,vmx=off'
>   "
>   it's not easy picking defaults.
>
>> parsers and all that -- which is just an implementation detail, why can't
>> we tie 'hv-evmcs' bit in 'hv-default' to 'vmx' 1:1?
> migration wise I don't see issues wrt vmx=off turning off hv-evmcs,
> however ...
>
> we were replacing user input fixups with hard errors asking
> user to fix CLI and removing custom parsers in favor of generic ones.
>
> In vmx=off case we would be fixing up what 'hv-default' explicitly set.
> Same applies to other hv-foo set by hv-default.
>
> ex: 'hv-default,hv-dep1=off', will turn off some dependent feature
> for other hv feature in hv-default set and it will error out,
> same goes on for enabling feature that has dependencies.
> Why should we treat hv-evmcs/vmx pair any different?
>

VMX is not part of Hyper-V enlightenments, is it? It can also be coming
from a CPU model:

"-cpu MyLovelyModelWithVmx,hv-default"

should not throw an error!

Again, the goal is for userspace to not know anything besides
'hv-default' for Hyper-V enlightenments.


> Granted exiting with error is not the best UX, but at least it says to user
> what's wrong with CLI and how to fix it. Also it lets to keep QEMU code
> manageable and with consistent behavior.

Enabling EVMCS only when guest CPU has VMX is a smart behavior, all
users want that. It is very consistent with how genuine Hyper-V behaves.
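
A minimal C sketch of the behavior argued for here, assuming the parser
records which 'hv-*' options the user spelled out. The names HvRequest,
HvResolve and hv_resolve_evmcs() and the bit numbering are hypothetical,
not the code from this series. An explicit 'hv-evmcs' without VMX is an
error, while the implicit one coming from 'hv-default' simply follows VMX.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature index; not QEMU's real numbering. */
enum { HV_EVMCS = 0, HV_FEAT_COUNT };

typedef struct {
    uint64_t explicit_on;   /* features the user explicitly enabled  */
    uint64_t explicit_off;  /* features the user explicitly disabled */
    bool hv_default;        /* 'hv-default' was given on the CLI     */
} HvRequest;

typedef enum {
    HV_RESOLVE_OFF,
    HV_RESOLVE_ON,
    HV_RESOLVE_ERROR,
} HvResolve;

/* Decide the final state of EVMCS from the request and the environment. */
static HvResolve hv_resolve_evmcs(const HvRequest *req,
                                  bool guest_has_vmx,
                                  bool host_has_evmcs)
{
    const uint64_t bit = UINT64_C(1) << HV_EVMCS;
    const bool usable = guest_has_vmx && host_has_evmcs;

    /* A feature cannot be both explicitly on and explicitly off. */
    assert(!(req->explicit_on & req->explicit_off & bit));

    if (req->explicit_off & bit) {
        return HV_RESOLVE_OFF;              /* user said no: always honored */
    }
    if (req->explicit_on & bit) {
        return usable ? HV_RESOLVE_ON : HV_RESOLVE_ERROR;
    }
    if (req->hv_default) {
        return usable ? HV_RESOLVE_ON : HV_RESOLVE_OFF;  /* follows VMX */
    }
    return HV_RESOLVE_OFF;
}
```

Under these assumptions '-cpu host,hv-default,vmx=off' starts with EVMCS
off on any host, and only 'hv-default,hv-evmcs,vmx=off' is rejected.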

>
>> Again, the end goal is: make it possible for upper layers to not know
>> anything about Hyper-V enlightenments other than 'hv-default'.
> I'm still doubtful about feasibility of this goal when migration is considered.
> It sure would work if hosts are identical hw/sw wise.
> In mixed setup all features, except of hv-evmcs, that included in 'hv-default',
> will error out in case host doesn't support it, which should prevent
> incompatible migration so that's also fine.
>
> But hv-evmcs will silently go away if host doesn't support it,
> which is issue when migration happens to/from host that supports it.

But how can it go away? In KVM, hv-evmcs support is not conditional:
basically, all KVMs which support nested state migration also support
EVMCS.

>
> Maybe to help mgmt to figure out hosts compatibility
>   1. it should know about hv-evmcs to query it's status
>   2. or default value set by 'hv-default' should be exposed to mgmt
>      so it could compare whole feature-set in one go without being
>      aware of individual features.
>
> Additionally on QEMU side for such conditional features we can
> theoretically add a subsection to migration stream when feature
> is enabled, that way we at least can prevent 'successful' migration,
> when destination value doesn't match. But this might already be
> over-engineering on my part.

I think you're trying to solve an issue which doesn't exist. In case
we're successfully migrating a nested enabled guest, our KVM is modern
enough and supports EVMCS (on Intel, of course). Also, nested state
(which is part of the migration stream) has EVMCS flag, we can't migrate
somewhere where the flag is unsupported.

Anyway, I feel we're walking in circles. I'm ready to just drop all
EVMCS-related bits from this series to get it merged. This will make the
part which you hate the most ("custom parsers") go away too. We can discuss
EVMCS to death after that and, when we finally decide that user
convenience is actually worth something, we can add 'hv-evmcs' to the
new 

Eduardo, do you see any way forward here? 

-- 
Vitaly




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-02-24 17:00                               ` Vitaly Kuznetsov
@ 2021-03-01 15:32                                 ` Igor Mammedov
  2021-03-01 16:22                                   ` Vitaly Kuznetsov
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-03-01 15:32 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, Eduardo Habkost, qemu-devel

On Wed, 24 Feb 2021 18:00:43 +0100
Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> Igor Mammedov <imammedo@redhat.com> writes:
> 
> > On Tue, 23 Feb 2021 19:08:42 +0100
> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >  
> >> Igor Mammedov <imammedo@redhat.com> writes:
> >>   
> >> > On Tue, 23 Feb 2021 16:46:50 +0100
> >> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >> >    
> >> >> Igor Mammedov <imammedo@redhat.com> writes:
> >> >>     
> >> >> > On Mon, 22 Feb 2021 11:20:34 +0100
> >> >> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >> >> >      
> >> >> >> Vitaly Kuznetsov <vkuznets@redhat.com> writes:
> >> >> >>       
> >> >> >> > Igor Mammedov <imammedo@redhat.com> writes:
> >> >> >> >        
> >> >> >> >>> 
> >> >> >> >>> We need to distinguish because that would be sane.
> >> >> >> >>> 
> >> >> >> >>> Enlightened VMCS is an extension to VMX, it can't be used without
> >> >> >> >>> it. Genuine Hyper-V doesn't have a knob for enabling and disabling it,        
> >> >> >> >> ...        
> >> >> >> >>> That being said, if
> >> >> >> >>> guest CPU lacks VMX it is counter-productive to expose EVMCS. However,
> >> >> >> >>> there is a problem with explicit enablement: what should
> >> >> >> >>> 
> >> >> >> >>> 'hv-passthrough,hv-evmcs' option do? Just silently drop EVMCS? Doesn't
> >> >> >> >>> sound sane to me.        
> >> >> >> >> based on above I'd error out if user asks for unsupported option
> >> >> >> >> i.e. no VMX -> no hv-evmcs - if explicitly asked -> error out        
> >> >> >> >
> >> >> >> > That's what I keep telling you but you don't seem to listen. 'Scratch
> >> >> >> > CPU' can't possibly help with this use-case because when you parse 
> >> >> >> >
> >> >> >> > 'hv-passthrough,hv-evmcs,vmx=off' you
> >> >> >> >
> >> >> >> > 1) "hv-passthrough" -> set EVMCS bit to '1' as it is supported by the
> >> >> >> > host.
> >> >> >> >
> >> >> >> > 2) 'hv-evmcs' -> keep EVMCS bit '1'
> >> >> >> >
> >> >> >> > 3) 'vmx=off' -> you have no idea where EVMCS bit came from.
> >> >> >> >
> >> >> >> > We have to remember which options were acquired from the host and which
> >> >> >> > were set explicitly by the user.         
> >> >> >> 
> >> >> >> Igor,
> >> >> >> 
> >> >> >> could you please comment on the above? In case my line of thought is
> >> >> >> correct, and it is impossible to distinguish between e.g.
> >> >> >> 
> >> >> >> 'hv-passthrough,hv-evmcs,-vmx'
> >> >> >> and
> >> >> >> 'hv-passthrough,-vmx'
> >> >> >> 
> >> >> >> without a custom parser (written just exactly the way I did in this
> >> >> >> version, for example) regardless of when 'hv-passthrough' is
> >> >> >> expanded. E.g. we have the exact same problem with
> >> >> >> 'hv-default,hv-evmcs,-vmx'. In that case I see no point in discussing
> >> >> >
> >> >> > right, if we need to distinguish between explicit and implicit hv-evmcs set by
> >> >> > hv-passthrough custom parser probably the way to go.
> >> >> >
> >> >> > However do we need actually need to do it?      
> >> >> 
> >> >> I think we really need that. See below ...
> >> >>     
> >> >> > I'd treat 'hv-passthrough,-vmx' the same way as 'hv-passthrough,hv-evmcs,-vmx'
> >> >> > and it applies not only hv-evmcs but other features hv-passthrough might set
> >> >> > (i.e. if whatever was [un]set by hv-passthrough in combination with other
> >> >> > features results in invalid config, QEMU shall error out instead of magically
> >> >> > altering host provided hv-passthrough value).
> >> >> >
> >> >> > something like:
> >> >> >   'hv-passthrough,-vmx' when hv-passthrough makes hv-evmcs bit set
> >> >> > should result in
> >> >> >   error_setg(errp,"'vmx' feature can't be disabled when hv-evmcs is enabled,"
> >> >> >                  " either enable 'vmx' or disable 'hv-evmcs' along with disabling 'vmx'"
> >> >> >
> >> >> > making host's features set, *magically* mutable, depending on other user provided features
> >> >> > is a bit confusing. One would never know what hv-passthrough actually means, and if
> >> >> > enabling/disabling 'random' feature changes it.
> >> >> >
> >> >> > It's cleaner to do just what user asked (whether implicitly or explicitly) and error out
> >> >> > in case it ends up in nonsense configuration.
> >> >> >      
> >> >> 
> >> >> I don't seem to agree this is a sane behavior, especially if you replace
> >> >> 'hv-passthrough' with 'hv-default' above. Removing 'vmx' from CPU for
> >> >> Windows guests is common if you'd want to avoid nested configuration:
> >> >> even without any Hyper-V guests created, Windows itself is a Hyper-V
> >> >> partition.
> >> >> 
> >> >> So a sane user will do:
> >> >> 
> >> >> '-cpu host,hv-default,vmx=off' 
> >> >> 
> >> >> and on Intel he will get an error, and on AMD he won't. 
> >> >> 
> >> >> So what you're suggesting actually defeats the whole purpose of
> >> >> 'hv-default' as upper-layer tools (think libvirt) will need to know that    
> >> > I'd assume it would be hard for libvirt to use 'hv-default' from migration
> >> > point of view. It's semi opaque (one can find out what features it sets
> >> > indirectly inspecting individual hv_foo features, and mgmt will need to
> >> > know about them). If it will mutate when other features [un]set, upper
> >> > layers might need to enumerate all these permutations to know which hosts
> >> > are compatible or compare host feature sets every time before attempting
> >> > migration.  
> >> 
> >> That's exactly the opposite of what's the goal here which is: make it
> >> possible for upper layers to not know anything about Hyper-V
> >> enlightenments besides 'hv-default'. Migration should work just fine, if
> >> the rest of guest configuration matches -- then 'hv-default' will create
> >> the exact same things (e.g. if 'vmx' was disabled on the source it has  
> >             ^^^^^
> > I'm not convinced in that yet (not with current impl. more on that at the end of reply)
> >  
> >> to be enabled on the destination, it can't be different)
> >> 
> >>   
> >> >> Intel configurations for Windows guests are somewhat different. They'll
> >> >> need to know what 'hv-evmcs' is. We're back to where we've started.    
> >> >
> >> > we were talking about hv-passthrough, and if host advertises hv-evmcs
> >> > QEMU should complain if user disabled features it depends on (
> >> > not silently fixing up configuration error).
> >> > But the same applies to hv-default.    
> >> 
> >> Let's forget about hv-passthrough completely for a while as this series
> >> is kind of unrelated to it.  
> >
> > It adds a lot for unrelated code (not just couple of lines),
> > I've played with scratch CPU idea, here is demo of it
> > https://github.com/imammedo/qemu/commit/a4b107d5368ebf72d45082bc8310a6b88a4ba6fb
> > I didn't rework caps/cpuid querying parts (just hacked around it),
> > and even without that it saves us ~200LOC (not a small part of which comes
> > with this series).  
> 
> All your savings come from throwing away custom parsers -- which are not
> needed at all if we don't distinguish between
> 
> 'hv-default,hv-evmcs' and 'hv-default'
> 
> it's just not needed, don't count these patches in. Or, if it is needed,
> please explain how your scratch CPU is making things different. I guess
> it is not so we can discuss this outside of this series.

The scratch CPU helps with the host passthrough refactoring, which also brings in
a dependency on the custom parser.


> > I also split horrible hv_cpuid_check_and_set into separate 'set' and 'check' stages.
> > Granted it was sort-of pre-existing ugly code, some of your
> > re-factoring made it a bit better but it's still far from readable.
> >    
> 
> hv_cpuid_check_and_set() is already there, I'm not at all opposed to
> making this code even better but I don't see it as a must for this
> particular feature (hv-default).
It's not a must-have for hv-default, especially if you don't touch it.
(However, if you touch it & co, I'd ask you to clean it up first.)


> >> In the previous submission I was setting 'hv-default' based on host
> >> availability of the feature only. That is: set on Intel, unset on
> >> AMD. We have to at least preserve that because it would be insane to
> >> crash on
> >> 
> >> -cpu host,hv-default 
> >> 
> >> on AMD because AMD doesn't (and never will!) support hv-evmcs, right?  
> >
> > If QEMU prevents cross arch migration i.e. it's not supported,
> > then I guess we can make hv-default different depending on AMD or Intel host.
> > If not then we might need to be conservative i.e. exclude hv-evmcs from defaults.
> >  
> 
> Forget about cross vendor. I want to tie it 1:1 to VMX feature on the
> guest CPU -- which happens to be only available on Intel. It is
> absolutely impossible to migrate VMX enabled guest to VMX-disabled
> destination, with or without evmcs.
Ok

> >> >> If we are to follow this approach let's just throw away 'hv-evmcs' from
> >> >> 'hv-default' set, it's going to be much cleaner. But again, I don't
> >> >> really believe it's the right way to go.    
> >> >
> >> > if desired behavior, on Intel host for above config, to start without error
> >> > then indeed defaults should not set 'hv-evmcs' if it results in invalid
> >> > feature set.    
> >> 
> >> This is problematic as it is still sane for everyone to enable it as it
> >> gives performance advantage. If we just for a second forget about custom  
> >   "  
> >     > >> So a sane user will do:
> >     > >> 
> >     > >> '-cpu host,hv-default,vmx=off'  
> >   "
> >   it's not easy picking defaults.
> >  
> >> parsers and all that -- which is just an implementation detail, why can't
> >> we tie 'hv-evmcs' bit in 'hv-default' to 'vmx' 1:1?
> > migration wise I don't see issues wrt vmx=off turning off hv-evmcs,
> > however ...
> >
> > we were replacing user input fixups with hard errors asking
> > user to fix CLI and removing custom parsers in favor of generic ones.
> >
> > In vmx=off case we would be fixing up what 'hv-default' explicitly set.
> > Same applies to other hv-foo set by hv-default.
> >
> > ex: 'hv-default,hv-dep1=off', will turn off some dependent feature
> > for other hv feature in hv-default set and it will error out,
> > same goes on for enabling feature that has dependencies.
> > Why should we treat hv-evmcs/vmx pair any different?
> >  
> 
> VMX is not part of Hyper-V enlightenments, is it? It can also be coming
> from a CPU model:
> 
> "-cpu MyLovelyModelWithVmx,hv-default"
> 
> should not throw an error!
> 
> Again, the goal is for userspace to not know anything besides
> 'hv-default' for Hyper-V enlightenments.
> 
> 
> > Granted exiting with error is not the best UX, but at least it says to user
> > what's wrong with CLI and how to fix it. Also it lets to keep QEMU code
> > manageable and with consistent behavior.  
> 
> Enabling EVMCS only when guest CPU has VMX is a smart behavior, all
> users want that. It is very consistent with how genuine Hyper-V behaves.
It's all good,
unless VMX is absent for whatever reason (CPU model or vmx=off on the CLI);
in that case just error out and say what's wrong instead of trying to fix
the CLI up.

> >> Again, the end goal is: make it possible for upper layers to not know
> >> anything about Hyper-V enlightenments other than 'hv-default'.  
> > I'm still doubtful about feasibility of this goal when migration is considered.
> > It sure would work if hosts are identical hw/sw wise.
> > In mixed setup all features, except of hv-evmcs, that included in 'hv-default',
> > will error out in case host doesn't support it, which should prevent
> > incompatible migration so that's also fine.
> >
> > But hv-evmcs will silently go away if host doesn't support it,
> > which is issue when migration happens to/from host that supports it.  
> 
> But how can it go away? In KVM, hv-evmcs support is not conditional,
> > basically, all KVMs which support nested state migration also support
> EVMCS. 

you added that in the kernel between 4.19 and 4.20 (8cab6507f64),
so on an Intel host the flag can change depending on a not-so-old kernel version.

Unless QEMU's minimum supported kernel version is 4.20,
we can't ignore that.
(downstreams will have to take care of it on their own)

> >
> > Maybe to help mgmt to figure out hosts compatibility
> >   1. it should know about hv-evmcs to query its status
> >   2. or default value set by 'hv-default' should be exposed to mgmt
> >      so it could compare whole feature-set in one go without being
> >      aware of individual features.
> >
> > Additionally on QEMU side for such conditional features we can
> > theoretically add a subsection to migration stream when feature
> > is enabled, that way we at least can prevent 'successful' migration,
> > when destination value doesn't match. But this might already be
> > over-engineering on my part.  
> 
> I think you're trying to solve an issue which doesn't exist. In case
> we're successfully migrating a nested enabled guest, our KVM is modern
> enough and supports EVMCS (on Intel, of course). Also, nested state
> (which is part of the migration stream) has EVMCS flag, we can't migrate
> somewhere where the flag is unsupported.
that's what I was missing.
Can you point me to the code that makes sure that migration fails?
(a comment where hv-evmcs is added to the default set explaining why it's safe, pls)

 
> Anyway, I feel we're walking in circles. I'm ready to just drop all
> EVMCS related bits from this series to get it merged. This will make the
> part which you hate the most ("custom parsers") go away too. We can discuss
> EVMCS to death after that and, when we finally decide that user
> convenience is actually worth something, we can add 'hv-evmcs' to the
> new 

fine with me, it would be even easier to review if it were just a patch that
would add 'hv-default', without non-essential refactoring.
(cleanup/refactoring could be another series)

If we can guarantee that hv-evmcs won't flip on/off on all supported kernels,
I'm also fine with keeping it in the default set and erroring out if vmx ends up
in the off state.
However, if it changes, we need to expose the 'default set' to mgmt somehow,
so it will know that hosts aren't compatible (instead of finding that out the hard way
in the form of a failed migration (assuming it fails)).

> Eduardo, do you see any way forward here? 
> 




* Re: [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement
  2021-03-01 15:32                                 ` Igor Mammedov
@ 2021-03-01 16:22                                   ` Vitaly Kuznetsov
  0 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2021-03-01 16:22 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Paolo Bonzini, drjones, Marcelo Tosatti, Eduardo Habkost, qemu-devel

Igor Mammedov <imammedo@redhat.com> writes:

> On Wed, 24 Feb 2021 18:00:43 +0100
> Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
>> Igor Mammedov <imammedo@redhat.com> writes:
>> 
>> > On Tue, 23 Feb 2021 19:08:42 +0100
>> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>> >  
>> >> Igor Mammedov <imammedo@redhat.com> writes:
>> >>   
>> >> > On Tue, 23 Feb 2021 16:46:50 +0100
>> >> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>> >> >    
>> >> >> Igor Mammedov <imammedo@redhat.com> writes:
>> >> >>     
>> >> >> > On Mon, 22 Feb 2021 11:20:34 +0100
>> >> >> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>> >> >> >      
>> >> >> >> Vitaly Kuznetsov <vkuznets@redhat.com> writes:
>> >> >> >>       
>> >> >> >> > Igor Mammedov <imammedo@redhat.com> writes:
>> >> >> >> >        
>> >> >> >> >>> 
>> >> >> >> >>> We need to distinguish because that would be sane.
>> >> >> >> >>> 
>> >> >> >> >>> Enlightened VMCS is an extension to VMX, it can't be used without
>> >> >> >> >>> it. Genuine Hyper-V doesn't have a knob for enabling and disabling it,        
>> >> >> >> >> ...        
>> >> >> >> >>> That being said, if
>> >> >> >> >>> guest CPU lacks VMX it is counter-productive to expose EVMCS. However,
>> >> >> >> >>> there is a problem with explicit enablement: what should
>> >> >> >> >>> 
>> >> >> >> >>> 'hv-passthrough,hv-evmcs' option do? Just silently drop EVMCS? Doesn't
>> >> >> >> >>> sound sane to me.        
>> >> >> >> >> based on above I'd error out if user asks for unsupported option
>> >> >> >> >> i.e. no VMX -> no hv-evmcs - if explicitly asked -> error out        
>> >> >> >> >
>> >> >> >> > That's what I keep telling you but you don't seem to listen. 'Scratch
>> >> >> >> > CPU' can't possibly help with this use-case because when you parse 
>> >> >> >> >
>> >> >> >> > 'hv-passthrough,hv-evmcs,vmx=off' you
>> >> >> >> >
>> >> >> >> > 1) "hv-passthrough" -> set EVMCS bit to '1' as it is supported by the
>> >> >> >> > host.
>> >> >> >> >
>> >> >> >> > 2) 'hv-evmcs' -> keep EVMCS bit '1'
>> >> >> >> >
>> >> >> >> > 3) 'vmx=off' -> you have no idea where EVMCS bit came from.
>> >> >> >> >
>> >> >> >> > We have to remember which options were acquired from the host and which
>> >> >> >> > were set explicitly by the user.         
>> >> >> >> 
>> >> >> >> Igor,
>> >> >> >> 
>> >> >> >> could you please comment on the above? In case my line of thought is
>> >> >> >> correct, and it is impossible to distinguish between e.g.
>> >> >> >> 
>> >> >> >> 'hv-passthrough,hv-evmcs,-vmx'
>> >> >> >> and
>> >> >> >> 'hv-passthrough,-vmx'
>> >> >> >> 
>> >> >> >> without a custom parser (written just exactly the way I did in this
>> >> >> >> version, for example) regardless of when 'hv-passthrough' is
>> >> >> >> expanded. E.g. we have the exact same problem with
>> >> >> >> 'hv-default,hv-evmcs,-vmx'. In that case I see no point in discussing      
>> >> >> >
>> >> >> > right, if we need to distinguish between explicit and implicit hv-evmcs set by
>> >> >> > hv-passthrough custom parser probably the way to go.
>> >> >> >
>> >> >> > However, do we actually need to do it?      
>> >> >> 
>> >> >> I think we really need that. See below ...
>> >> >>     
>> >> >> > I'd treat 'hv-passthrough,-vmx' the same way as 'hv-passthrough,hv-evmcs,-vmx'
>> >> >> > and it applies not only hv-evmcs but other features hv-passthrough might set
>> >> >> > (i.e. if whatever was [un]set by hv-passthrough in combination with other
>> >> >> > features results in invalid config, QEMU shall error out instead of magically
>> >> >> > altering host provided hv-passthrough value).
>> >> >> >
>> >> >> > something like:
>> >> >> >   'hv-passthrough,-vmx' when hv-passthrough makes hv-evmcs bit set
>> >> >> > should result in
>> >> >> >   error_setg(errp,"'vmx' feature can't be disabled when hv-evmcs is enabled,"
>> >> >> >                  " either enable 'vmx' or disable 'hv-evmcs' along with disabling 'vmx'"
>> >> >> >
>> >> >> > making host's features set, *magically* mutable, depending on other user provided features
>> >> >> > is a bit confusing. One would never know what hv-passthrough actually means, and if
>> >> >> > enabling/disabling 'random' feature changes it.
>> >> >> >
>> >> >> > It's cleaner to do just what user asked (whether implicitly or explicitly) and error out
>> >> >> > in case it ends up in nonsense configuration.
>> >> >> >      
>> >> >> 
>> >> >> I don't seem to agree this is a sane behavior, especially if you replace
>> >> >> 'hv-passthrough' with 'hv-default' above. Removing 'vmx' from CPU for
>> >> >> Windows guests is common if you'd want to avoid nested configuration:
>> >> >> even without any Hyper-V guests created, Windows itself is a Hyper-V
>> >> >> partition.
>> >> >> 
>> >> >> So a sane user will do:
>> >> >> 
>> >> >> '-cpu host,hv-default,vmx=off' 
>> >> >> 
>> >> >> and on Intel he will get an error, and on AMD he won't. 
>> >> >> 
>> >> >> So what you're suggesting actually defeats the whole purpose of
>> >> >> 'hv-default' as upper-layer tools (think libvirt) will need to know that    
>> >> > I'd assume it would be hard for libvirt to use 'hv-default' from migration
>> >> > point of view. It's semi opaque (one can find out what features it sets
>> >> > indirectly inspecting individual hv_foo features, and mgmt will need to
>> >> > know about them). If it will mutate when other features [un]set, upper
>> >> > layers might need to enumerate all these permutations to know which hosts
>> >> > are compatible or compare host feature sets every time before attempting
>> >> > migration.  
>> >> 
>> >> That's exactly the opposite of what's the goal here which is: make it
>> >> possible for upper layers to not know anything about Hyper-V
>> >> enlightenments besides 'hv-default'. Migration should work just fine, if
>> >> the rest of guest configuration matches -- then 'hv-default' will create
>> >> the exact same things (e.g. if 'vmx' was disabled on the source it has  
>> >             ^^^^^
>> > I'm not convinced in that yet (not with current impl. more on that at the end of reply)
>> >  
>> >> to be disabled on the destination, it can't be different)
>> >> 
>> >>   
>> >> >> Intel configurations for Windows guests are somewhat different. They'll
>> >> >> need to know what 'hv-evmcs' is. We're back to where we've started.    
>> >> >
>> >> > we were talking about hv-passthrough, and if host advertises hv-evmcs
>> >> > QEMU should complain if user disabled features it depends on (
>> >> > not silently fixing up configuration error).
>> >> > But the same applies to hv-default.    
>> >> 
>> >> Let's forget about hv-passthrough completely for a while as this series
>> >> is kind of unrelated to it.  
>> >
>> > It adds a lot of unrelated code (not just a couple of lines),
>> > I've played with scratch CPU idea, here is demo of it
>> > https://github.com/imammedo/qemu/commit/a4b107d5368ebf72d45082bc8310a6b88a4ba6fb
>> > I didn't rework caps/cpuid querying parts (just hacked around it),
>> > and even without that it saves us ~200LOC (not a small part of which comes
>> > with this series).  
>> 
>> All your savings come from throwing away custom parsers -- which are not
>> needed at all if we don't distinguish between
>> 
>> 'hv-default,hv-evmcs' and 'hv-default'
>> 
>> it's just not needed, don't count these patches in. Or, if it is needed,
>> please explain how your scratch CPU is making things different. I guess
>> it is not so we can discuss this outside of this series.
>
> scratch CPU helps with hostpassthrough refactoring which also brings in
> dependency on custom parser.

I think I already said that but let me repeat myself: scratch CPUs don't
help us in any way to distinguish between

'hv-passthrough,hv-evmcs,vmx=off' 
and
'hv-passthrough,vmx=off' 

and thus are utterly useless (for this particular feature). It's not
'refactoring', it's throwing a feature away for the sake of code
purity.

>
>
>> > I also split horrible hv_cpuid_check_and_set into separate 'set' and 'check' stages.
>> > Granted it was sort-of pre-existing ugly code, some of your
>> > re-factoring made it a bit better but it's still far from readable.
>> >    
>> 
>> hv_cpuid_check_and_set() is already there, I'm not at all opposed to
>> making this code even better but I don't see it as a must for this
>> particular feature (hv-default).
> it's not a must have for hv-default especially if you don't touch it.
> (however if you touch it & co, I'd ask to clean it up first)
>
>
>> >> In the previous submission I was setting 'hv-default' based on host
>> >> availability of the feature only. That is: set on Intel, unset on
>> >> AMD. We have to at least preserve that because it would be insane to
>> >> crash on
>> >> 
>> >> -cpu host,hv-default 
>> >> 
>> >> on AMD because AMD doesn't (and never will!) support hv-evmcs, right?  
>> >
>> > If QEMU prevents cross arch migration i.e. it's not supported,
>> > then I guess we can make hv-default different depending on AMD or Intel host.
>> > If not then we might need to be conservative i.e. exclude hv-evmcs from defaults.
>> >  
>> 
>> Forget about cross vendor. I want to tie it 1:1 to VMX feature on the
>> guest CPU -- which happens to be only available on Intel. It is
>> absolutely impossible to migrate VMX enabled guest to VMX-disabled
>> destination, with or without evmcs.
> Ok
>
>> >> >> If we are to follow this approach let's just throw away 'hv-evmcs' from
>> >> >> 'hv-default' set, it's going to be much cleaner. But again, I don't
>> >> >> really believe it's the right way to go.    
>> >> >
>> >> > if desired behavior, on Intel host for above config, to start without error
>> >> > then indeed defaults should not set 'hv-evmcs' if it results in invalid
>> >> > feature set.    
>> >> 
>> >> This is problematic as it is still sane for everyone to enable it as it
>> >> gives performance advantage. If we just for a second forget about custom  
>> >   "  
>> >     > >> So a sane user will do:
>> >     > >> 
>> >     > >> '-cpu host,hv-default,vmx=off'  
>> >   "
>> >   it's not easy picking defaults.
>> >  
>> >> parsers and all that -- which is just an implementation detail, why can't
>> >> we tie 'hv-evmcs' bit in 'hv-default' to 'vmx' 1:1?  
>> > migration wise I don't see issues wrt vmx=off turning off hv-evmcs,
>> > however ...
>> >
>> > we were replacing user input fixups with hard errors asking
>> > user to fix CLI and removing custom parsers in favor of generic ones.
>> >
>> > In vmx=off case we would be fixing up what 'hv-default' explicitly set.
>> > Same applies to other hv-foo set by hv-default.
>> >
>> > ex: 'hv-default,hv-dep1=off', will turn off some dependent feature
>> > for other hv feature in hv-default set and it will error out,
>> > same goes on for enabling feature that has dependencies.
>> > Why should we treat hv-evmcs/vmx pair any different?
>> >  
>> 
>> VMX is not part of Hyper-V enlightenments, is it? It can also be coming
>> from a CPU model:
>> 
>> "-cpu MyLovelyModelWithVmx,hv-default"
>> 
>> should not throw an error!
>> 
>> Again, the goal is for userspace to not know anything besides
>> 'hv-default' for Hyper-V enlightenments.
>> 
>> 
>> > Granted exiting with error is not the best UX, but at least it says to user
>> > what's wrong with CLI and how to fix it. Also it lets to keep QEMU code
>> > manageable and with consistent behavior.  
>> 
>> Enabling EVMCS only when guest CPU has VMX is a smart behavior, all
>> users want that. It is very consistent with how genuine Hyper-V behaves.
> it's all good,
> unless VMX is absent for whatever reasons (cpumodel or vmx=off on CLI),
> in this case just error out and say what's wrong instead of trying to fix
> CLI up.

It's easier, of course, but I don't think we should do that. At least 

query-cpu-model-expansion type=full model={"name":"host","props":{"hv-passthrough":true}}

should give the upper layer the full list of supported features, including
evmcs, and not just throw an error. The same goes for 'hv-default'.

>
>> >> Again, the end goal is: make it possible for upper layers to not know
>> >> anything about Hyper-V enlightenments other than 'hv-default'.  
>> > I'm still doubtful about feasibility of this goal when migration is considered.
>> > It sure would work if hosts are identical hw/sw wise.
>> > In mixed setup all features, except of hv-evmcs, that included in 'hv-default',
>> > will error out in case host doesn't support it, which should prevent
>> > incompatible migration so that's also fine.
>> >
>> > But hv-evmcs will silently go away if host doesn't support it,
>> > which is issue when migration happens to/from host that supports it.  
>> 
>> But how can it go away? In KVM, hv-evmcs support is not conditional,
>> basically, all KVMs which support nested state migration also support
>> EVMCS. 
>
> you added that in kernel between 4.19-4.20 (8cab6507f64),
> so on Intel host flag can change depending on not so old kernel version.
>
> Unless QEMU minimum supported kernel version is 4.20,
> we can't ignore that.
> (downstreams will have to take care of it on its own)

You forget that EVMCS is not a feature in a vacuum; it is an extension to
VMX. To migrate a guest which has it enabled, nested state migration has
to support it. Assuming the feature was enabled on the source, you need
at least the following:

commit 8cab6507f64eff0ccfea01fccbc7e3e05e2aaf7e
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
Date:   Tue Oct 16 18:50:09 2018 +0200

    x86/kvm/nVMX: nested state migration for Enlightened VMCS

which is also in 4.20. I don't see how migration without it can even
succeed.

>
>> >
>> > Maybe to help mgmt to figure out hosts compatibility
>> >   1. it should know about hv-evmcs to query its status
>> >   2. or default value set by 'hv-default' should be exposed to mgmt
>> >      so it could compare whole feature-set in one go without being
>> >      aware of individual features.
>> >
>> > Additionally on QEMU side for such conditional features we can
>> > theoretically add a subsection to migration stream when feature
>> > is enabled, that way we at least can prevent 'successful' migration,
>> > when destination value doesn't match. But this might already be
>> > over-engineering on my part.  
>> 
>> I think you're trying to solve an issue which doesn't exist. In case
>> we're successfully migrating a nested enabled guest, our KVM is modern
>> enough and supports EVMCS (on Intel, of course). Also, nested state
>> (which is part of the migration stream) has EVMCS flag, we can't migrate
>> somewhere where the flag is unsupported.
> that's what I was missing.
> Can you point out to the code that makes sure that migration fails?
> (a comment where hv-evmcs is added to default set explaining why it's safe, pls)
>

If you look at KVM_SET_NESTED_STATE implementation in KVM (see
kvm_arch_vcpu_ioctl()), it checks for supported flags

arch/x86/kvm/x86.c-             if (kvm_state.flags &
arch/x86/kvm/x86.c-                 ~(KVM_STATE_NESTED_RUN_PENDING | KVM_STATE_NESTED_GUEST_MODE
arch/x86/kvm/x86.c-                   | KVM_STATE_NESTED_EVMCS | KVM_STATE_NESTED_MTF_PENDING
arch/x86/kvm/x86.c-                   | KVM_STATE_NESTED_GIF_SET))
arch/x86/kvm/x86.c-                     break;

If KVM_STATE_NESTED_EVMCS is unsupported (KVM before 8cab6507f64), this
check rejects the state and the ioctl fails.
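
A stand-alone sketch of why that check blocks the migration, assuming a
destination kernel whose supported-flag mask predates EVMCS. The `DEMO_*`
constants and `nested_flags_accepted()` are hypothetical names for
illustration, not KVM's definitions; only the mask-and-reject pattern is
taken from the snippet above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits; the real values live in the KVM UAPI headers. */
#define DEMO_NESTED_GUEST_MODE  (1u << 0)
#define DEMO_NESTED_RUN_PENDING (1u << 1)
#define DEMO_NESTED_EVMCS       (1u << 2)

/* Flags known to a kernel without EVMCS nested state migration. */
#define DEMO_SUPPORTED_OLD (DEMO_NESTED_GUEST_MODE | DEMO_NESTED_RUN_PENDING)
/* Flags known to a kernel that has it. */
#define DEMO_SUPPORTED_NEW (DEMO_SUPPORTED_OLD | DEMO_NESTED_EVMCS)

/* Accept the incoming nested state only if it carries no unknown flag. */
static bool nested_flags_accepted(uint32_t flags, uint32_t supported)
{
    assert((supported & DEMO_SUPPORTED_OLD) == DEMO_SUPPORTED_OLD);
    return (flags & ~supported) == 0;
}
```

A state saved with the EVMCS flag set passes against `DEMO_SUPPORTED_NEW`
and is rejected against `DEMO_SUPPORTED_OLD`, which is the case of a
destination that does not know the flag.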

In reality, regardless of evmcs, there are so many bugs in nested
migration prior to 5.11 (maybe 5.10) that the chances of a successful
migration on these kernels are pretty slim. Worse, the migration
won't fail: your guest (with its nested guests) will just crash. I don't
recommend anyone do nested migration in production with an older
kernel, it's just not safe.

>  
>> Anyway, I feel we're walking in circles. I'm ready to just drop all
>> EVMCS related bits from this series to get it merged. This will make the
>> part which you hate the most ("custom parsers") go away too. We can discuss
>> EVMCS to death after that and, when we finally decide that user
>> convenience is actually worth something, we can add 'hv-evmcs' to the
>> new 
>
> fine with me, it would be even easier to review if it were just a patch that
> would add 'hv-defaults', without non must have refactoring.
> (cleanup/refactoring could be another series)
>

Ok, v5 is on the list.

> If we can guarantee that hv-evmcs won't flip on/off on all supported kernels,
> I'm also fine with keeping it in default set and error-ing out if vmx ends up
> in off state.

I still don't see why and how it should (or could) flip.

> However if it changes, we need to expose 'default set' to mgmt somehow,
> so it will know that hosts aren't compatible (instead of finding out it hard way
> in form of failed migration (assuming it fails))

The first part of this series does exactly that: exposes Hyper-V
enlightenments to upper layers in QMP. An upper layer can do e.g. 

query-cpu-model-expansion type=full model={"name":"host","props":{"hv-passthrough":true}}

and it will get all the supported features, including evmcs. This is
one of the reasons I insist this should *always* work instead of
throwing an error (with 'hv-default' too).

-- 
Vitaly



Thread overview: 58+ messages
2021-02-10 16:40 [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
2021-02-10 16:40 ` [PATCH v4 01/21] i386: keep hyperv_vendor string up-to-date Vitaly Kuznetsov
2021-02-10 16:40 ` [PATCH v4 02/21] i386: invert hyperv_spinlock_attempts setting logic with hv_passthrough Vitaly Kuznetsov
2021-02-10 16:40 ` [PATCH v4 03/21] i386: always fill Hyper-V CPUID feature leaves from X86CPU data Vitaly Kuznetsov
2021-02-10 16:40 ` [PATCH v4 04/21] i386: stop using env->features[] for filling Hyper-V CPUIDs Vitaly Kuznetsov
2021-02-10 16:40 ` [PATCH v4 05/21] i386: introduce hyperv_feature_supported() Vitaly Kuznetsov
2021-02-10 16:40 ` [PATCH v4 06/21] i386: introduce hv_cpuid_get_host() Vitaly Kuznetsov
2021-02-10 16:40 ` [PATCH v4 07/21] i386: drop FEAT_HYPERV feature leaves Vitaly Kuznetsov
2021-02-10 16:40 ` [PATCH v4 08/21] i386: introduce hv_cpuid_cache Vitaly Kuznetsov
2021-02-10 16:40 ` [PATCH v4 09/21] i386: split hyperv_handle_properties() into hyperv_expand_features()/hyperv_fill_cpuids() Vitaly Kuznetsov
2021-02-10 16:40 ` [PATCH v4 10/21] i386: move eVMCS enablement to hyperv_init_vcpu() Vitaly Kuznetsov
2021-02-10 16:40 ` [PATCH v4 11/21] i386: switch hyperv_expand_features() to using error_setg() Vitaly Kuznetsov
2021-02-10 16:40 ` [PATCH v4 12/21] i386: adjust the expected KVM_GET_SUPPORTED_HV_CPUID array size Vitaly Kuznetsov
2021-02-10 16:40 ` [PATCH v4 13/21] i386: prefer system KVM_GET_SUPPORTED_HV_CPUID ioctl over vCPU's one Vitaly Kuznetsov
2021-02-10 16:40 ` [PATCH v4 14/21] i386: use global kvm_state in hyperv_enabled() check Vitaly Kuznetsov
2021-02-10 16:40 ` [PATCH v4 15/21] i386: expand Hyper-V features during CPU feature expansion time Vitaly Kuznetsov
2021-02-10 16:40 ` [PATCH v4 16/21] i386: track explicit 'hv-*' features enablement/disablement Vitaly Kuznetsov
2021-02-11 17:35   ` Igor Mammedov
2021-02-12  8:45     ` Vitaly Kuznetsov
2021-02-12 14:12       ` Igor Mammedov
2021-02-12 15:19         ` Vitaly Kuznetsov
2021-02-12 15:26           ` Vitaly Kuznetsov
2021-02-12 16:05             ` Igor Mammedov
2021-02-15  8:56               ` Vitaly Kuznetsov
2021-02-15 15:55                 ` Igor Mammedov
2021-02-15 17:05                   ` Igor Mammedov
2021-02-15 18:12                   ` Vitaly Kuznetsov
2021-02-12 16:01           ` Igor Mammedov
2021-02-15  8:53             ` Vitaly Kuznetsov
2021-02-15 10:48               ` Andrew Jones
2021-02-15 17:01               ` Igor Mammedov
2021-02-15 18:11                 ` Vitaly Kuznetsov
2021-02-22 10:20                   ` Vitaly Kuznetsov
2021-02-23 15:19                     ` Igor Mammedov
2021-02-23 15:46                       ` Vitaly Kuznetsov
2021-02-23 17:48                         ` Igor Mammedov
2021-02-23 18:08                           ` Vitaly Kuznetsov
2021-02-24 16:06                             ` Igor Mammedov
2021-02-24 17:00                               ` Vitaly Kuznetsov
2021-03-01 15:32                                 ` Igor Mammedov
2021-03-01 16:22                                   ` Vitaly Kuznetsov
2021-02-10 16:40 ` [PATCH v4 17/21] i386: support 'hv-passthrough, hv-feature=off' on the command line Vitaly Kuznetsov
2021-02-11 17:14   ` Igor Mammedov
2021-02-12  8:49     ` Vitaly Kuznetsov
2021-02-12  9:29       ` David Edmondson
2021-02-12 13:52       ` Igor Mammedov
2021-02-10 16:40 ` [PATCH v4 18/21] i386: be more picky about implicit 'hv-evmcs' enablement Vitaly Kuznetsov
2021-02-10 16:40 ` [PATCH v4 19/21] i386: introduce kvm_hv_evmcs_available() Vitaly Kuznetsov
2021-02-10 16:40 ` [PATCH v4 20/21] i386: provide simple 'hv-default=on' option Vitaly Kuznetsov
2021-02-11 17:23   ` Igor Mammedov
2021-02-12  8:52     ` Vitaly Kuznetsov
2021-02-10 16:40 ` [PATCH v4 21/21] qtest/hyperv: Introduce a simple hyper-v test Vitaly Kuznetsov
2021-02-10 16:56 ` [PATCH v4 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Daniel P. Berrangé
2021-02-10 17:46   ` Eduardo Habkost
2021-02-11  8:30     ` Vitaly Kuznetsov
2021-02-11  9:14       ` Daniel P. Berrangé
2021-02-11  9:34         ` Vitaly Kuznetsov
2021-02-11 10:14           ` Daniel P. Berrangé
