All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v3 0/2] i386/kvm: TSC page clocksource for Hyper-V-on-KVM fixes
@ 2018-03-20 17:34 Vitaly Kuznetsov
  2018-03-20 17:34 ` [Qemu-devel] [PATCH v3 1/2] i386/kvm: add support for Hyper-V reenlightenment MSRs Vitaly Kuznetsov
  2018-03-20 17:35 ` [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure Vitaly Kuznetsov
  0 siblings, 2 replies; 25+ messages in thread
From: Vitaly Kuznetsov @ 2018-03-20 17:34 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Richard Henderson, Eduardo Habkost, Marcelo Tosatti, qemu-devel,
	Roman Kagan

Changes since v2:
- add hv-reenlightenment CPU property [Roman Kagan, Paolo Bonzini]
- add a comment to feature_word_info [Roman Kagan]

Previously, Ladi was working on enabling TSC page clocksource for nested
Hyper-V-on-KVM workloads. He found out that if Hyper-V frequency MSRs are
exposed to L1 as well as INVTSC flag Hyper-V enables TSC page clocksource
to its guests. Qemu doesn't pass INVTSC by default as it is a migration
blocker.

I found out there's a different way to make Hyper-V like us: expose
Reenlightenment MSRs to it. KVM doesn't fully support the feature as
we're still unable to migrate nested environments but rudimentary support
we have there (kvm/queue only currently) is enough.

Enable Hyper-V reenlightenment MSRs and expose frequency MSRs even without
INVTSC to make things work.

Vitaly Kuznetsov (2):
  i386/kvm: add support for Hyper-V reenlightenment MSRs
  i386/kvm: lower requirements for Hyper-V frequency MSRs exposure

 target/i386/cpu.c          |  4 +++-
 target/i386/cpu.h          |  4 ++++
 target/i386/hyperv-proto.h |  9 ++++++++-
 target/i386/kvm.c          | 41 +++++++++++++++++++++++++++++++++++++++--
 target/i386/machine.c      | 24 ++++++++++++++++++++++++
 5 files changed, 78 insertions(+), 4 deletions(-)

-- 
2.14.3

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v3 1/2] i386/kvm: add support for Hyper-V reenlightenment MSRs
  2018-03-20 17:34 [Qemu-devel] [PATCH v3 0/2] i386/kvm: TSC page clocksource for Hyper-V-on-KVM fixes Vitaly Kuznetsov
@ 2018-03-20 17:34 ` Vitaly Kuznetsov
  2018-03-20 18:32   ` Eduardo Habkost
                     ` (2 more replies)
  2018-03-20 17:35 ` [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure Vitaly Kuznetsov
  1 sibling, 3 replies; 25+ messages in thread
From: Vitaly Kuznetsov @ 2018-03-20 17:34 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Richard Henderson, Eduardo Habkost, Marcelo Tosatti, qemu-devel,
	Roman Kagan

KVM recently gained support for Hyper-V Reenlightenment MSRs which are
required to make KVM-on-Hyper-V enable TSC page clocksource to its guests
when INVTSC is not passed to it (and it is not passed by default in Qemu
as it effectively blocks migration).

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
Changes since v2:
- add hv-reenlightenment CPU property [Roman Kagan, Paolo Bonzini]
- add a comment to feature_word_info [Roman Kagan]
---
 target/i386/cpu.c          |  4 +++-
 target/i386/cpu.h          |  4 ++++
 target/i386/hyperv-proto.h |  9 ++++++++-
 target/i386/kvm.c          | 39 ++++++++++++++++++++++++++++++++++++++-
 target/i386/machine.c      | 24 ++++++++++++++++++++++++
 5 files changed, 77 insertions(+), 3 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 6bb4ce8719..02579f8234 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -407,7 +407,8 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
             NULL /* hv_vpindex_access */, NULL /* hv_msr_reset_access */,
             NULL /* hv_msr_stats_access */, NULL /* hv_reftsc_access */,
             NULL /* hv_msr_idle_access */, NULL /* hv_msr_frequency_access */,
-            NULL, NULL, NULL, NULL,
+            NULL /* hv_msr_debug_access */, NULL /* hv_msr_reenlightenment_access */,
+            NULL, NULL,
             NULL, NULL, NULL, NULL,
             NULL, NULL, NULL, NULL,
             NULL, NULL, NULL, NULL,
@@ -4764,6 +4765,7 @@ static Property x86_cpu_properties[] = {
     DEFINE_PROP_BOOL("hv-runtime", X86CPU, hyperv_runtime, false),
     DEFINE_PROP_BOOL("hv-synic", X86CPU, hyperv_synic, false),
     DEFINE_PROP_BOOL("hv-stimer", X86CPU, hyperv_stimer, false),
+    DEFINE_PROP_BOOL("hv-reenlightenment", X86CPU, hyperv_reenlightenment, false),
     DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),
     DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
     DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 2e2bab5ff3..98eed72937 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1174,6 +1174,9 @@ typedef struct CPUX86State {
     uint64_t msr_hv_synic_sint[HV_SINT_COUNT];
     uint64_t msr_hv_stimer_config[HV_STIMER_COUNT];
     uint64_t msr_hv_stimer_count[HV_STIMER_COUNT];
+    uint64_t msr_hv_reenlightenment_control;
+    uint64_t msr_hv_tsc_emulation_control;
+    uint64_t msr_hv_tsc_emulation_status;
 
     uint64_t msr_rtit_ctrl;
     uint64_t msr_rtit_status;
@@ -1296,6 +1299,7 @@ struct X86CPU {
     bool hyperv_runtime;
     bool hyperv_synic;
     bool hyperv_stimer;
+    bool hyperv_reenlightenment;
     bool check_cpuid;
     bool enforce_cpuid;
     bool expose_kvm;
diff --git a/target/i386/hyperv-proto.h b/target/i386/hyperv-proto.h
index cb4d7f2b7a..93352ebd2a 100644
--- a/target/i386/hyperv-proto.h
+++ b/target/i386/hyperv-proto.h
@@ -35,7 +35,7 @@
 #define HV_RESET_AVAILABLE           (1u << 7)
 #define HV_REFERENCE_TSC_AVAILABLE   (1u << 9)
 #define HV_ACCESS_FREQUENCY_MSRS     (1u << 11)
-
+#define HV_ACCESS_REENLIGHTENMENTS_CONTROL  (1u << 13)
 
 /*
  * HV_CPUID_FEATURES.EDX bits
@@ -129,6 +129,13 @@
 #define HV_X64_MSR_CRASH_CTL                    0x40000105
 #define HV_CRASH_CTL_NOTIFY                     (1ull << 63)
 
+/*
+ * Reenlightenment notification MSRs
+ */
+#define HV_X64_MSR_REENLIGHTENMENT_CONTROL      0x40000106
+#define HV_X64_MSR_TSC_EMULATION_CONTROL        0x40000107
+#define HV_X64_MSR_TSC_EMULATION_STATUS         0x40000108
+
 /*
  * Hypercall status code
  */
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index d23fff12f5..7d9f9ca0b1 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -90,6 +90,7 @@ static bool has_msr_hv_runtime;
 static bool has_msr_hv_synic;
 static bool has_msr_hv_stimer;
 static bool has_msr_hv_frequencies;
+static bool has_msr_hv_reenlightenment;
 static bool has_msr_xss;
 static bool has_msr_spec_ctrl;
 static bool has_msr_smi_count;
@@ -583,7 +584,8 @@ static bool hyperv_enabled(X86CPU *cpu)
             cpu->hyperv_vpindex ||
             cpu->hyperv_runtime ||
             cpu->hyperv_synic ||
-            cpu->hyperv_stimer);
+            cpu->hyperv_stimer ||
+            cpu->hyperv_reenlightenment);
 }
 
 static int kvm_arch_set_tsc_khz(CPUState *cs)
@@ -654,6 +656,14 @@ static int hyperv_handle_properties(CPUState *cs)
             env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
         }
     }
+    if (cpu->hyperv_reenlightenment) {
+        if (!has_msr_hv_reenlightenment) {
+            fprintf(stderr,
+                    "Hyper-V Reenlightenment is not supported by kernel\n");
+            return -ENOSYS;
+        }
+        env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_REENLIGHTENMENTS_CONTROL;
+    }
     if (cpu->hyperv_crash && has_msr_hv_crash) {
         env->features[FEAT_HYPERV_EDX] |= HV_GUEST_CRASH_MSR_AVAILABLE;
     }
@@ -1185,6 +1195,9 @@ static int kvm_get_supported_msrs(KVMState *s)
                 case HV_X64_MSR_TSC_FREQUENCY:
                     has_msr_hv_frequencies = true;
                     break;
+                case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
+                    has_msr_hv_reenlightenment = true;
+                    break;
                 case MSR_IA32_SPEC_CTRL:
                     has_msr_spec_ctrl = true;
                     break;
@@ -1747,6 +1760,15 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
             if (cpu->hyperv_time) {
                 kvm_msr_entry_add(cpu, HV_X64_MSR_REFERENCE_TSC,
                                   env->msr_hv_tsc);
+
+                if (has_msr_hv_reenlightenment) {
+                    kvm_msr_entry_add(cpu, HV_X64_MSR_REENLIGHTENMENT_CONTROL,
+                                      env->msr_hv_reenlightenment_control);
+                    kvm_msr_entry_add(cpu, HV_X64_MSR_TSC_EMULATION_CONTROL,
+                                      env->msr_hv_tsc_emulation_control);
+                    kvm_msr_entry_add(cpu, HV_X64_MSR_TSC_EMULATION_STATUS,
+                                      env->msr_hv_tsc_emulation_status);
+                }
             }
         }
         if (cpu->hyperv_vapic) {
@@ -2109,6 +2131,12 @@ static int kvm_get_msrs(X86CPU *cpu)
     }
     if (cpu->hyperv_time) {
         kvm_msr_entry_add(cpu, HV_X64_MSR_REFERENCE_TSC, 0);
+
+        if (has_msr_hv_reenlightenment) {
+            kvm_msr_entry_add(cpu, HV_X64_MSR_REENLIGHTENMENT_CONTROL, 0);
+            kvm_msr_entry_add(cpu, HV_X64_MSR_TSC_EMULATION_CONTROL, 0);
+            kvm_msr_entry_add(cpu, HV_X64_MSR_TSC_EMULATION_STATUS, 0);
+        }
     }
     if (has_msr_hv_crash) {
         int j;
@@ -2367,6 +2395,15 @@ static int kvm_get_msrs(X86CPU *cpu)
             env->msr_hv_stimer_count[(index - HV_X64_MSR_STIMER0_COUNT)/2] =
                                 msrs[i].data;
             break;
+        case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
+            env->msr_hv_reenlightenment_control = msrs[i].data;
+            break;
+        case HV_X64_MSR_TSC_EMULATION_CONTROL:
+            env->msr_hv_tsc_emulation_control = msrs[i].data;
+            break;
+        case HV_X64_MSR_TSC_EMULATION_STATUS:
+            env->msr_hv_tsc_emulation_status = msrs[i].data;
+            break;
         case MSR_MTRRdefType:
             env->mtrr_deftype = msrs[i].data;
             break;
diff --git a/target/i386/machine.c b/target/i386/machine.c
index bd2d82e91b..fd99c0bbb4 100644
--- a/target/i386/machine.c
+++ b/target/i386/machine.c
@@ -713,6 +713,29 @@ static const VMStateDescription vmstate_msr_hyperv_stimer = {
     }
 };
 
+static bool hyperv_reenlightenment_enable_needed(void *opaque)
+{
+    X86CPU *cpu = opaque;
+    CPUX86State *env = &cpu->env;
+
+    return env->msr_hv_reenlightenment_control != 0 ||
+        env->msr_hv_tsc_emulation_control != 0 ||
+        env->msr_hv_tsc_emulation_status != 0;
+}
+
+static const VMStateDescription vmstate_msr_hyperv_reenlightenment = {
+    .name = "cpu/msr_hyperv_reenlightenment",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = hyperv_reenlightenment_enable_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT64(env.msr_hv_reenlightenment_control, X86CPU),
+        VMSTATE_UINT64(env.msr_hv_tsc_emulation_control, X86CPU),
+        VMSTATE_UINT64(env.msr_hv_tsc_emulation_status, X86CPU),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static bool avx512_needed(void *opaque)
 {
     X86CPU *cpu = opaque;
@@ -1005,6 +1028,7 @@ VMStateDescription vmstate_x86_cpu = {
         &vmstate_msr_hyperv_runtime,
         &vmstate_msr_hyperv_synic,
         &vmstate_msr_hyperv_stimer,
+        &vmstate_msr_hyperv_reenlightenment,
         &vmstate_avx512,
         &vmstate_xss,
         &vmstate_tsc_khz,
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure
  2018-03-20 17:34 [Qemu-devel] [PATCH v3 0/2] i386/kvm: TSC page clocksource for Hyper-V-on-KVM fixes Vitaly Kuznetsov
  2018-03-20 17:34 ` [Qemu-devel] [PATCH v3 1/2] i386/kvm: add support for Hyper-V reenlightenment MSRs Vitaly Kuznetsov
@ 2018-03-20 17:35 ` Vitaly Kuznetsov
  2018-03-21 12:10   ` Roman Kagan
  2018-03-21 15:33   ` Paolo Bonzini
  1 sibling, 2 replies; 25+ messages in thread
From: Vitaly Kuznetsov @ 2018-03-20 17:35 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Richard Henderson, Eduardo Habkost, Marcelo Tosatti, qemu-devel,
	Roman Kagan

Requiring tsc_is_stable_and_known() is too restrictive: even without INVTCS
nested Hyper-V-on-KVM enables TSC pages for its guests e.g. when
Reenlightenment MSRs are present. Presence of frequency MSRs doesn't mean
these frequencies are stable, it just means they're available for reading.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 7d9f9ca0b1..74fc3d3b2c 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -651,7 +651,7 @@ static int hyperv_handle_properties(CPUState *cs)
         env->features[FEAT_HYPERV_EAX] |= HV_TIME_REF_COUNT_AVAILABLE;
         env->features[FEAT_HYPERV_EAX] |= HV_REFERENCE_TSC_AVAILABLE;
 
-        if (has_msr_hv_frequencies && tsc_is_stable_and_known(env)) {
+        if (has_msr_hv_frequencies && env->tsc_khz) {
             env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_FREQUENCY_MSRS;
             env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
         }
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/2] i386/kvm: add support for Hyper-V reenlightenment MSRs
  2018-03-20 17:34 ` [Qemu-devel] [PATCH v3 1/2] i386/kvm: add support for Hyper-V reenlightenment MSRs Vitaly Kuznetsov
@ 2018-03-20 18:32   ` Eduardo Habkost
  2018-03-21 11:09     ` Roman Kagan
  2018-03-21 13:09     ` Roman Kagan
  2018-03-21 11:24   ` Roman Kagan
  2018-03-22 17:09   ` Marcelo Tosatti
  2 siblings, 2 replies; 25+ messages in thread
From: Eduardo Habkost @ 2018-03-20 18:32 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Richard Henderson, Marcelo Tosatti, qemu-devel,
	Roman Kagan

On Tue, Mar 20, 2018 at 06:34:59PM +0100, Vitaly Kuznetsov wrote:
> KVM recently gained support for Hyper-V Reenlightenment MSRs which are
> required to make KVM-on-Hyper-V enable TSC page clocksource to its guests
> when INVTSC is not passed to it (and it is not passed by default in Qemu
> as it effectively blocks migration).
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
> Changes since v2:
> - add hv-reenlightenment CPU property [Roman Kagan, Paolo Bonzini]
> - add a comment to feature_word_info [Roman Kagan]
> ---
>  target/i386/cpu.c          |  4 +++-
>  target/i386/cpu.h          |  4 ++++
>  target/i386/hyperv-proto.h |  9 ++++++++-
>  target/i386/kvm.c          | 39 ++++++++++++++++++++++++++++++++++++++-
>  target/i386/machine.c      | 24 ++++++++++++++++++++++++
>  5 files changed, 77 insertions(+), 3 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 6bb4ce8719..02579f8234 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -407,7 +407,8 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
>              NULL /* hv_vpindex_access */, NULL /* hv_msr_reset_access */,
>              NULL /* hv_msr_stats_access */, NULL /* hv_reftsc_access */,
>              NULL /* hv_msr_idle_access */, NULL /* hv_msr_frequency_access */,
> -            NULL, NULL, NULL, NULL,
> +            NULL /* hv_msr_debug_access */, NULL /* hv_msr_reenlightenment_access */,
> +            NULL, NULL,
>              NULL, NULL, NULL, NULL,
>              NULL, NULL, NULL, NULL,
>              NULL, NULL, NULL, NULL,
> @@ -4764,6 +4765,7 @@ static Property x86_cpu_properties[] = {
>      DEFINE_PROP_BOOL("hv-runtime", X86CPU, hyperv_runtime, false),
>      DEFINE_PROP_BOOL("hv-synic", X86CPU, hyperv_synic, false),
>      DEFINE_PROP_BOOL("hv-stimer", X86CPU, hyperv_stimer, false),
> +    DEFINE_PROP_BOOL("hv-reenlightenment", X86CPU, hyperv_reenlightenment, false),

Property is set to false by default, so compatibility is kept.

>      DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),
>      DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
>      DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 2e2bab5ff3..98eed72937 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1174,6 +1174,9 @@ typedef struct CPUX86State {
>      uint64_t msr_hv_synic_sint[HV_SINT_COUNT];
>      uint64_t msr_hv_stimer_config[HV_STIMER_COUNT];
>      uint64_t msr_hv_stimer_count[HV_STIMER_COUNT];
> +    uint64_t msr_hv_reenlightenment_control;
> +    uint64_t msr_hv_tsc_emulation_control;
> +    uint64_t msr_hv_tsc_emulation_status;
>  
>      uint64_t msr_rtit_ctrl;
>      uint64_t msr_rtit_status;
> @@ -1296,6 +1299,7 @@ struct X86CPU {
>      bool hyperv_runtime;
>      bool hyperv_synic;
>      bool hyperv_stimer;
> +    bool hyperv_reenlightenment;
>      bool check_cpuid;
>      bool enforce_cpuid;
>      bool expose_kvm;
> diff --git a/target/i386/hyperv-proto.h b/target/i386/hyperv-proto.h
> index cb4d7f2b7a..93352ebd2a 100644
> --- a/target/i386/hyperv-proto.h
> +++ b/target/i386/hyperv-proto.h
> @@ -35,7 +35,7 @@
>  #define HV_RESET_AVAILABLE           (1u << 7)
>  #define HV_REFERENCE_TSC_AVAILABLE   (1u << 9)
>  #define HV_ACCESS_FREQUENCY_MSRS     (1u << 11)
> -
> +#define HV_ACCESS_REENLIGHTENMENTS_CONTROL  (1u << 13)
>  
>  /*
>   * HV_CPUID_FEATURES.EDX bits
> @@ -129,6 +129,13 @@
>  #define HV_X64_MSR_CRASH_CTL                    0x40000105
>  #define HV_CRASH_CTL_NOTIFY                     (1ull << 63)
>  
> +/*
> + * Reenlightenment notification MSRs
> + */
> +#define HV_X64_MSR_REENLIGHTENMENT_CONTROL      0x40000106
> +#define HV_X64_MSR_TSC_EMULATION_CONTROL        0x40000107
> +#define HV_X64_MSR_TSC_EMULATION_STATUS         0x40000108
> +
>  /*
>   * Hypercall status code
>   */
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index d23fff12f5..7d9f9ca0b1 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -90,6 +90,7 @@ static bool has_msr_hv_runtime;
>  static bool has_msr_hv_synic;
>  static bool has_msr_hv_stimer;
>  static bool has_msr_hv_frequencies;
> +static bool has_msr_hv_reenlightenment;
>  static bool has_msr_xss;
>  static bool has_msr_spec_ctrl;
>  static bool has_msr_smi_count;
> @@ -583,7 +584,8 @@ static bool hyperv_enabled(X86CPU *cpu)
>              cpu->hyperv_vpindex ||
>              cpu->hyperv_runtime ||
>              cpu->hyperv_synic ||
> -            cpu->hyperv_stimer);
> +            cpu->hyperv_stimer ||
> +            cpu->hyperv_reenlightenment);
>  }
>  
>  static int kvm_arch_set_tsc_khz(CPUState *cs)
> @@ -654,6 +656,14 @@ static int hyperv_handle_properties(CPUState *cs)
>              env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
>          }
>      }
> +    if (cpu->hyperv_reenlightenment) {
> +        if (!has_msr_hv_reenlightenment) {
> +            fprintf(stderr,
> +                    "Hyper-V Reenlightenment is not supported by kernel\n");
> +            return -ENOSYS;
> +        }
> +        env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_REENLIGHTENMENTS_CONTROL;
> +    }
>      if (cpu->hyperv_crash && has_msr_hv_crash) {
>          env->features[FEAT_HYPERV_EDX] |= HV_GUEST_CRASH_MSR_AVAILABLE;
>      }
> @@ -1185,6 +1195,9 @@ static int kvm_get_supported_msrs(KVMState *s)
>                  case HV_X64_MSR_TSC_FREQUENCY:
>                      has_msr_hv_frequencies = true;
>                      break;
> +                case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
> +                    has_msr_hv_reenlightenment = true;
> +                    break;
>                  case MSR_IA32_SPEC_CTRL:
>                      has_msr_spec_ctrl = true;
>                      break;
> @@ -1747,6 +1760,15 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>              if (cpu->hyperv_time) {
>                  kvm_msr_entry_add(cpu, HV_X64_MSR_REFERENCE_TSC,
>                                    env->msr_hv_tsc);
> +
> +                if (has_msr_hv_reenlightenment) {

I see that the current code is inconsistent: some entries check
for has_msr_hv_*, other entries check cpu->hyperv_*.

I suggest changing all of them (including this one) to check
cpu->hyperv_* instead.

The difference between both approaches is that checking just
has_msr_hv_* would let a non-cooperating guest prevent itself
from being migrated to an older host by writing a non-zero value
to a MSR, even if hyperv support was not enabled in the VM
configuration at all.  I don't think we want that.


> +                    kvm_msr_entry_add(cpu, HV_X64_MSR_REENLIGHTENMENT_CONTROL,
> +                                      env->msr_hv_reenlightenment_control);
> +                    kvm_msr_entry_add(cpu, HV_X64_MSR_TSC_EMULATION_CONTROL,
> +                                      env->msr_hv_tsc_emulation_control);
> +                    kvm_msr_entry_add(cpu, HV_X64_MSR_TSC_EMULATION_STATUS,
> +                                      env->msr_hv_tsc_emulation_status);

The 3 MSRs are added by the same KVM commit, so setting all 3
based on the same has_msr_hv_*/cpu->hyperv_* flag is OK.

The rest of the patch looks good to me.


> +                }
>              }
>          }
>          if (cpu->hyperv_vapic) {
> @@ -2109,6 +2131,12 @@ static int kvm_get_msrs(X86CPU *cpu)
>      }
>      if (cpu->hyperv_time) {
>          kvm_msr_entry_add(cpu, HV_X64_MSR_REFERENCE_TSC, 0);
> +
> +        if (has_msr_hv_reenlightenment) {
> +            kvm_msr_entry_add(cpu, HV_X64_MSR_REENLIGHTENMENT_CONTROL, 0);
> +            kvm_msr_entry_add(cpu, HV_X64_MSR_TSC_EMULATION_CONTROL, 0);
> +            kvm_msr_entry_add(cpu, HV_X64_MSR_TSC_EMULATION_STATUS, 0);
> +        }
>      }
>      if (has_msr_hv_crash) {
>          int j;
> @@ -2367,6 +2395,15 @@ static int kvm_get_msrs(X86CPU *cpu)
>              env->msr_hv_stimer_count[(index - HV_X64_MSR_STIMER0_COUNT)/2] =
>                                  msrs[i].data;
>              break;
> +        case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
> +            env->msr_hv_reenlightenment_control = msrs[i].data;
> +            break;
> +        case HV_X64_MSR_TSC_EMULATION_CONTROL:
> +            env->msr_hv_tsc_emulation_control = msrs[i].data;
> +            break;
> +        case HV_X64_MSR_TSC_EMULATION_STATUS:
> +            env->msr_hv_tsc_emulation_status = msrs[i].data;
> +            break;
>          case MSR_MTRRdefType:
>              env->mtrr_deftype = msrs[i].data;
>              break;
> diff --git a/target/i386/machine.c b/target/i386/machine.c
> index bd2d82e91b..fd99c0bbb4 100644
> --- a/target/i386/machine.c
> +++ b/target/i386/machine.c
> @@ -713,6 +713,29 @@ static const VMStateDescription vmstate_msr_hyperv_stimer = {
>      }
>  };
>  
> +static bool hyperv_reenlightenment_enable_needed(void *opaque)
> +{
> +    X86CPU *cpu = opaque;
> +    CPUX86State *env = &cpu->env;
> +
> +    return env->msr_hv_reenlightenment_control != 0 ||
> +        env->msr_hv_tsc_emulation_control != 0 ||
> +        env->msr_hv_tsc_emulation_status != 0;
> +}
> +
> +static const VMStateDescription vmstate_msr_hyperv_reenlightenment = {
> +    .name = "cpu/msr_hyperv_reenlightenment",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .needed = hyperv_reenlightenment_enable_needed,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_UINT64(env.msr_hv_reenlightenment_control, X86CPU),
> +        VMSTATE_UINT64(env.msr_hv_tsc_emulation_control, X86CPU),
> +        VMSTATE_UINT64(env.msr_hv_tsc_emulation_status, X86CPU),
> +        VMSTATE_END_OF_LIST()
> +    }
> +};
> +
>  static bool avx512_needed(void *opaque)
>  {
>      X86CPU *cpu = opaque;
> @@ -1005,6 +1028,7 @@ VMStateDescription vmstate_x86_cpu = {
>          &vmstate_msr_hyperv_runtime,
>          &vmstate_msr_hyperv_synic,
>          &vmstate_msr_hyperv_stimer,
> +        &vmstate_msr_hyperv_reenlightenment,
>          &vmstate_avx512,
>          &vmstate_xss,
>          &vmstate_tsc_khz,
> -- 
> 2.14.3
> 

-- 
Eduardo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/2] i386/kvm: add support for Hyper-V reenlightenment MSRs
  2018-03-20 18:32   ` Eduardo Habkost
@ 2018-03-21 11:09     ` Roman Kagan
  2018-03-21 13:09     ` Roman Kagan
  1 sibling, 0 replies; 25+ messages in thread
From: Roman Kagan @ 2018-03-21 11:09 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Vitaly Kuznetsov, Paolo Bonzini, Richard Henderson,
	Marcelo Tosatti, qemu-devel

On Tue, Mar 20, 2018 at 03:32:27PM -0300, Eduardo Habkost wrote:
> On Tue, Mar 20, 2018 at 06:34:59PM +0100, Vitaly Kuznetsov wrote:
> > @@ -1747,6 +1760,15 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
> >              if (cpu->hyperv_time) {
> >                  kvm_msr_entry_add(cpu, HV_X64_MSR_REFERENCE_TSC,
> >                                    env->msr_hv_tsc);
> > +
> > +                if (has_msr_hv_reenlightenment) {
> 
> I see that the current code is inconsistent: some entries check
> for has_msr_hv_*, other entries check cpu->hyperv_*.
> 
> I suggest changing all of them (including this one) to check
> cpu->hyperv_* instead.
> 
> The difference between both approaches is that checking just
> has_msr_hv_* would let a non-cooperating guest prevent itself
> from being migrated to an older host by writing a non-zero value
> to a MSR, even if hyperv support was not enabled in the VM
> configuration at all.  I don't think we want that.

Agreed.  We accumulated a number of these over time; it's mostly my
fault, so I don't feel it's just to ask Vitaly to fix the existing ones,
but let's not add new ones.  

Roman.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/2] i386/kvm: add support for Hyper-V reenlightenment MSRs
  2018-03-20 17:34 ` [Qemu-devel] [PATCH v3 1/2] i386/kvm: add support for Hyper-V reenlightenment MSRs Vitaly Kuznetsov
  2018-03-20 18:32   ` Eduardo Habkost
@ 2018-03-21 11:24   ` Roman Kagan
  2018-03-22 17:09   ` Marcelo Tosatti
  2 siblings, 0 replies; 25+ messages in thread
From: Roman Kagan @ 2018-03-21 11:24 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	Marcelo Tosatti, qemu-devel

On Tue, Mar 20, 2018 at 06:34:59PM +0100, Vitaly Kuznetsov wrote:
> KVM recently gained support for Hyper-V Reenlightenment MSRs which are
> required to make KVM-on-Hyper-V enable TSC page clocksource to its guests
> when INVTSC is not passed to it (and it is not passed by default in Qemu
> as it effectively blocks migration).
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
> Changes since v2:
> - add hv-reenlightenment CPU property [Roman Kagan, Paolo Bonzini]
> - add a comment to feature_word_info [Roman Kagan]
> ---
>  target/i386/cpu.c          |  4 +++-
>  target/i386/cpu.h          |  4 ++++
>  target/i386/hyperv-proto.h |  9 ++++++++-
>  target/i386/kvm.c          | 39 ++++++++++++++++++++++++++++++++++++++-
>  target/i386/machine.c      | 24 ++++++++++++++++++++++++
>  5 files changed, 77 insertions(+), 3 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 6bb4ce8719..02579f8234 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -407,7 +407,8 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
>              NULL /* hv_vpindex_access */, NULL /* hv_msr_reset_access */,
>              NULL /* hv_msr_stats_access */, NULL /* hv_reftsc_access */,
>              NULL /* hv_msr_idle_access */, NULL /* hv_msr_frequency_access */,
> -            NULL, NULL, NULL, NULL,
> +            NULL /* hv_msr_debug_access */, NULL /* hv_msr_reenlightenment_access */,
> +            NULL, NULL,
>              NULL, NULL, NULL, NULL,
>              NULL, NULL, NULL, NULL,
>              NULL, NULL, NULL, NULL,
> @@ -4764,6 +4765,7 @@ static Property x86_cpu_properties[] = {
>      DEFINE_PROP_BOOL("hv-runtime", X86CPU, hyperv_runtime, false),
>      DEFINE_PROP_BOOL("hv-synic", X86CPU, hyperv_synic, false),
>      DEFINE_PROP_BOOL("hv-stimer", X86CPU, hyperv_stimer, false),
> +    DEFINE_PROP_BOOL("hv-reenlightenment", X86CPU, hyperv_reenlightenment, false),
>      DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),
>      DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
>      DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 2e2bab5ff3..98eed72937 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1174,6 +1174,9 @@ typedef struct CPUX86State {
>      uint64_t msr_hv_synic_sint[HV_SINT_COUNT];
>      uint64_t msr_hv_stimer_config[HV_STIMER_COUNT];
>      uint64_t msr_hv_stimer_count[HV_STIMER_COUNT];
> +    uint64_t msr_hv_reenlightenment_control;
> +    uint64_t msr_hv_tsc_emulation_control;
> +    uint64_t msr_hv_tsc_emulation_status;
>  
>      uint64_t msr_rtit_ctrl;
>      uint64_t msr_rtit_status;
> @@ -1296,6 +1299,7 @@ struct X86CPU {
>      bool hyperv_runtime;
>      bool hyperv_synic;
>      bool hyperv_stimer;
> +    bool hyperv_reenlightenment;
>      bool check_cpuid;
>      bool enforce_cpuid;
>      bool expose_kvm;
> diff --git a/target/i386/hyperv-proto.h b/target/i386/hyperv-proto.h
> index cb4d7f2b7a..93352ebd2a 100644
> --- a/target/i386/hyperv-proto.h
> +++ b/target/i386/hyperv-proto.h
> @@ -35,7 +35,7 @@
>  #define HV_RESET_AVAILABLE           (1u << 7)
>  #define HV_REFERENCE_TSC_AVAILABLE   (1u << 9)
>  #define HV_ACCESS_FREQUENCY_MSRS     (1u << 11)
> -
> +#define HV_ACCESS_REENLIGHTENMENTS_CONTROL  (1u << 13)
>  
>  /*
>   * HV_CPUID_FEATURES.EDX bits
> @@ -129,6 +129,13 @@
>  #define HV_X64_MSR_CRASH_CTL                    0x40000105
>  #define HV_CRASH_CTL_NOTIFY                     (1ull << 63)
>  
> +/*
> + * Reenlightenment notification MSRs
> + */
> +#define HV_X64_MSR_REENLIGHTENMENT_CONTROL      0x40000106
> +#define HV_X64_MSR_TSC_EMULATION_CONTROL        0x40000107
> +#define HV_X64_MSR_TSC_EMULATION_STATUS         0x40000108
> +
>  /*
>   * Hypercall status code
>   */
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index d23fff12f5..7d9f9ca0b1 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -90,6 +90,7 @@ static bool has_msr_hv_runtime;
>  static bool has_msr_hv_synic;
>  static bool has_msr_hv_stimer;
>  static bool has_msr_hv_frequencies;
> +static bool has_msr_hv_reenlightenment;
>  static bool has_msr_xss;
>  static bool has_msr_spec_ctrl;
>  static bool has_msr_smi_count;
> @@ -583,7 +584,8 @@ static bool hyperv_enabled(X86CPU *cpu)
>              cpu->hyperv_vpindex ||
>              cpu->hyperv_runtime ||
>              cpu->hyperv_synic ||
> -            cpu->hyperv_stimer);
> +            cpu->hyperv_stimer ||
> +            cpu->hyperv_reenlightenment);
>  }
>  
>  static int kvm_arch_set_tsc_khz(CPUState *cs)
> @@ -654,6 +656,14 @@ static int hyperv_handle_properties(CPUState *cs)
>              env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
>          }
>      }
> +    if (cpu->hyperv_reenlightenment) {
> +        if (!has_msr_hv_reenlightenment) {
> +            fprintf(stderr,
> +                    "Hyper-V Reenlightenment is not supported by kernel\n");
> +            return -ENOSYS;
> +        }
> +        env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_REENLIGHTENMENTS_CONTROL;
> +    }
>      if (cpu->hyperv_crash && has_msr_hv_crash) {
>          env->features[FEAT_HYPERV_EDX] |= HV_GUEST_CRASH_MSR_AVAILABLE;
>      }
> @@ -1185,6 +1195,9 @@ static int kvm_get_supported_msrs(KVMState *s)
>                  case HV_X64_MSR_TSC_FREQUENCY:
>                      has_msr_hv_frequencies = true;
>                      break;
> +                case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
> +                    has_msr_hv_reenlightenment = true;
> +                    break;
>                  case MSR_IA32_SPEC_CTRL:
>                      has_msr_spec_ctrl = true;
>                      break;
> @@ -1747,6 +1760,15 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>              if (cpu->hyperv_time) {
>                  kvm_msr_entry_add(cpu, HV_X64_MSR_REFERENCE_TSC,
>                                    env->msr_hv_tsc);
> +
> +                if (has_msr_hv_reenlightenment) {
> +                    kvm_msr_entry_add(cpu, HV_X64_MSR_REENLIGHTENMENT_CONTROL,
> +                                      env->msr_hv_reenlightenment_control);
> +                    kvm_msr_entry_add(cpu, HV_X64_MSR_TSC_EMULATION_CONTROL,
> +                                      env->msr_hv_tsc_emulation_control);
> +                    kvm_msr_entry_add(cpu, HV_X64_MSR_TSC_EMULATION_STATUS,
> +                                      env->msr_hv_tsc_emulation_status);
> +                }

I second Eduardo's comment on testing cpu->hyperv_reenlightenment here.

Besides, this hunk suggests that (!cpu->hyperv_time &&
cpu->hyperv_reenlightenment) is illegal.  I think this should be
enforced when enabling the feature.  BTW this also makes the addition to
hyperv_enabled() unnecessary.

Thanks,
Roman.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure
  2018-03-20 17:35 ` [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure Vitaly Kuznetsov
@ 2018-03-21 12:10   ` Roman Kagan
  2018-03-21 13:18     ` Vitaly Kuznetsov
  2018-03-21 15:33   ` Paolo Bonzini
  1 sibling, 1 reply; 25+ messages in thread
From: Roman Kagan @ 2018-03-21 12:10 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	Marcelo Tosatti, qemu-devel

On Tue, Mar 20, 2018 at 06:35:00PM +0100, Vitaly Kuznetsov wrote:
> Requiring tsc_is_stable_and_known() is too restrictive: even without INVTCS
> nested Hyper-V-on-KVM enables TSC pages for its guests e.g. when
> Reenlightenment MSRs are present. Presence of frequency MSRs doesn't mean
> these frequencies are stable, it just means they're available for reading.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  target/i386/kvm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index 7d9f9ca0b1..74fc3d3b2c 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -651,7 +651,7 @@ static int hyperv_handle_properties(CPUState *cs)
>          env->features[FEAT_HYPERV_EAX] |= HV_TIME_REF_COUNT_AVAILABLE;
>          env->features[FEAT_HYPERV_EAX] |= HV_REFERENCE_TSC_AVAILABLE;
>  
> -        if (has_msr_hv_frequencies && tsc_is_stable_and_known(env)) {
> +        if (has_msr_hv_frequencies && env->tsc_khz) {
>              env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_FREQUENCY_MSRS;
>              env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
>          }

I suggest that we add a corresponding cpu property here, too.  The guest
may legitimately rely on these msrs when it sees the support in CPUID,
and migrating from a kernel with the feature supported (4.14+) to an
older one will make it crash.

Roman.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/2] i386/kvm: add support for Hyper-V reenlightenment MSRs
  2018-03-20 18:32   ` Eduardo Habkost
  2018-03-21 11:09     ` Roman Kagan
@ 2018-03-21 13:09     ` Roman Kagan
  1 sibling, 0 replies; 25+ messages in thread
From: Roman Kagan @ 2018-03-21 13:09 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Vitaly Kuznetsov, Paolo Bonzini, Richard Henderson,
	Marcelo Tosatti, qemu-devel

On Tue, Mar 20, 2018 at 03:32:27PM -0300, Eduardo Habkost wrote:
> The difference between both approaches is that checking just
> has_msr_hv_* would let a non-cooperating guest prevent itself
> from being migrated to an older host by writing a non-zero value
> to a MSR, even if hyperv support was not enabled in the VM
> configuration at all.  I don't think we want that.

Thinking some more of what we do want in this regard, I wonder if we're
doing the right thing by not generating #GP on access to MSRs whose
support is present in KVM but disabled via cpu properties in QEMU?

Roman.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure
  2018-03-21 12:10   ` Roman Kagan
@ 2018-03-21 13:18     ` Vitaly Kuznetsov
  2018-03-21 16:57       ` Roman Kagan
  0 siblings, 1 reply; 25+ messages in thread
From: Vitaly Kuznetsov @ 2018-03-21 13:18 UTC (permalink / raw)
  To: Roman Kagan
  Cc: Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	Marcelo Tosatti, qemu-devel

Roman Kagan <rkagan@virtuozzo.com> writes:

> On Tue, Mar 20, 2018 at 06:35:00PM +0100, Vitaly Kuznetsov wrote:
>> Requiring tsc_is_stable_and_known() is too restrictive: even without INVTCS
>> nested Hyper-V-on-KVM enables TSC pages for its guests e.g. when
>> Reenlightenment MSRs are present. Presence of frequency MSRs doesn't mean
>> these frequencies are stable, it just means they're available for reading.
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  target/i386/kvm.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
>> index 7d9f9ca0b1..74fc3d3b2c 100644
>> --- a/target/i386/kvm.c
>> +++ b/target/i386/kvm.c
>> @@ -651,7 +651,7 @@ static int hyperv_handle_properties(CPUState *cs)
>>          env->features[FEAT_HYPERV_EAX] |= HV_TIME_REF_COUNT_AVAILABLE;
>>          env->features[FEAT_HYPERV_EAX] |= HV_REFERENCE_TSC_AVAILABLE;
>>  
>> -        if (has_msr_hv_frequencies && tsc_is_stable_and_known(env)) {
>> +        if (has_msr_hv_frequencies && env->tsc_khz) {
>>              env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_FREQUENCY_MSRS;
>>              env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
>>          }
>
> I suggest that we add a corresponding cpu property here, too.  The guest
> may legitimately rely on these msrs when it sees the support in CPUID,
> and migrating from a kernel with the feature supported (4.14+) to an
> older one will make it crash.
>

This can be arranged, but what happens to people who use these features
today? Assuming they also passed 'invtsc' they have stable TSC page
clocksource already (when Hyper-V role is enabled) but when we start
requesting a new 'hv_frequency' cpu property they'll suddenly lose what
they have...

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure
  2018-03-20 17:35 ` [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure Vitaly Kuznetsov
  2018-03-21 12:10   ` Roman Kagan
@ 2018-03-21 15:33   ` Paolo Bonzini
  2018-03-21 16:17     ` Vitaly Kuznetsov
  2018-03-21 16:47     ` Roman Kagan
  1 sibling, 2 replies; 25+ messages in thread
From: Paolo Bonzini @ 2018-03-21 15:33 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Richard Henderson, Eduardo Habkost, Marcelo Tosatti, qemu-devel,
	Roman Kagan

On 20/03/2018 18:35, Vitaly Kuznetsov wrote:
> +        if (has_msr_hv_frequencies && env->tsc_khz) {
>              env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_FREQUENCY_MSRS;
>              env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
>          }

Since you have added cpu->hyperv_reenlightenment, I'd rather change this
so that we don't make the "license to change guest ABI across migration"
apply more.  We can exploit the fact that Windows doesn't even use the
MSRs unless either invtsc or re-enlightenment is present.  Something
like this:

       if (has_msr_hv_frequencies && env->tsc_khz &&
	   (tsc_is_stable_and_known(env) ||
            cpu->hyperv_reenlightenment))

will make the MSRs visible in all useful cases, without having to add
yet another knob.

(Don't worry, this backwards-compatibility stuff is the hardest part.
I'm so happy that Eduardo is the one maintaining it :)).

Paolo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure
  2018-03-21 15:33   ` Paolo Bonzini
@ 2018-03-21 16:17     ` Vitaly Kuznetsov
  2018-03-21 17:17       ` Roman Kagan
  2018-03-21 16:47     ` Roman Kagan
  1 sibling, 1 reply; 25+ messages in thread
From: Vitaly Kuznetsov @ 2018-03-21 16:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Richard Henderson, Eduardo Habkost, Marcelo Tosatti, qemu-devel,
	Roman Kagan

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 20/03/2018 18:35, Vitaly Kuznetsov wrote:
>> +        if (has_msr_hv_frequencies && env->tsc_khz) {
>>              env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_FREQUENCY_MSRS;
>>              env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
>>          }
>
> Since you have added cpu->hyperv_reenlightenment, I'd rather change this
> so that we don't make the "license to change guest ABI across migration"
> apply more.  We can exploit the fact that Windows doesn't even use the
> MSRs unless either invtsc or re-enlightenment is present.  Something
> like this:
>
>        if (has_msr_hv_frequencies && env->tsc_khz &&
> 	   (tsc_is_stable_and_known(env) ||
>             cpu->hyperv_reenlightenment))
>
> will make the MSRs visible in all useful cases, without having to add
> yet another knob.
>

Can be arranged, of course.

(What I'm worried about with all our hv_* knobs is that more of them we
have easier it is to assemble some frankenstien which won't look like
any existing Hyper-V version; we're probably not doing a very good job
tesing all possible hv_* combinations as people probably use 'all or
nothing'. In case we end up finding a bug in Windows with some weird
hv_* combination it's unlikely Microsoft will bother fixing at as it
doesn't reproduce on any existent Hyper-V version.

That said, it would be great to eventually have something like
'hv_ws2012r2' property making us look exactly the same real WS2012R2
looks like. Unfortunatelly, I'm unsure about a path to get there).

> (Don't worry, this backwards-compatibility stuff is the hardest part.
> I'm so happy that Eduardo is the one maintaining it :)).

I feel the pain :-) Thanks for the reviews!

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure
  2018-03-21 15:33   ` Paolo Bonzini
  2018-03-21 16:17     ` Vitaly Kuznetsov
@ 2018-03-21 16:47     ` Roman Kagan
  1 sibling, 0 replies; 25+ messages in thread
From: Roman Kagan @ 2018-03-21 16:47 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vitaly Kuznetsov, Richard Henderson, Eduardo Habkost,
	Marcelo Tosatti, qemu-devel

On Wed, Mar 21, 2018 at 04:33:33PM +0100, Paolo Bonzini wrote:
> On 20/03/2018 18:35, Vitaly Kuznetsov wrote:
> > +        if (has_msr_hv_frequencies && env->tsc_khz) {
> >              env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_FREQUENCY_MSRS;
> >              env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
> >          }
> 
> Since you have added cpu->hyperv_reenlightenment, I'd rather change this
> so that we don't make the "license to change guest ABI across migration"
> apply more.  We can exploit the fact that Windows doesn't even use the
> MSRs unless either invtsc or re-enlightenment is present.

That's not a given.  It reportedly doesn't use these MSRs *now*, but I
see no reason for it not to start using it at some point, say, to avoid
TSC calibration.  E.g. Linux already does so with its hv_get_tsc_khz().

And the bad thing is that it'll kill your guest with #GP when you
migrate from a new KVM to an old one.

Roman.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure
  2018-03-21 13:18     ` Vitaly Kuznetsov
@ 2018-03-21 16:57       ` Roman Kagan
  2018-03-21 20:19         ` Eduardo Habkost
  0 siblings, 1 reply; 25+ messages in thread
From: Roman Kagan @ 2018-03-21 16:57 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	Marcelo Tosatti, qemu-devel

On Wed, Mar 21, 2018 at 02:18:54PM +0100, Vitaly Kuznetsov wrote:
> Roman Kagan <rkagan@virtuozzo.com> writes:
> 
> > On Tue, Mar 20, 2018 at 06:35:00PM +0100, Vitaly Kuznetsov wrote:
> >> Requiring tsc_is_stable_and_known() is too restrictive: even without INVTCS
> >> nested Hyper-V-on-KVM enables TSC pages for its guests e.g. when
> >> Reenlightenment MSRs are present. Presence of frequency MSRs doesn't mean
> >> these frequencies are stable, it just means they're available for reading.
> >> 
> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> >> ---
> >>  target/i386/kvm.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >> 
> >> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> >> index 7d9f9ca0b1..74fc3d3b2c 100644
> >> --- a/target/i386/kvm.c
> >> +++ b/target/i386/kvm.c
> >> @@ -651,7 +651,7 @@ static int hyperv_handle_properties(CPUState *cs)
> >>          env->features[FEAT_HYPERV_EAX] |= HV_TIME_REF_COUNT_AVAILABLE;
> >>          env->features[FEAT_HYPERV_EAX] |= HV_REFERENCE_TSC_AVAILABLE;
> >>  
> >> -        if (has_msr_hv_frequencies && tsc_is_stable_and_known(env)) {
> >> +        if (has_msr_hv_frequencies && env->tsc_khz) {
> >>              env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_FREQUENCY_MSRS;
> >>              env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
> >>          }
> >
> > I suggest that we add a corresponding cpu property here, too.  The guest
> > may legitimately rely on these msrs when it sees the support in CPUID,
> > and migrating from a kernel with the feature supported (4.14+) to an
> > older one will make it crash.
> >
> 
> This can be arranged, but what happens to people who use these features
> today? Assuming they also passed 'invtsc' they have stable TSC page
> clocksource already (when Hyper-V role is enabled) but when we start
> requesting a new 'hv_frequency' cpu property they'll suddenly lose what
> they have...

I see two cases here:

1) people start a new VM, and discover that their old configuration is
   not enough for this feature to work.

   They need to reconfigure and restart the VM.  This costs them some
   time investigating and restarting, but not data.

2) people migrate from a QEMU without ->hv_frequency, to a new one with
   ->hv_frequency=off (assuming on both ends KVM supports the frequency
   MSRs).

   With the current implementation in KVM, this will only result in the
   feature bits disappearing from the respective CPUID leaf, but the
   MSRs themselves will continue working as they used to.  So the guest
   either won't notice or will check the CPUID and adjust.


Am I missing anything?

Roman.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure
  2018-03-21 16:17     ` Vitaly Kuznetsov
@ 2018-03-21 17:17       ` Roman Kagan
  2018-03-21 20:06         ` Eduardo Habkost
  0 siblings, 1 reply; 25+ messages in thread
From: Roman Kagan @ 2018-03-21 17:17 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	Marcelo Tosatti, qemu-devel

On Wed, Mar 21, 2018 at 05:17:38PM +0100, Vitaly Kuznetsov wrote:
> (What I'm worried about with all our hv_* knobs is that more of them we
> have easier it is to assemble some frankenstien which won't look like
> any existing Hyper-V version; we're probably not doing a very good job
> tesing all possible hv_* combinations as people probably use 'all or
> nothing'. In case we end up finding a bug in Windows with some weird
> hv_* combination it's unlikely Microsoft will bother fixing at as it
> doesn't reproduce on any existent Hyper-V version.

I agree that this is getting cumbersome, but, given that features get
added incrementally and we need to be able to maintain backwards
compatibility, I'm afraid this is unavoidable.

> That said, it would be great to eventually have something like
> 'hv_ws2012r2' property making us look exactly the same real WS2012R2
> looks like. Unfortunatelly, I'm unsure about a path to get there).

I'm tempted to delegate this -- combining features into user-friendly
sets -- to the upper layers: libvirt or even something on top of it.

Roman.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure
  2018-03-21 17:17       ` Roman Kagan
@ 2018-03-21 20:06         ` Eduardo Habkost
  0 siblings, 0 replies; 25+ messages in thread
From: Eduardo Habkost @ 2018-03-21 20:06 UTC (permalink / raw)
  To: Roman Kagan, Vitaly Kuznetsov, Paolo Bonzini, Richard Henderson,
	Marcelo Tosatti, qemu-devel

On Wed, Mar 21, 2018 at 08:17:55PM +0300, Roman Kagan wrote:
> On Wed, Mar 21, 2018 at 05:17:38PM +0100, Vitaly Kuznetsov wrote:
> > (What I'm worried about with all our hv_* knobs is that more of them we
> > have easier it is to assemble some frankenstien which won't look like
> > any existing Hyper-V version; we're probably not doing a very good job
> > tesing all possible hv_* combinations as people probably use 'all or
> > nothing'. In case we end up finding a bug in Windows with some weird
> > hv_* combination it's unlikely Microsoft will bother fixing at as it
> > doesn't reproduce on any existent Hyper-V version.
> 
> I agree that this is getting cumbersome, but, given that features get
> added incrementally and we need to be able to maintain backwards
> compatibility, I'm afraid this is unavoidable.
> 
> > That said, it would be great to eventually have something like
> > 'hv_ws2012r2' property making us look exactly the same real WS2012R2
> > looks like. Unfortunatelly, I'm unsure about a path to get there).
> 
> I'm tempted to delegate this -- combining features into user-friendly
> sets -- to the upper layers: libvirt or even something on top of it.

Sounds simpler to me, otherwise we would need a mechanism to tell
the upper layers which of those named are usable on the current
host.

-- 
Eduardo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure
  2018-03-21 16:57       ` Roman Kagan
@ 2018-03-21 20:19         ` Eduardo Habkost
  2018-03-22 13:00           ` Roman Kagan
  0 siblings, 1 reply; 25+ messages in thread
From: Eduardo Habkost @ 2018-03-21 20:19 UTC (permalink / raw)
  To: Roman Kagan, Vitaly Kuznetsov, Paolo Bonzini, Richard Henderson,
	Marcelo Tosatti, qemu-devel

On Wed, Mar 21, 2018 at 07:57:29PM +0300, Roman Kagan wrote:
> On Wed, Mar 21, 2018 at 02:18:54PM +0100, Vitaly Kuznetsov wrote:
> > Roman Kagan <rkagan@virtuozzo.com> writes:
> > 
> > > On Tue, Mar 20, 2018 at 06:35:00PM +0100, Vitaly Kuznetsov wrote:
> > >> Requiring tsc_is_stable_and_known() is too restrictive: even without INVTCS
> > >> nested Hyper-V-on-KVM enables TSC pages for its guests e.g. when
> > >> Reenlightenment MSRs are present. Presence of frequency MSRs doesn't mean
> > >> these frequencies are stable, it just means they're available for reading.
> > >> 
> > >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > >> ---
> > >>  target/i386/kvm.c | 2 +-
> > >>  1 file changed, 1 insertion(+), 1 deletion(-)
> > >> 
> > >> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> > >> index 7d9f9ca0b1..74fc3d3b2c 100644
> > >> --- a/target/i386/kvm.c
> > >> +++ b/target/i386/kvm.c
> > >> @@ -651,7 +651,7 @@ static int hyperv_handle_properties(CPUState *cs)
> > >>          env->features[FEAT_HYPERV_EAX] |= HV_TIME_REF_COUNT_AVAILABLE;
> > >>          env->features[FEAT_HYPERV_EAX] |= HV_REFERENCE_TSC_AVAILABLE;
> > >>  
> > >> -        if (has_msr_hv_frequencies && tsc_is_stable_and_known(env)) {
> > >> +        if (has_msr_hv_frequencies && env->tsc_khz) {
> > >>              env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_FREQUENCY_MSRS;
> > >>              env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
> > >>          }
> > >
> > > I suggest that we add a corresponding cpu property here, too.  The guest
> > > may legitimately rely on these msrs when it sees the support in CPUID,
> > > and migrating from a kernel with the feature supported (4.14+) to an
> > > older one will make it crash.
> > >
> > 
> > This can be arranged, but what happens to people who use these features
> > today? Assuming they also passed 'invtsc' they have stable TSC page
> > clocksource already (when Hyper-V role is enabled) but when we start
> > requesting a new 'hv_frequency' cpu property they'll suddenly lose what
> > they have...
> 
> I see two cases here:
> 
> 1) people start a new VM, and discover that their old configuration is
>    not enough for this feature to work.
> 
>    They need to reconfigure and restart the VM.  This costs them some
>    time investigating and restarting, but not data.

If we keep machine-type compatibility, people will need to do
that only if they change the machine-type (or use the "pc" or
"q35" aliases).  If they copy the old configuration, it will keep
working.

machine-type compatibility also makes the following case a bit
safer:

> 
> 2) people migrate from a QEMU without ->hv_frequency, to a new one with
>    ->hv_frequency=off (assuming on both ends KVM supports the frequency
>    MSRs).
> 
>    With the current implementation in KVM, this will only result in the
>    feature bits disappearing from the respective CPUID leaf, but the
>    MSRs themselves will continue working as they used to.  So the guest
>    either won't notice or will check the CPUID and adjust.

If we keep machine-type compatibility, the CPUID bit won't
disappear for the guest while the MSRs keep working.


Whichever solution we choose, we can still have guests crashing
if migrating a pc-2.11 machine from a 4.14+ host kernel to a host
with an older kernel.  But I don't think there's a way out of
this, except requiring an explicit "hv-frequencies" CPU option on
newer machine-types.

-- 
Eduardo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure
  2018-03-21 20:19         ` Eduardo Habkost
@ 2018-03-22 13:00           ` Roman Kagan
  2018-03-22 13:22             ` Eduardo Habkost
  0 siblings, 1 reply; 25+ messages in thread
From: Roman Kagan @ 2018-03-22 13:00 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Vitaly Kuznetsov, Paolo Bonzini, Richard Henderson,
	Marcelo Tosatti, qemu-devel

On Wed, Mar 21, 2018 at 05:19:24PM -0300, Eduardo Habkost wrote:
> On Wed, Mar 21, 2018 at 07:57:29PM +0300, Roman Kagan wrote:
> > On Wed, Mar 21, 2018 at 02:18:54PM +0100, Vitaly Kuznetsov wrote:
> > > Roman Kagan <rkagan@virtuozzo.com> writes:
> > > 
> > > > On Tue, Mar 20, 2018 at 06:35:00PM +0100, Vitaly Kuznetsov wrote:
> > > >> Requiring tsc_is_stable_and_known() is too restrictive: even without INVTCS
> > > >> nested Hyper-V-on-KVM enables TSC pages for its guests e.g. when
> > > >> Reenlightenment MSRs are present. Presence of frequency MSRs doesn't mean
> > > >> these frequencies are stable, it just means they're available for reading.
> > > >> 
> > > >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > > >> ---
> > > >>  target/i386/kvm.c | 2 +-
> > > >>  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >> 
> > > >> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> > > >> index 7d9f9ca0b1..74fc3d3b2c 100644
> > > >> --- a/target/i386/kvm.c
> > > >> +++ b/target/i386/kvm.c
> > > >> @@ -651,7 +651,7 @@ static int hyperv_handle_properties(CPUState *cs)
> > > >>          env->features[FEAT_HYPERV_EAX] |= HV_TIME_REF_COUNT_AVAILABLE;
> > > >>          env->features[FEAT_HYPERV_EAX] |= HV_REFERENCE_TSC_AVAILABLE;
> > > >>  
> > > >> -        if (has_msr_hv_frequencies && tsc_is_stable_and_known(env)) {
> > > >> +        if (has_msr_hv_frequencies && env->tsc_khz) {
> > > >>              env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_FREQUENCY_MSRS;
> > > >>              env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
> > > >>          }
> > > >
> > > > I suggest that we add a corresponding cpu property here, too.  The guest
> > > > may legitimately rely on these msrs when it sees the support in CPUID,
> > > > and migrating from a kernel with the feature supported (4.14+) to an
> > > > older one will make it crash.
> > > >
> > > 
> > > This can be arranged, but what happens to people who use these features
> > > today? Assuming they also passed 'invtsc' they have stable TSC page
> > > clocksource already (when Hyper-V role is enabled) but when we start
> > > requesting a new 'hv_frequency' cpu property they'll suddenly lose what
> > > they have...
> > 
> > I see two cases here:
> > 
> > 1) people start a new VM, and discover that their old configuration is
> >    not enough for this feature to work.
> > 
> >    They need to reconfigure and restart the VM.  This costs them some
> >    time investigating and restarting, but not data.
> 
> If we keep machine-type compatibility, people will need to do
> that only if they change the machine-type (or use the "pc" or
> "q35" aliases).  If they copy the old configuration, it will keep
> working.

The problem is that the feature is not fixed by the machine-type, due to
the forgotten property: it only depends on the KVM version.  So, once
(if) we add the property and make the feature deterministic, we'll lose
compatibility one way or another.

Or are you suggesting that for pre-2.12 machine types we leave the
property at "decided by your KVM" state?

> 
> machine-type compatibility also makes the following case a bit
> safer:
> 
> > 
> > 2) people migrate from a QEMU without ->hv_frequency, to a new one with
> >    ->hv_frequency=off (assuming on both ends KVM supports the frequency
> >    MSRs).
> > 
> >    With the current implementation in KVM, this will only result in the
> >    feature bits disappearing from the respective CPUID leaf, but the
> >    MSRs themselves will continue working as they used to.  So the guest
> >    either won't notice or will check the CPUID and adjust.
> 
> If we keep machine-type compatibility, the CPUID bit won't
> disappear for the guest while the MSRs keep working.
> 
> 
> Whichever solution we choose, we can still have guests crashing
> if migrating a pc-2.11 machine from a 4.14+ host kernel to a host
> with an older kernel.  But I don't think there's a way out of
> this, except requiring an explicit "hv-frequencies" CPU option on
> newer machine-types.

What's wrong with requiring it, as we do for all other hv_* properties?

Roman.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure
  2018-03-22 13:00           ` Roman Kagan
@ 2018-03-22 13:22             ` Eduardo Habkost
  2018-03-22 13:58               ` Roman Kagan
  0 siblings, 1 reply; 25+ messages in thread
From: Eduardo Habkost @ 2018-03-22 13:22 UTC (permalink / raw)
  To: Roman Kagan, Vitaly Kuznetsov, Paolo Bonzini, Richard Henderson,
	Marcelo Tosatti, qemu-devel

On Thu, Mar 22, 2018 at 04:00:14PM +0300, Roman Kagan wrote:
> On Wed, Mar 21, 2018 at 05:19:24PM -0300, Eduardo Habkost wrote:
> > On Wed, Mar 21, 2018 at 07:57:29PM +0300, Roman Kagan wrote:
> > > On Wed, Mar 21, 2018 at 02:18:54PM +0100, Vitaly Kuznetsov wrote:
> > > > Roman Kagan <rkagan@virtuozzo.com> writes:
> > > > 
> > > > > On Tue, Mar 20, 2018 at 06:35:00PM +0100, Vitaly Kuznetsov wrote:
> > > > >> Requiring tsc_is_stable_and_known() is too restrictive: even without INVTCS
> > > > >> nested Hyper-V-on-KVM enables TSC pages for its guests e.g. when
> > > > >> Reenlightenment MSRs are present. Presence of frequency MSRs doesn't mean
> > > > >> these frequencies are stable, it just means they're available for reading.
> > > > >> 
> > > > >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > > > >> ---
> > > > >>  target/i386/kvm.c | 2 +-
> > > > >>  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >> 
> > > > >> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> > > > >> index 7d9f9ca0b1..74fc3d3b2c 100644
> > > > >> --- a/target/i386/kvm.c
> > > > >> +++ b/target/i386/kvm.c
> > > > >> @@ -651,7 +651,7 @@ static int hyperv_handle_properties(CPUState *cs)
> > > > >>          env->features[FEAT_HYPERV_EAX] |= HV_TIME_REF_COUNT_AVAILABLE;
> > > > >>          env->features[FEAT_HYPERV_EAX] |= HV_REFERENCE_TSC_AVAILABLE;
> > > > >>  
> > > > >> -        if (has_msr_hv_frequencies && tsc_is_stable_and_known(env)) {
> > > > >> +        if (has_msr_hv_frequencies && env->tsc_khz) {
> > > > >>              env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_FREQUENCY_MSRS;
> > > > >>              env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
> > > > >>          }
> > > > >
> > > > > I suggest that we add a corresponding cpu property here, too.  The guest
> > > > > may legitimately rely on these msrs when it sees the support in CPUID,
> > > > > and migrating from a kernel with the feature supported (4.14+) to an
> > > > > older one will make it crash.
> > > > >
> > > > 
> > > > This can be arranged, but what happens to people who use these features
> > > > today? Assuming they also passed 'invtsc' they have stable TSC page
> > > > clocksource already (when Hyper-V role is enabled) but when we start
> > > > requesting a new 'hv_frequency' cpu property they'll suddenly lose what
> > > > they have...
> > > 
> > > I see two cases here:
> > > 
> > > 1) people start a new VM, and discover that their old configuration is
> > >    not enough for this feature to work.
> > > 
> > >    They need to reconfigure and restart the VM.  This costs them some
> > >    time investigating and restarting, but not data.
> > 
> > If we keep machine-type compatibility, people will need to do
> > that only if they change the machine-type (or use the "pc" or
> > "q35" aliases).  If they copy the old configuration, it will keep
> > working.
> 
> The problem is that the feature is not fixed by the machine-type, due to
> the forgotten property: it only depends on the KVM version.  So, once
> (if) we add the property and make the feature deterministic, we'll lose
> compatibility one way or another.
> 
> Or are you suggesting that for pre-2.12 machine types we leave the
> property at "decided by your KVM" state?

Yes, that's what I mean.  This looks like the only way to avoid
losing features by just cold-rebooting an existing VM.

The scenario I'm thinking is this:

1) pc-2.11 VM started on host running QEMU 2.11
2) VM migrated to a host containing this patch
3) 1 year later, the VM is shut down and booted again.
4) Things stop working inside the VM because hv-frequency is
   unexpectedly gone.

Machine-type compatibility code would avoid (4).


> 
> > 
> > machine-type compatibility also makes the following case a bit
> > safer:
> > 
> > > 
> > > 2) people migrate from a QEMU without ->hv_frequency, to a new one with
> > >    ->hv_frequency=off (assuming on both ends KVM supports the frequency
> > >    MSRs).
> > > 
> > >    With the current implementation in KVM, this will only result in the
> > >    feature bits disappearing from the respective CPUID leaf, but the
> > >    MSRs themselves will continue working as they used to.  So the guest
> > >    either won't notice or will check the CPUID and adjust.
> > 
> > If we keep machine-type compatibility, the CPUID bit won't
> > disappear for the guest while the MSRs keep working.
> > 
> > 
> > Whichever solution we choose, we can still have guests crashing
> > if migrating a pc-2.11 machine from a 4.14+ host kernel to a host
> > with an older kernel.  But I don't think there's a way out of
> > this, except requiring an explicit "hv-frequencies" CPU option on
> > newer machine-types.
> 
> What's wrong with requiring it, as we do for all other hv_* properties?

On new machine-types, nothing wrong.

On existing machine-types, see above.

-- 
Eduardo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure
  2018-03-22 13:22             ` Eduardo Habkost
@ 2018-03-22 13:58               ` Roman Kagan
  2018-03-22 18:38                 ` Eduardo Habkost
  0 siblings, 1 reply; 25+ messages in thread
From: Roman Kagan @ 2018-03-22 13:58 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Vitaly Kuznetsov, Paolo Bonzini, Richard Henderson,
	Marcelo Tosatti, qemu-devel

On Thu, Mar 22, 2018 at 10:22:18AM -0300, Eduardo Habkost wrote:
> On Thu, Mar 22, 2018 at 04:00:14PM +0300, Roman Kagan wrote:
> > On Wed, Mar 21, 2018 at 05:19:24PM -0300, Eduardo Habkost wrote:
> > > On Wed, Mar 21, 2018 at 07:57:29PM +0300, Roman Kagan wrote:
> > > > On Wed, Mar 21, 2018 at 02:18:54PM +0100, Vitaly Kuznetsov wrote:
> > > > > Roman Kagan <rkagan@virtuozzo.com> writes:
> > > > > 
> > > > > > On Tue, Mar 20, 2018 at 06:35:00PM +0100, Vitaly Kuznetsov wrote:
> > > > > >> Requiring tsc_is_stable_and_known() is too restrictive: even without INVTCS
> > > > > >> nested Hyper-V-on-KVM enables TSC pages for its guests e.g. when
> > > > > >> Reenlightenment MSRs are present. Presence of frequency MSRs doesn't mean
> > > > > >> these frequencies are stable, it just means they're available for reading.
> > > > > >> 
> > > > > >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > > > > >> ---
> > > > > >>  target/i386/kvm.c | 2 +-
> > > > > >>  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > >> 
> > > > > >> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> > > > > >> index 7d9f9ca0b1..74fc3d3b2c 100644
> > > > > >> --- a/target/i386/kvm.c
> > > > > >> +++ b/target/i386/kvm.c
> > > > > >> @@ -651,7 +651,7 @@ static int hyperv_handle_properties(CPUState *cs)
> > > > > >>          env->features[FEAT_HYPERV_EAX] |= HV_TIME_REF_COUNT_AVAILABLE;
> > > > > >>          env->features[FEAT_HYPERV_EAX] |= HV_REFERENCE_TSC_AVAILABLE;
> > > > > >>  
> > > > > >> -        if (has_msr_hv_frequencies && tsc_is_stable_and_known(env)) {
> > > > > >> +        if (has_msr_hv_frequencies && env->tsc_khz) {
> > > > > >>              env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_FREQUENCY_MSRS;
> > > > > >>              env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
> > > > > >>          }
> > > > > >
> > > > > > I suggest that we add a corresponding cpu property here, too.  The guest
> > > > > > may legitimately rely on these msrs when it sees the support in CPUID,
> > > > > > and migrating from a kernel with the feature supported (4.14+) to an
> > > > > > older one will make it crash.
> > > > > >
> > > > > 
> > > > > This can be arranged, but what happens to people who use these features
> > > > > today? Assuming they also passed 'invtsc' they have stable TSC page
> > > > > clocksource already (when Hyper-V role is enabled) but when we start
> > > > > requesting a new 'hv_frequency' cpu property they'll suddenly lose what
> > > > > they have...
> > > > 
> > > > I see two cases here:
> > > > 
> > > > 1) people start a new VM, and discover that their old configuration is
> > > >    not enough for this feature to work.
> > > > 
> > > >    They need to reconfigure and restart the VM.  This costs them some
> > > >    time investigating and restarting, but not data.
> > > 
> > > If we keep machine-type compatibility, people will need to do
> > > that only if they change the machine-type (or use the "pc" or
> > > "q35" aliases).  If they copy the old configuration, it will keep
> > > working.
> > 
> > The problem is that the feature is not fixed by the machine-type, due to
> > the forgotten property: it only depends on the KVM version.  So, once
> > (if) we add the property and make the feature deterministic, we'll lose
> > compatibility one way or another.
> > 
> > Or are you suggesting that for pre-2.12 machine types we leave the
> > property at "decided by your KVM" state?
> 
> Yes, that's what I mean.  This looks like the only way to avoid
> losing features by just cold-rebooting an existing VM.
> 
> The scenario I'm thinking is this:
> 
> 1) pc-2.11 VM started on host running QEMU 2.11
> 2) VM migrated to a host containing this patch
> 3) 1 year later, the VM is shut down and booted again.
> 4) Things stop working inside the VM because hv-frequency is
>    unexpectedly gone.
> 
> Machine-type compatibility code would avoid (4).

Right, but (4) typically means that you fail to start your workload from
a clean state, so you just go and fix it; no data is lost.

Compare this to a migration to an older KVM which results in your guest
crashing, where you risk data loss and still have to meddle with
configs.

> > > machine-type compatibility also makes the following case a bit
> > > safer:
> > > 
> > > > 
> > > > 2) people migrate from a QEMU without ->hv_frequency, to a new one with
> > > >    ->hv_frequency=off (assuming on both ends KVM supports the frequency
> > > >    MSRs).
> > > > 
> > > >    With the current implementation in KVM, this will only result in the
> > > >    feature bits disappearing from the respective CPUID leaf, but the
> > > >    MSRs themselves will continue working as they used to.  So the guest
> > > >    either won't notice or will check the CPUID and adjust.
> > > 
> > > If we keep machine-type compatibility, the CPUID bit won't
> > > disappear for the guest while the MSRs keep working.
> > > 
> > > 
> > > Whichever solution we choose, we can still have guests crashing
> > > if migrating a pc-2.11 machine from a 4.14+ host kernel to a host
> > > with an older kernel.  But I don't think there's a way out of
> > > this, except requiring an explicit "hv-frequencies" CPU option on
> > > newer machine-types.
> > 
> > What's wrong with requiring it, as we do for all other hv_* properties?
> 
> On new machine-types, nothing wrong.
> 
> On existing machine-types, see above.

I wonder if the following can cater to all relevant cases:

- hv_frequencies property is added, defaulting to "off", so that new
  users of this feature would need to explicitly turn it on;

- on pre-2.12 machine types, it's set to the value of hv_time property
  by the compat code, so that on VMs where this feature could
  potentially be present it would become required; as a result, these
  configurations will refuse to start on insufficiently capable KVM,
  preventing the migration attempts.

Am I missing any scenarios that aren't covered?

Thanks,
Roman.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/2] i386/kvm: add support for Hyper-V reenlightenment MSRs
  2018-03-20 17:34 ` [Qemu-devel] [PATCH v3 1/2] i386/kvm: add support for Hyper-V reenlightenment MSRs Vitaly Kuznetsov
  2018-03-20 18:32   ` Eduardo Habkost
  2018-03-21 11:24   ` Roman Kagan
@ 2018-03-22 17:09   ` Marcelo Tosatti
  2018-03-22 17:39     ` Vitaly Kuznetsov
  2 siblings, 1 reply; 25+ messages in thread
From: Marcelo Tosatti @ 2018-03-22 17:09 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Richard Henderson, Eduardo Habkost, qemu-devel,
	Roman Kagan

On Tue, Mar 20, 2018 at 06:34:59PM +0100, Vitaly Kuznetsov wrote:
> KVM recently gained support for Hyper-V Reenlightenment MSRs which are
> required to make KVM-on-Hyper-V enable TSC page clocksource to its guests
> when INVTSC is not passed to it (and it is not passed by default in Qemu
> as it effectively blocks migration).

Hi Vitaly,

>From Microsoft's documentation:

"An L1 hypervisor can request to be notified when its partition is
migrated. This capability is enumerated in CPUID as
AccessReenlightenmentControls privilege (see 2.4.10)."

The L0 hypervisor exposes a synthetic MSR
(HV_X64_MSR_REENLIGHTENMENT_CONTROL) that may be used by the L1
hypervisor to configure an interrupt vector and target processor. The L0
hypervisor will inject an interrupt with the specified vector after each
migration.

What prevents a guest from setting the enable bit, and expect
to receive an interrupt, if the reenlightenment MSRs are exposed ?

> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
> Changes since v2:
> - add hv-reenlightenment CPU property [Roman Kagan, Paolo Bonzini]
> - add a comment to feature_word_info [Roman Kagan]
> ---
>  target/i386/cpu.c          |  4 +++-
>  target/i386/cpu.h          |  4 ++++
>  target/i386/hyperv-proto.h |  9 ++++++++-
>  target/i386/kvm.c          | 39 ++++++++++++++++++++++++++++++++++++++-
>  target/i386/machine.c      | 24 ++++++++++++++++++++++++
>  5 files changed, 77 insertions(+), 3 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 6bb4ce8719..02579f8234 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -407,7 +407,8 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
>              NULL /* hv_vpindex_access */, NULL /* hv_msr_reset_access */,
>              NULL /* hv_msr_stats_access */, NULL /* hv_reftsc_access */,
>              NULL /* hv_msr_idle_access */, NULL /* hv_msr_frequency_access */,
> -            NULL, NULL, NULL, NULL,
> +            NULL /* hv_msr_debug_access */, NULL /* hv_msr_reenlightenment_access */,
> +            NULL, NULL,
>              NULL, NULL, NULL, NULL,
>              NULL, NULL, NULL, NULL,
>              NULL, NULL, NULL, NULL,
> @@ -4764,6 +4765,7 @@ static Property x86_cpu_properties[] = {
>      DEFINE_PROP_BOOL("hv-runtime", X86CPU, hyperv_runtime, false),
>      DEFINE_PROP_BOOL("hv-synic", X86CPU, hyperv_synic, false),
>      DEFINE_PROP_BOOL("hv-stimer", X86CPU, hyperv_stimer, false),
> +    DEFINE_PROP_BOOL("hv-reenlightenment", X86CPU, hyperv_reenlightenment, false),
>      DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),
>      DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
>      DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 2e2bab5ff3..98eed72937 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1174,6 +1174,9 @@ typedef struct CPUX86State {
>      uint64_t msr_hv_synic_sint[HV_SINT_COUNT];
>      uint64_t msr_hv_stimer_config[HV_STIMER_COUNT];
>      uint64_t msr_hv_stimer_count[HV_STIMER_COUNT];
> +    uint64_t msr_hv_reenlightenment_control;
> +    uint64_t msr_hv_tsc_emulation_control;
> +    uint64_t msr_hv_tsc_emulation_status;
>  
>      uint64_t msr_rtit_ctrl;
>      uint64_t msr_rtit_status;
> @@ -1296,6 +1299,7 @@ struct X86CPU {
>      bool hyperv_runtime;
>      bool hyperv_synic;
>      bool hyperv_stimer;
> +    bool hyperv_reenlightenment;
>      bool check_cpuid;
>      bool enforce_cpuid;
>      bool expose_kvm;
> diff --git a/target/i386/hyperv-proto.h b/target/i386/hyperv-proto.h
> index cb4d7f2b7a..93352ebd2a 100644
> --- a/target/i386/hyperv-proto.h
> +++ b/target/i386/hyperv-proto.h
> @@ -35,7 +35,7 @@
>  #define HV_RESET_AVAILABLE           (1u << 7)
>  #define HV_REFERENCE_TSC_AVAILABLE   (1u << 9)
>  #define HV_ACCESS_FREQUENCY_MSRS     (1u << 11)
> -
> +#define HV_ACCESS_REENLIGHTENMENTS_CONTROL  (1u << 13)
>  
>  /*
>   * HV_CPUID_FEATURES.EDX bits
> @@ -129,6 +129,13 @@
>  #define HV_X64_MSR_CRASH_CTL                    0x40000105
>  #define HV_CRASH_CTL_NOTIFY                     (1ull << 63)
>  
> +/*
> + * Reenlightenment notification MSRs
> + */
> +#define HV_X64_MSR_REENLIGHTENMENT_CONTROL      0x40000106
> +#define HV_X64_MSR_TSC_EMULATION_CONTROL        0x40000107
> +#define HV_X64_MSR_TSC_EMULATION_STATUS         0x40000108
> +
>  /*
>   * Hypercall status code
>   */
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index d23fff12f5..7d9f9ca0b1 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -90,6 +90,7 @@ static bool has_msr_hv_runtime;
>  static bool has_msr_hv_synic;
>  static bool has_msr_hv_stimer;
>  static bool has_msr_hv_frequencies;
> +static bool has_msr_hv_reenlightenment;
>  static bool has_msr_xss;
>  static bool has_msr_spec_ctrl;
>  static bool has_msr_smi_count;
> @@ -583,7 +584,8 @@ static bool hyperv_enabled(X86CPU *cpu)
>              cpu->hyperv_vpindex ||
>              cpu->hyperv_runtime ||
>              cpu->hyperv_synic ||
> -            cpu->hyperv_stimer);
> +            cpu->hyperv_stimer ||
> +            cpu->hyperv_reenlightenment);
>  }
>  
>  static int kvm_arch_set_tsc_khz(CPUState *cs)
> @@ -654,6 +656,14 @@ static int hyperv_handle_properties(CPUState *cs)
>              env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
>          }
>      }
> +    if (cpu->hyperv_reenlightenment) {
> +        if (!has_msr_hv_reenlightenment) {
> +            fprintf(stderr,
> +                    "Hyper-V Reenlightenment is not supported by kernel\n");
> +            return -ENOSYS;
> +        }
> +        env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_REENLIGHTENMENTS_CONTROL;
> +    }
>      if (cpu->hyperv_crash && has_msr_hv_crash) {
>          env->features[FEAT_HYPERV_EDX] |= HV_GUEST_CRASH_MSR_AVAILABLE;
>      }
> @@ -1185,6 +1195,9 @@ static int kvm_get_supported_msrs(KVMState *s)
>                  case HV_X64_MSR_TSC_FREQUENCY:
>                      has_msr_hv_frequencies = true;
>                      break;
> +                case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
> +                    has_msr_hv_reenlightenment = true;
> +                    break;
>                  case MSR_IA32_SPEC_CTRL:
>                      has_msr_spec_ctrl = true;
>                      break;
> @@ -1747,6 +1760,15 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>              if (cpu->hyperv_time) {
>                  kvm_msr_entry_add(cpu, HV_X64_MSR_REFERENCE_TSC,
>                                    env->msr_hv_tsc);
> +
> +                if (has_msr_hv_reenlightenment) {
> +                    kvm_msr_entry_add(cpu, HV_X64_MSR_REENLIGHTENMENT_CONTROL,
> +                                      env->msr_hv_reenlightenment_control);
> +                    kvm_msr_entry_add(cpu, HV_X64_MSR_TSC_EMULATION_CONTROL,
> +                                      env->msr_hv_tsc_emulation_control);
> +                    kvm_msr_entry_add(cpu, HV_X64_MSR_TSC_EMULATION_STATUS,
> +                                      env->msr_hv_tsc_emulation_status);
> +                }
>              }
>          }
>          if (cpu->hyperv_vapic) {
> @@ -2109,6 +2131,12 @@ static int kvm_get_msrs(X86CPU *cpu)
>      }
>      if (cpu->hyperv_time) {
>          kvm_msr_entry_add(cpu, HV_X64_MSR_REFERENCE_TSC, 0);
> +
> +        if (has_msr_hv_reenlightenment) {
> +            kvm_msr_entry_add(cpu, HV_X64_MSR_REENLIGHTENMENT_CONTROL, 0);
> +            kvm_msr_entry_add(cpu, HV_X64_MSR_TSC_EMULATION_CONTROL, 0);
> +            kvm_msr_entry_add(cpu, HV_X64_MSR_TSC_EMULATION_STATUS, 0);
> +        }
>      }
>      if (has_msr_hv_crash) {
>          int j;
> @@ -2367,6 +2395,15 @@ static int kvm_get_msrs(X86CPU *cpu)
>              env->msr_hv_stimer_count[(index - HV_X64_MSR_STIMER0_COUNT)/2] =
>                                  msrs[i].data;
>              break;
> +        case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
> +            env->msr_hv_reenlightenment_control = msrs[i].data;
> +            break;
> +        case HV_X64_MSR_TSC_EMULATION_CONTROL:
> +            env->msr_hv_tsc_emulation_control = msrs[i].data;
> +            break;
> +        case HV_X64_MSR_TSC_EMULATION_STATUS:
> +            env->msr_hv_tsc_emulation_status = msrs[i].data;
> +            break;
>          case MSR_MTRRdefType:
>              env->mtrr_deftype = msrs[i].data;
>              break;
> diff --git a/target/i386/machine.c b/target/i386/machine.c
> index bd2d82e91b..fd99c0bbb4 100644
> --- a/target/i386/machine.c
> +++ b/target/i386/machine.c
> @@ -713,6 +713,29 @@ static const VMStateDescription vmstate_msr_hyperv_stimer = {
>      }
>  };
>  
> +static bool hyperv_reenlightenment_enable_needed(void *opaque)
> +{
> +    X86CPU *cpu = opaque;
> +    CPUX86State *env = &cpu->env;
> +
> +    return env->msr_hv_reenlightenment_control != 0 ||
> +        env->msr_hv_tsc_emulation_control != 0 ||
> +        env->msr_hv_tsc_emulation_status != 0;
> +}
> +
> +static const VMStateDescription vmstate_msr_hyperv_reenlightenment = {
> +    .name = "cpu/msr_hyperv_reenlightenment",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .needed = hyperv_reenlightenment_enable_needed,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_UINT64(env.msr_hv_reenlightenment_control, X86CPU),
> +        VMSTATE_UINT64(env.msr_hv_tsc_emulation_control, X86CPU),
> +        VMSTATE_UINT64(env.msr_hv_tsc_emulation_status, X86CPU),
> +        VMSTATE_END_OF_LIST()
> +    }
> +};
> +
>  static bool avx512_needed(void *opaque)
>  {
>      X86CPU *cpu = opaque;
> @@ -1005,6 +1028,7 @@ VMStateDescription vmstate_x86_cpu = {
>          &vmstate_msr_hyperv_runtime,
>          &vmstate_msr_hyperv_synic,
>          &vmstate_msr_hyperv_stimer,
> +        &vmstate_msr_hyperv_reenlightenment,
>          &vmstate_avx512,
>          &vmstate_xss,
>          &vmstate_tsc_khz,
> -- 
> 2.14.3

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/2] i386/kvm: add support for Hyper-V reenlightenment MSRs
  2018-03-22 17:09   ` Marcelo Tosatti
@ 2018-03-22 17:39     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 25+ messages in thread
From: Vitaly Kuznetsov @ 2018-03-22 17:39 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Paolo Bonzini, Richard Henderson, Eduardo Habkost, qemu-devel,
	Roman Kagan

Marcelo Tosatti <mtosatti@redhat.com> writes:

> On Tue, Mar 20, 2018 at 06:34:59PM +0100, Vitaly Kuznetsov wrote:
>> KVM recently gained support for Hyper-V Reenlightenment MSRs which are
>> required to make KVM-on-Hyper-V enable TSC page clocksource to its guests
>> when INVTSC is not passed to it (and it is not passed by default in Qemu
>> as it effectively blocks migration).
>
> Hi Vitaly,
>
> From Microsoft's documentation:
>
> "An L1 hypervisor can request to be notified when its partition is
> migrated. This capability is enumerated in CPUID as
> AccessReenlightenmentControls privilege (see 2.4.10)."
>
> The L0 hypervisor exposes a synthetic MSR
> (HV_X64_MSR_REENLIGHTENMENT_CONTROL) that may be used by the L1
> hypervisor to configure an interrupt vector and target processor. The L0
> hypervisor will inject an interrupt with the specified vector after each
> migration.
>
> What prevents a guest from setting the enable bit, and expect
> to receive an interrupt, if the reenlightenment MSRs are exposed ?
>

This is actually desired: Hyper-V on KVM will set this bit and expect to
receive an interrupt. Currently, we don't send it because we don't
migrate nested workloads but eventually, when we learn how to do this in
KVM, sending an interrupt and doint TSC access emulation will be required.

Normal Windows on KVM won't use the feature as it doesn't need it: upon
migration we update TSC page in KVM and readings from it stay correct.

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure
  2018-03-22 13:58               ` Roman Kagan
@ 2018-03-22 18:38                 ` Eduardo Habkost
  2018-03-23  9:45                   ` Roman Kagan
  0 siblings, 1 reply; 25+ messages in thread
From: Eduardo Habkost @ 2018-03-22 18:38 UTC (permalink / raw)
  To: Roman Kagan, Vitaly Kuznetsov, Paolo Bonzini, Richard Henderson,
	Marcelo Tosatti, qemu-devel

On Thu, Mar 22, 2018 at 04:58:03PM +0300, Roman Kagan wrote:
> On Thu, Mar 22, 2018 at 10:22:18AM -0300, Eduardo Habkost wrote:
> > On Thu, Mar 22, 2018 at 04:00:14PM +0300, Roman Kagan wrote:
> > > On Wed, Mar 21, 2018 at 05:19:24PM -0300, Eduardo Habkost wrote:
> > > > On Wed, Mar 21, 2018 at 07:57:29PM +0300, Roman Kagan wrote:
> > > > > On Wed, Mar 21, 2018 at 02:18:54PM +0100, Vitaly Kuznetsov wrote:
> > > > > > Roman Kagan <rkagan@virtuozzo.com> writes:
> > > > > > 
> > > > > > > On Tue, Mar 20, 2018 at 06:35:00PM +0100, Vitaly Kuznetsov wrote:
> > > > > > >> Requiring tsc_is_stable_and_known() is too restrictive: even without INVTCS
> > > > > > >> nested Hyper-V-on-KVM enables TSC pages for its guests e.g. when
> > > > > > >> Reenlightenment MSRs are present. Presence of frequency MSRs doesn't mean
> > > > > > >> these frequencies are stable, it just means they're available for reading.
> > > > > > >> 
> > > > > > >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > > > > > >> ---
> > > > > > >>  target/i386/kvm.c | 2 +-
> > > > > > >>  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > >> 
> > > > > > >> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> > > > > > >> index 7d9f9ca0b1..74fc3d3b2c 100644
> > > > > > >> --- a/target/i386/kvm.c
> > > > > > >> +++ b/target/i386/kvm.c
> > > > > > >> @@ -651,7 +651,7 @@ static int hyperv_handle_properties(CPUState *cs)
> > > > > > >>          env->features[FEAT_HYPERV_EAX] |= HV_TIME_REF_COUNT_AVAILABLE;
> > > > > > >>          env->features[FEAT_HYPERV_EAX] |= HV_REFERENCE_TSC_AVAILABLE;
> > > > > > >>  
> > > > > > >> -        if (has_msr_hv_frequencies && tsc_is_stable_and_known(env)) {
> > > > > > >> +        if (has_msr_hv_frequencies && env->tsc_khz) {
> > > > > > >>              env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_FREQUENCY_MSRS;
> > > > > > >>              env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
> > > > > > >>          }
> > > > > > >
> > > > > > > I suggest that we add a corresponding cpu property here, too.  The guest
> > > > > > > may legitimately rely on these msrs when it sees the support in CPUID,
> > > > > > > and migrating from a kernel with the feature supported (4.14+) to an
> > > > > > > older one will make it crash.
> > > > > > >
> > > > > > 
> > > > > > This can be arranged, but what happens to people who use these features
> > > > > > today? Assuming they also passed 'invtsc' they have stable TSC page
> > > > > > clocksource already (when Hyper-V role is enabled) but when we start
> > > > > > requesting a new 'hv_frequency' cpu property they'll suddenly lose what
> > > > > > they have...
> > > > > 
> > > > > I see two cases here:
> > > > > 
> > > > > 1) people start a new VM, and discover that their old configuration is
> > > > >    not enough for this feature to work.
> > > > > 
> > > > >    They need to reconfigure and restart the VM.  This costs them some
> > > > >    time investigating and restarting, but not data.
> > > > 
> > > > If we keep machine-type compatibility, people will need to do
> > > > that only if they change the machine-type (or use the "pc" or
> > > > "q35" aliases).  If they copy the old configuration, it will keep
> > > > working.
> > > 
> > > The problem is that the feature is not fixed by the machine-type, due to
> > > the forgotten property: it only depends on the KVM version.  So, once
> > > (if) we add the property and make the feature deterministic, we'll lose
> > > compatibility one way or another.
> > > 
> > > Or are you suggesting that for pre-2.12 machine types we leave the
> > > property at "decided by your KVM" state?
> > 
> > Yes, that's what I mean.  This looks like the only way to avoid
> > losing features by just cold-rebooting an existing VM.
> > 
> > The scenario I'm thinking is this:
> > 
> > 1) pc-2.11 VM started on host running QEMU 2.11
> > 2) VM migrated to a host containing this patch
> > 3) 1 year later, the VM is shut down and booted again.
> > 4) Things stop working inside the VM because hv-frequency is
> >    unexpectedly gone.
> > 
> > Machine-type compatibility code would avoid (4).
> 
> Right, but (4) typically means that you fail to start your workload from
> a clean state, so you just go and fix it; no data is lost.
> 
> Compare this to a migration to an older KVM which results in your guest
> crashing, where you risk data loss and still have to meddle with
> configs.

True. To make it worse, we are already unable to avoid this crash
on existing VMs without a reboot.  The only case where we can fix
this is if live-migration to older KVM happens after the guest
was rebooted when running on a newer QEMU version.  :(



> 
> > > > machine-type compatibility also makes the following case a bit
> > > > safer:
> > > > 
> > > > > 
> > > > > 2) people migrate from a QEMU without ->hv_frequency, to a new one with
> > > > >    ->hv_frequency=off (assuming on both ends KVM supports the frequency
> > > > >    MSRs).
> > > > > 
> > > > >    With the current implementation in KVM, this will only result in the
> > > > >    feature bits disappearing from the respective CPUID leaf, but the
> > > > >    MSRs themselves will continue working as they used to.  So the guest
> > > > >    either won't notice or will check the CPUID and adjust.
> > > > 
> > > > If we keep machine-type compatibility, the CPUID bit won't
> > > > disappear for the guest while the MSRs keep working.
> > > > 
> > > > 
> > > > Whichever solution we choose, we can still have guests crashing
> > > > if migrating a pc-2.11 machine from a 4.14+ host kernel to a host
> > > > with an older kernel.  But I don't think there's a way out of
> > > > this, except requiring an explicit "hv-frequencies" CPU option on
> > > > newer machine-types.
> > > 
> > > What's wrong with requiring it, as we do for all other hv_* properties?
> > 
> > On new machine-types, nothing wrong.
> > 
> > On existing machine-types, see above.
> 
> I wonder if the following can cater to all relevant cases:
> 
> - hv_frequencies property is added, defaulting to "off", so that new
>   users of this feature would need to explicitly turn it on;
> 
> - on pre-2.12 machine types, it's set to the value of hv_time property
>   by the compat code, so that on VMs where this feature could
>   potentially be present it would become required; as a result, these
>   configurations will refuse to start on insufficiently capable KVM,
>   preventing the migration attempts.

This sounds like the safest option.  The cost will be the
inconvenience of being unable to run pc-2.11 on hosts with older
KVM (Linux < v4.14, without commit
72c139bacfa386145d7bbb68c47c8824716153b6), and the need to
explicitly enable hv-frequencies on pc-2.12 and newer.

> 
> Am I missing any scenarios that aren't covered?
> 

It looks like the guest can still crash if we migrate
"QEMU-2.12 -machine pc-2.11 -cpu ...,+hv-time" to a host running
QEMU 2.11 and Linux < 4.14.  I wonder if there's a way to avoid
that?  If there's a way to avoid that with extra migration
subsections, is it worth the effort/complexity?

-- 
Eduardo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure
  2018-03-22 18:38                 ` Eduardo Habkost
@ 2018-03-23  9:45                   ` Roman Kagan
  2018-03-23 19:48                     ` Eduardo Habkost
  0 siblings, 1 reply; 25+ messages in thread
From: Roman Kagan @ 2018-03-23  9:45 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Vitaly Kuznetsov, Paolo Bonzini, Richard Henderson,
	Marcelo Tosatti, qemu-devel

On Thu, Mar 22, 2018 at 03:38:13PM -0300, Eduardo Habkost wrote:
> On Thu, Mar 22, 2018 at 04:58:03PM +0300, Roman Kagan wrote:
> > On Thu, Mar 22, 2018 at 10:22:18AM -0300, Eduardo Habkost wrote:
> > > On Thu, Mar 22, 2018 at 04:00:14PM +0300, Roman Kagan wrote:
> > > > On Wed, Mar 21, 2018 at 05:19:24PM -0300, Eduardo Habkost wrote:
> > > > > On Wed, Mar 21, 2018 at 07:57:29PM +0300, Roman Kagan wrote:
> > > > > > On Wed, Mar 21, 2018 at 02:18:54PM +0100, Vitaly Kuznetsov wrote:
> > > > > > > Roman Kagan <rkagan@virtuozzo.com> writes:
> > > > > > > 
> > > > > > > > On Tue, Mar 20, 2018 at 06:35:00PM +0100, Vitaly Kuznetsov wrote:
> > > > > > > >> Requiring tsc_is_stable_and_known() is too restrictive: even without INVTCS
> > > > > > > >> nested Hyper-V-on-KVM enables TSC pages for its guests e.g. when
> > > > > > > >> Reenlightenment MSRs are present. Presence of frequency MSRs doesn't mean
> > > > > > > >> these frequencies are stable, it just means they're available for reading.
> > > > > > > >> 
> > > > > > > >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > > > > > > >> ---
> > > > > > > >>  target/i386/kvm.c | 2 +-
> > > > > > > >>  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > > >> 
> > > > > > > >> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> > > > > > > >> index 7d9f9ca0b1..74fc3d3b2c 100644
> > > > > > > >> --- a/target/i386/kvm.c
> > > > > > > >> +++ b/target/i386/kvm.c
> > > > > > > >> @@ -651,7 +651,7 @@ static int hyperv_handle_properties(CPUState *cs)
> > > > > > > >>          env->features[FEAT_HYPERV_EAX] |= HV_TIME_REF_COUNT_AVAILABLE;
> > > > > > > >>          env->features[FEAT_HYPERV_EAX] |= HV_REFERENCE_TSC_AVAILABLE;
> > > > > > > >>  
> > > > > > > >> -        if (has_msr_hv_frequencies && tsc_is_stable_and_known(env)) {
> > > > > > > >> +        if (has_msr_hv_frequencies && env->tsc_khz) {
> > > > > > > >>              env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_FREQUENCY_MSRS;
> > > > > > > >>              env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
> > > > > > > >>          }
> > > > > > > >
> > > > > > > > I suggest that we add a corresponding cpu property here, too.  The guest
> > > > > > > > may legitimately rely on these msrs when it sees the support in CPUID,
> > > > > > > > and migrating from a kernel with the feature supported (4.14+) to an
> > > > > > > > older one will make it crash.
> > > > > > > >
> > > > > > > 
> > > > > > > This can be arranged, but what happens to people who use these features
> > > > > > > today? Assuming they also passed 'invtsc' they have stable TSC page
> > > > > > > clocksource already (when Hyper-V role is enabled) but when we start
> > > > > > > requesting a new 'hv_frequency' cpu property they'll suddenly lose what
> > > > > > > they have...
> > > > > > 
> > > > > > I see two cases here:
> > > > > > 
> > > > > > 1) people start a new VM, and discover that their old configuration is
> > > > > >    not enough for this feature to work.
> > > > > > 
> > > > > >    They need to reconfigure and restart the VM.  This costs them some
> > > > > >    time investigating and restarting, but not data.
> > > > > 
> > > > > If we keep machine-type compatibility, people will need to do
> > > > > that only if they change the machine-type (or use the "pc" or
> > > > > "q35" aliases).  If they copy the old configuration, it will keep
> > > > > working.
> > > > 
> > > > The problem is that the feature is not fixed by the machine-type, due to
> > > > the forgotten property: it only depends on the KVM version.  So, once
> > > > (if) we add the property and make the feature deterministic, we'll lose
> > > > compatibility one way or another.
> > > > 
> > > > Or are you suggesting that for pre-2.12 machine types we leave the
> > > > property at "decided by your KVM" state?
> > > 
> > > Yes, that's what I mean.  This looks like the only way to avoid
> > > losing features by just cold-rebooting an existing VM.
> > > 
> > > The scenario I'm thinking is this:
> > > 
> > > 1) pc-2.11 VM started on host running QEMU 2.11
> > > 2) VM migrated to a host containing this patch
> > > 3) 1 year later, the VM is shut down and booted again.
> > > 4) Things stop working inside the VM because hv-frequency is
> > >    unexpectedly gone.
> > > 
> > > Machine-type compatibility code would avoid (4).
> > 
> > Right, but (4) typically means that you fail to start your workload from
> > a clean state, so you just go and fix it; no data is lost.
> > 
> > Compare this to a migration to an older KVM which results in your guest
> > crashing, where you risk data loss and still have to meddle with
> > configs.
> 
> True. To make it worse, we are already unable to avoid this crash
> on existing VMs without a reboot.  The only case where we can fix
> this is if live-migration to older KVM happens after the guest
> was rebooted when running on a newer QEMU version.  :(

Hmm, I thought the scheme I outlined below covered (== blocked) live
migration QEMU-2.11/KVM-4.14+ -> QEMU-2.12(machine-2.11)/KVM-4.13-,
didn't it?

> > > > > machine-type compatibility also makes the following case a bit
> > > > > safer:
> > > > > 
> > > > > > 
> > > > > > 2) people migrate from a QEMU without ->hv_frequency, to a new one with
> > > > > >    ->hv_frequency=off (assuming on both ends KVM supports the frequency
> > > > > >    MSRs).
> > > > > > 
> > > > > >    With the current implementation in KVM, this will only result in the
> > > > > >    feature bits disappearing from the respective CPUID leaf, but the
> > > > > >    MSRs themselves will continue working as they used to.  So the guest
> > > > > >    either won't notice or will check the CPUID and adjust.
> > > > > 
> > > > > If we keep machine-type compatibility, the CPUID bit won't
> > > > > disappear for the guest while the MSRs keep working.
> > > > > 
> > > > > 
> > > > > Whichever solution we choose, we can still have guests crashing
> > > > > if migrating a pc-2.11 machine from a 4.14+ host kernel to a host
> > > > > with an older kernel.  But I don't think there's a way out of
> > > > > this, except requiring an explicit "hv-frequencies" CPU option on
> > > > > newer machine-types.
> > > > 
> > > > What's wrong with requiring it, as we do for all other hv_* properties?
> > > 
> > > On new machine-types, nothing wrong.
> > > 
> > > On existing machine-types, see above.
> > 
> > I wonder if the following can cater to all relevant cases:
> > 
> > - hv_frequencies property is added, defaulting to "off", so that new
> >   users of this feature would need to explicitly turn it on;
> > 
> > - on pre-2.12 machine types, it's set to the value of hv_time property
> >   by the compat code, so that on VMs where this feature could
> >   potentially be present it would become required; as a result, these
> >   configurations will refuse to start on insufficiently capable KVM,
> >   preventing the migration attempts.
> 
> This sounds like the safest option.  The cost will be the
> inconvenience of being unable to run pc-2.11 on hosts with older
> KVM (Linux < v4.14, without commit
> 72c139bacfa386145d7bbb68c47c8824716153b6),

not completely unable: people will have to add "hv_frequencies=off" to
their cpu spec

> and the need to explicitly enable hv-frequencies on pc-2.12 and newer.

which is the standard situation for all new features.

> > Am I missing any scenarios that aren't covered?
> > 
> 
> It looks like the guest can still crash if we migrate
> "QEMU-2.12 -machine pc-2.11 -cpu ...,+hv-time" to a host running
> QEMU 2.11 and Linux < 4.14.

Indeed :(

> I wonder if there's a way to avoid that?  If there's a way to avoid
> that with extra migration subsections,

I guess this should work.

> is it worth the effort/complexity?

This is a judgement call.  For vendors this is a non-issue because most
of them haven't even started shipping 2.11, so they just don't have VMs
with this problem in the field.

So, taking the effort/complexity vs safety tradeoff into account, we can
consider an alternative approach: just add hv_frequencies (default=off)
cpu property to 2.12 and 2.11-stable, and ignore the cases where it's
run on QEMU versions without explicit control over this feature.  Would
it be too much against the current policy?

Roman.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure
  2018-03-23  9:45                   ` Roman Kagan
@ 2018-03-23 19:48                     ` Eduardo Habkost
  2018-03-26 14:20                       ` Roman Kagan
  0 siblings, 1 reply; 25+ messages in thread
From: Eduardo Habkost @ 2018-03-23 19:48 UTC (permalink / raw)
  To: Roman Kagan, Vitaly Kuznetsov, Paolo Bonzini, Richard Henderson,
	Marcelo Tosatti, qemu-devel

On Fri, Mar 23, 2018 at 12:45:30PM +0300, Roman Kagan wrote:
> On Thu, Mar 22, 2018 at 03:38:13PM -0300, Eduardo Habkost wrote:
> > On Thu, Mar 22, 2018 at 04:58:03PM +0300, Roman Kagan wrote:
> > > On Thu, Mar 22, 2018 at 10:22:18AM -0300, Eduardo Habkost wrote:
> > > > On Thu, Mar 22, 2018 at 04:00:14PM +0300, Roman Kagan wrote:
> > > > > On Wed, Mar 21, 2018 at 05:19:24PM -0300, Eduardo Habkost wrote:
> > > > > > On Wed, Mar 21, 2018 at 07:57:29PM +0300, Roman Kagan wrote:
> > > > > > > On Wed, Mar 21, 2018 at 02:18:54PM +0100, Vitaly Kuznetsov wrote:
> > > > > > > > Roman Kagan <rkagan@virtuozzo.com> writes:
> > > > > > > > 
> > > > > > > > > On Tue, Mar 20, 2018 at 06:35:00PM +0100, Vitaly Kuznetsov wrote:
> > > > > > > > >> Requiring tsc_is_stable_and_known() is too restrictive: even without INVTCS
> > > > > > > > >> nested Hyper-V-on-KVM enables TSC pages for its guests e.g. when
> > > > > > > > >> Reenlightenment MSRs are present. Presence of frequency MSRs doesn't mean
> > > > > > > > >> these frequencies are stable, it just means they're available for reading.
> > > > > > > > >> 
> > > > > > > > >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > > > > > > > >> ---
> > > > > > > > >>  target/i386/kvm.c | 2 +-
> > > > > > > > >>  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > > > >> 
> > > > > > > > >> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> > > > > > > > >> index 7d9f9ca0b1..74fc3d3b2c 100644
> > > > > > > > >> --- a/target/i386/kvm.c
> > > > > > > > >> +++ b/target/i386/kvm.c
> > > > > > > > >> @@ -651,7 +651,7 @@ static int hyperv_handle_properties(CPUState *cs)
> > > > > > > > >>          env->features[FEAT_HYPERV_EAX] |= HV_TIME_REF_COUNT_AVAILABLE;
> > > > > > > > >>          env->features[FEAT_HYPERV_EAX] |= HV_REFERENCE_TSC_AVAILABLE;
> > > > > > > > >>  
> > > > > > > > >> -        if (has_msr_hv_frequencies && tsc_is_stable_and_known(env)) {
> > > > > > > > >> +        if (has_msr_hv_frequencies && env->tsc_khz) {
> > > > > > > > >>              env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_FREQUENCY_MSRS;
> > > > > > > > >>              env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
> > > > > > > > >>          }
> > > > > > > > >
> > > > > > > > > I suggest that we add a corresponding cpu property here, too.  The guest
> > > > > > > > > may legitimately rely on these msrs when it sees the support in CPUID,
> > > > > > > > > and migrating from a kernel with the feature supported (4.14+) to an
> > > > > > > > > older one will make it crash.
> > > > > > > > >
> > > > > > > > 
> > > > > > > > This can be arranged, but what happens to people who use these features
> > > > > > > > today? Assuming they also passed 'invtsc' they have stable TSC page
> > > > > > > > clocksource already (when Hyper-V role is enabled) but when we start
> > > > > > > > requesting a new 'hv_frequency' cpu property they'll suddenly lose what
> > > > > > > > they have...
> > > > > > > 
> > > > > > > I see two cases here:
> > > > > > > 
> > > > > > > 1) people start a new VM, and discover that their old configuration is
> > > > > > >    not enough for this feature to work.
> > > > > > > 
> > > > > > >    They need to reconfigure and restart the VM.  This costs them some
> > > > > > >    time investigating and restarting, but not data.
> > > > > > 
> > > > > > If we keep machine-type compatibility, people will need to do
> > > > > > that only if they change the machine-type (or use the "pc" or
> > > > > > "q35" aliases).  If they copy the old configuration, it will keep
> > > > > > working.
> > > > > 
> > > > > The problem is that the feature is not fixed by the machine-type, due to
> > > > > the forgotten property: it only depends on the KVM version.  So, once
> > > > > (if) we add the property and make the feature deterministic, we'll lose
> > > > > compatibility one way or another.
> > > > > 
> > > > > Or are you suggesting that for pre-2.12 machine types we leave the
> > > > > property at "decided by your KVM" state?
> > > > 
> > > > Yes, that's what I mean.  This looks like the only way to avoid
> > > > losing features by just cold-rebooting an existing VM.
> > > > 
> > > > The scenario I'm thinking is this:
> > > > 
> > > > 1) pc-2.11 VM started on host running QEMU 2.11
> > > > 2) VM migrated to a host containing this patch
> > > > 3) 1 year later, the VM is shut down and booted again.
> > > > 4) Things stop working inside the VM because hv-frequency is
> > > >    unexpectedly gone.
> > > > 
> > > > Machine-type compatibility code would avoid (4).
> > > 
> > > Right, but (4) typically means that you fail to start your workload from
> > > a clean state, so you just go and fix it; no data is lost.
> > > 
> > > Compare this to a migration to an older KVM which results in your guest
> > > crashing, where you risk data loss and still have to meddle with
> > > configs.
> > 
> > True. To make it worse, we are already unable to avoid this crash
> > on existing VMs without a reboot.  The only case where we can fix
> > this is if live-migration to older KVM happens after the guest
> > was rebooted when running on a newer QEMU version.  :(
> 
> Hmm, I thought the scheme I outlined below covered (== blocked) live
> migration QEMU-2.11/KVM-4.14+ -> QEMU-2.12(machine-2.11)/KVM-4.13-,
> didn't it?

It should, but what about migration
QEMU-2.12(pc-2.11)/KVM-4.14 -> QEMU-2.11(pc-2.11)/KVM-4.13?

Or, more specifically:
QEMU-2.11(pc-2.11)/KVM-4.14 ->
QEMU-2.12(pc-2.11)/KVM-4.14 -> QEMU-2.11(machine-2.11)/KVM-4.13?

> 
> > > > > > machine-type compatibility also makes the following case a bit
> > > > > > safer:
> > > > > > 
> > > > > > > 
> > > > > > > 2) people migrate from a QEMU without ->hv_frequency, to a new one with
> > > > > > >    ->hv_frequency=off (assuming on both ends KVM supports the frequency
> > > > > > >    MSRs).
> > > > > > > 
> > > > > > >    With the current implementation in KVM, this will only result in the
> > > > > > >    feature bits disappearing from the respective CPUID leaf, but the
> > > > > > >    MSRs themselves will continue working as they used to.  So the guest
> > > > > > >    either won't notice or will check the CPUID and adjust.
> > > > > > 
> > > > > > If we keep machine-type compatibility, the CPUID bit won't
> > > > > > disappear for the guest while the MSRs keep working.
> > > > > > 
> > > > > > 
> > > > > > Whichever solution we choose, we can still have guests crashing
> > > > > > if migrating a pc-2.11 machine from a 4.14+ host kernel to a host
> > > > > > with an older kernel.  But I don't think there's a way out of
> > > > > > this, except requiring an explicit "hv-frequencies" CPU option on
> > > > > > newer machine-types.
> > > > > 
> > > > > What's wrong with requiring it, as we do for all other hv_* properties?
> > > > 
> > > > On new machine-types, nothing wrong.
> > > > 
> > > > On existing machine-types, see above.
> > > 
> > > I wonder if the following can cater to all relevant cases:
> > > 
> > > - hv_frequencies property is added, defaulting to "off", so that new
> > >   users of this feature would need to explicitly turn it on;
> > > 
> > > - on pre-2.12 machine types, it's set to the value of hv_time property
> > >   by the compat code, so that on VMs where this feature could
> > >   potentially be present it would become required; as a result, these
> > >   configurations will refuse to start on insufficiently capable KVM,
> > >   preventing the migration attempts.
> > 
> > This sounds like the safest option.  The cost will be the
> > inconvenience of being unable to run pc-2.11 on hosts with older
> > KVM (Linux < v4.14, without commit
> > 72c139bacfa386145d7bbb68c47c8824716153b6),
> 
> not completely unable: people will have to add "hv_frequencies=off" to
> their cpu spec

This is different from the patch you sent, which sets
hv-frequencies=off by default on pc-2.11 too.

(And now I see you described this approach in the last paragraph below. :)

> 
> > and the need to explicitly enable hv-frequencies on pc-2.12 and newer.
> 
> which is the standard situation for all new features.

Yes, no question on what we want to do on pc-2.12.

> 
> > > Am I missing any scenarios that aren't covered?
> > > 
> > 
> > It looks like the guest can still crash if we migrate
> > "QEMU-2.12 -machine pc-2.11 -cpu ...,+hv-time" to a host running
> > QEMU 2.11 and Linux < 4.14.
> 
> Indeed :(

Well, your patch fixes it by not enabling hv-frequencies by
default on any machine-type.  Do you see any gotchas?


> 
> > I wonder if there's a way to avoid that?  If there's a way to avoid
> > that with extra migration subsections,
> 
> I guess this should work.
> 
> > is it worth the effort/complexity?
> 
> This is a judgement call.  For vendors this is a non-issue because most
> of them haven't even started shipping 2.11, so they just don't have VMs
> with this problem in the field.
> 
> So, taking the effort/complexity vs safety tradeoff into account, we can
> consider an alternative approach: just add hv_frequencies (default=off)
> cpu property to 2.12 and 2.11-stable, and ignore the cases where it's
> run on QEMU versions without explicit control over this feature.  Would
> it be too much against the current policy?
> 

With this, we can just declare that QEMU v2.11.0 + Linux 4.14+
was broken, and advise people to upgrade QEMU.

I think is the most reasonable option we have.  See my reply to
the patch you sent.

-- 
Eduardo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure
  2018-03-23 19:48                     ` Eduardo Habkost
@ 2018-03-26 14:20                       ` Roman Kagan
  0 siblings, 0 replies; 25+ messages in thread
From: Roman Kagan @ 2018-03-26 14:20 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Vitaly Kuznetsov, Paolo Bonzini, Richard Henderson,
	Marcelo Tosatti, qemu-devel

On Fri, Mar 23, 2018 at 04:48:27PM -0300, Eduardo Habkost wrote:
> On Fri, Mar 23, 2018 at 12:45:30PM +0300, Roman Kagan wrote:
> > On Thu, Mar 22, 2018 at 03:38:13PM -0300, Eduardo Habkost wrote:
> > > It looks like the guest can still crash if we migrate
> > > "QEMU-2.12 -machine pc-2.11 -cpu ...,+hv-time" to a host running
> > > QEMU 2.11 and Linux < 4.14.
> > 
> > Indeed :(
> 
> Well, your patch fixes it by not enabling hv-frequencies by
> default on any machine-type.  Do you see any gotchas?

Only the need to add a new property in -stable.  I used to think that
was frowned upon.

> > So, taking the effort/complexity vs safety tradeoff into account, we can
> > consider an alternative approach: just add hv_frequencies (default=off)
> > cpu property to 2.12 and 2.11-stable, and ignore the cases where it's
> > run on QEMU versions without explicit control over this feature.  Would
> > it be too much against the current policy?
> > 
> 
> With this, we can just declare that QEMU v2.11.0 + Linux 4.14+
> was broken, and advise people to upgrade QEMU.
> 
> I think is the most reasonable option we have.  See my reply to
> the patch you sent.

Great, let's try going down this route.

Thanks,
Roman.

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2018-03-26 14:21 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-20 17:34 [Qemu-devel] [PATCH v3 0/2] i386/kvm: TSC page clocksource for Hyper-V-on-KVM fixes Vitaly Kuznetsov
2018-03-20 17:34 ` [Qemu-devel] [PATCH v3 1/2] i386/kvm: add support for Hyper-V reenlightenment MSRs Vitaly Kuznetsov
2018-03-20 18:32   ` Eduardo Habkost
2018-03-21 11:09     ` Roman Kagan
2018-03-21 13:09     ` Roman Kagan
2018-03-21 11:24   ` Roman Kagan
2018-03-22 17:09   ` Marcelo Tosatti
2018-03-22 17:39     ` Vitaly Kuznetsov
2018-03-20 17:35 ` [Qemu-devel] [PATCH v3 2/2] i386/kvm: lower requirements for Hyper-V frequency MSRs exposure Vitaly Kuznetsov
2018-03-21 12:10   ` Roman Kagan
2018-03-21 13:18     ` Vitaly Kuznetsov
2018-03-21 16:57       ` Roman Kagan
2018-03-21 20:19         ` Eduardo Habkost
2018-03-22 13:00           ` Roman Kagan
2018-03-22 13:22             ` Eduardo Habkost
2018-03-22 13:58               ` Roman Kagan
2018-03-22 18:38                 ` Eduardo Habkost
2018-03-23  9:45                   ` Roman Kagan
2018-03-23 19:48                     ` Eduardo Habkost
2018-03-26 14:20                       ` Roman Kagan
2018-03-21 15:33   ` Paolo Bonzini
2018-03-21 16:17     ` Vitaly Kuznetsov
2018-03-21 17:17       ` Roman Kagan
2018-03-21 20:06         ` Eduardo Habkost
2018-03-21 16:47     ` Roman Kagan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.