All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [Qemu-devel] [PATCH 1/8] i386/kvm: add support for KVM_GET_SUPPORTED_HV_CPUID
@ 2019-04-05 13:05     ` Roman Kagan
  0 siblings, 0 replies; 10+ messages in thread
From: Roman Kagan @ 2019-04-05 13:05 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: qemu-devel, Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	Marcelo Tosatti, Dr . David Alan Gilbert,
	Daniel P . Berrangé

On Fri, Mar 29, 2019 at 03:18:25PM +0100, Vitaly Kuznetsov wrote:
> KVM now supports reporting supported Hyper-V features through CPUID
> (KVM_GET_SUPPORTED_HV_CPUID ioctl). Going forward, this is going to be
> the only way to announce new functionality and this has already happened
> with Direct Mode stimers.
> 
> While we could just support KVM_GET_SUPPORTED_HV_CPUID for new features,
> it seems to be beneficial to use it for all Hyper-V enlightenments when
> possible. This way we can implement 'hv-all' pass-through mode giving the
> guest all supported Hyper-V features even when QEMU knows nothing about
> them.
> 
> Implementation-wise we create a new kvm_hyperv_properties structure
> defining Hyper-V features, get_supported_hv_cpuid()/
> get_supported_hv_cpuid_legacy() returning the supported CPUID set and
> a bit over-engineered hv_cpuid_check_and_set() which we will also be
> used to set cpu->hyperv_* properties for 'hv-all' mode.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  target/i386/kvm.c | 487 +++++++++++++++++++++++++++++++++++-----------
>  1 file changed, 372 insertions(+), 115 deletions(-)
> 
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index 3b29ce5c0d..9abee81998 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -700,141 +700,360 @@ static bool tsc_is_stable_and_known(CPUX86State *env)
>          || env->user_tsc_khz;
>  }
>  
> -static int hyperv_handle_properties(CPUState *cs)
> +static struct {
> +    const char *name;
> +    const char *desc;
> +    struct {
> +        uint32_t fw;
> +        uint32_t bits;
> +    } flags[2];
> +} kvm_hyperv_properties[] = {
> +    {
> +        .name = "hv-relaxed",
> +        .desc = "relaxed timing",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EAX,
> +             .bits = HV_HYPERCALL_AVAILABLE},
> +            {.fw = FEAT_HV_RECOMM_EAX,
> +             .bits = HV_RELAXED_TIMING_RECOMMENDED}
> +        }
> +    },
> +    {
> +        .name = "hv-vapic",
> +        .desc = "virtual APIC",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EAX,
> +             .bits = HV_HYPERCALL_AVAILABLE | HV_APIC_ACCESS_AVAILABLE},
> +            {.fw = FEAT_HV_RECOMM_EAX,
> +             .bits = HV_APIC_ACCESS_RECOMMENDED}
> +        }
> +    },
> +    {
> +        .name = "hv-time",
> +        .desc = "clocksources",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EAX,
> +             .bits = HV_HYPERCALL_AVAILABLE | HV_TIME_REF_COUNT_AVAILABLE |
> +             HV_REFERENCE_TSC_AVAILABLE},
> +            {0}

IIRC explicit zero initializer can be omitted here: all fields that have
no explicit initializers are zeroed.

> +        }
> +    },
> +    {
> +        .name = "hv-frequencies",
> +        .desc = "frequency MSRs",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EAX,
> +             .bits = HV_ACCESS_FREQUENCY_MSRS},
> +            {.fw = FEAT_HYPERV_EDX,
> +             .bits = HV_FREQUENCY_MSRS_AVAILABLE}
> +        }
> +    },
> +    {
> +        .name = "hv-crash",
> +        .desc = "crash MSRs",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EDX,
> +             .bits = HV_GUEST_CRASH_MSR_AVAILABLE},
> +            {0}
> +        }
> +    },
> +    {
> +        .name = "hv-reenlightenment",
> +        .desc = "Reenlightenment MSRs",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EAX,
> +             .bits = HV_ACCESS_REENLIGHTENMENTS_CONTROL},
> +            {0}
> +        }
> +    },
> +    {
> +        .name = "hv-reset",
> +        .desc = "reset MSR",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EAX,
> +             .bits = HV_RESET_AVAILABLE},
> +            {0}
> +        }
> +    },
> +    {
> +        .name = "hv-vpindex",
> +        .desc = "VP_INDEX MSR",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EAX,
> +             .bits = HV_VP_INDEX_AVAILABLE},
> +            {0}
> +        }
> +    },
> +    {
> +        .name = "hv-runtime",
> +        .desc = "VP_RUNTIME MSR",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EAX,
> +             .bits = HV_VP_RUNTIME_AVAILABLE},
> +            {0}
> +        }
> +    },
> +    {
> +        .name = "hv-synic",
> +        .desc = "SynIC",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EAX,
> +             .bits = HV_SYNIC_AVAILABLE},
> +            {0}
> +        }
> +    },
> +    {
> +        .name = "hv-stimer",
> +        .desc = "timers",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EAX,
> +             .bits = HV_SYNTIMERS_AVAILABLE},
> +            {0}
> +        }
> +    },
> +    {
> +        .name = "hv-tlbflush",
> +        .desc = "TLB flush support",
> +        .flags = {
> +            {.fw = FEAT_HV_RECOMM_EAX,
> +             .bits = HV_REMOTE_TLB_FLUSH_RECOMMENDED |
> +             HV_EX_PROCESSOR_MASKS_RECOMMENDED},
> +            {0}
> +        }
> +    },
> +    {
> +        .name = "hv-ipi",
> +        .desc = "IPI send support",
> +        .flags = {
> +            {.fw = FEAT_HV_RECOMM_EAX,
> +             .bits = HV_CLUSTER_IPI_RECOMMENDED |
> +             HV_EX_PROCESSOR_MASKS_RECOMMENDED},
> +            {0}
> +        }
> +    },
> +};
> +
> +static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max)
> +{
> +    struct kvm_cpuid2 *cpuid;
> +    int r, size;
> +
> +    size = sizeof(*cpuid) + max * sizeof(*cpuid->entries);
> +    cpuid = g_malloc0(size);
> +    cpuid->nent = max;
> +
> +    r = kvm_vcpu_ioctl(cs, KVM_GET_SUPPORTED_HV_CPUID, cpuid);
> +    if (r == 0 && cpuid->nent >= max) {
> +        r = -E2BIG;
> +    }
> +    if (r < 0) {
> +        if (r == -E2BIG) {
> +            g_free(cpuid);
> +            return NULL;
> +        } else {
> +            fprintf(stderr, "KVM_GET_SUPPORTED_HV_CPUID failed: %s\n",
> +                    strerror(-r));
> +            exit(1);
> +        }
> +    }
> +    return cpuid;
> +}
> +
> +/*
> + * Run KVM_GET_SUPPORTED_HV_CPUID ioctl(), allocating a buffer large enough
> + * for all entries.
> + */
> +static struct kvm_cpuid2 *get_supported_hv_cpuid(CPUState *cs)
> +{
> +    struct kvm_cpuid2 *cpuid;
> +    int max = 7; /* 0x40000000..0x40000005, 0x4000000A */
> +
> +    while ((cpuid = try_get_hv_cpuid(cs, max)) == NULL) {
> +        max++;
> +    }

If you didn't drop kernel-provided cpuid->nent on the floor in the
previous function you wouldn't need to iterate more than once.

> +    return cpuid;
> +}
> +
> +/*
> + * When KVM_GET_SUPPORTED_HV_CPUID is not supported we fill CPUID feature
> + * leaves from KVM_CAP_HYPERV* and present MSRs data.
> + */
> +static struct kvm_cpuid2 *get_supported_hv_cpuid_legacy(CPUState *cs)
>  {
>      X86CPU *cpu = X86_CPU(cs);
> -    CPUX86State *env = &cpu->env;
> +    struct kvm_cpuid2 *cpuid;
> +    struct kvm_cpuid_entry2 *entry_feat, *entry_recomm;
> +
> +    /* HV_CPUID_FEATURES, HV_CPUID_ENLIGHTMENT_INFO */
> +    cpuid = g_malloc0(sizeof(*cpuid) + 2 * sizeof(*cpuid->entries));
> +    cpuid->nent = 2;
> +
> +    /* HV_CPUID_VENDOR_AND_MAX_FUNCTIONS */
> +    entry_feat = &cpuid->entries[0];
> +    entry_feat->function = HV_CPUID_FEATURES;
> +
> +    entry_recomm = &cpuid->entries[1];
> +    entry_recomm->function = HV_CPUID_ENLIGHTMENT_INFO;
>  
> -    if (cpu->hyperv_relaxed_timing) {
> -        env->features[FEAT_HYPERV_EAX] |= HV_HYPERCALL_AVAILABLE;
> +    if (kvm_check_extension(cs->kvm_state, KVM_CAP_HYPERV) > 0) {
> +        entry_feat->eax |= HV_HYPERCALL_AVAILABLE;
> +        entry_feat->eax |= HV_APIC_ACCESS_AVAILABLE;
> +        entry_feat->edx |= HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE;
> +        entry_recomm->eax |= HV_RELAXED_TIMING_RECOMMENDED;
> +        entry_recomm->eax |= HV_APIC_ACCESS_RECOMMENDED;
>      }
> -    if (cpu->hyperv_vapic) {
> -        env->features[FEAT_HYPERV_EAX] |= HV_HYPERCALL_AVAILABLE;
> -        env->features[FEAT_HYPERV_EAX] |= HV_APIC_ACCESS_AVAILABLE;
> +
> +    if (kvm_check_extension(cs->kvm_state, KVM_CAP_HYPERV_TIME) > 0) {
> +        entry_feat->eax |= HV_TIME_REF_COUNT_AVAILABLE;
> +        entry_feat->eax |= HV_REFERENCE_TSC_AVAILABLE;
>      }
> -    if (cpu->hyperv_time) {
> -        if (kvm_check_extension(cs->kvm_state, KVM_CAP_HYPERV_TIME) <= 0) {
> -            fprintf(stderr, "Hyper-V clocksources "
> -                    "(requested by 'hv-time' cpu flag) "
> -                    "are not supported by kernel\n");
> -            return -ENOSYS;
> -        }
> -        env->features[FEAT_HYPERV_EAX] |= HV_HYPERCALL_AVAILABLE;
> -        env->features[FEAT_HYPERV_EAX] |= HV_TIME_REF_COUNT_AVAILABLE;
> -        env->features[FEAT_HYPERV_EAX] |= HV_REFERENCE_TSC_AVAILABLE;
> +
> +    if (has_msr_hv_frequencies) {
> +        entry_feat->eax |= HV_ACCESS_FREQUENCY_MSRS;
> +        entry_feat->edx |= HV_FREQUENCY_MSRS_AVAILABLE;
>      }
> -    if (cpu->hyperv_frequencies) {
> -        if (!has_msr_hv_frequencies) {
> -            fprintf(stderr, "Hyper-V frequency MSRs "
> -                    "(requested by 'hv-frequencies' cpu flag) "
> -                    "are not supported by kernel\n");
> -            return -ENOSYS;
> -        }
> -        env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_FREQUENCY_MSRS;
> -        env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
> +
> +    if (has_msr_hv_crash) {
> +        entry_feat->edx |= HV_GUEST_CRASH_MSR_AVAILABLE;
>      }
> -    if (cpu->hyperv_crash) {
> -        if (!has_msr_hv_crash) {
> -            fprintf(stderr, "Hyper-V crash MSRs "
> -                    "(requested by 'hv-crash' cpu flag) "
> -                    "are not supported by kernel\n");
> -            return -ENOSYS;
> -        }
> -        env->features[FEAT_HYPERV_EDX] |= HV_GUEST_CRASH_MSR_AVAILABLE;
> +
> +    if (has_msr_hv_reenlightenment) {
> +        entry_feat->eax |= HV_ACCESS_REENLIGHTENMENTS_CONTROL;
>      }
> -    if (cpu->hyperv_reenlightenment) {
> -        if (!has_msr_hv_reenlightenment) {
> -            fprintf(stderr,
> -                    "Hyper-V Reenlightenment MSRs "
> -                    "(requested by 'hv-reenlightenment' cpu flag) "
> -                    "are not supported by kernel\n");
> -            return -ENOSYS;
> -        }
> -        env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_REENLIGHTENMENTS_CONTROL;
> +
> +    if (has_msr_hv_reset) {
> +        entry_feat->eax |= HV_RESET_AVAILABLE;
>      }
> -    env->features[FEAT_HYPERV_EDX] |= HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE;
> -    if (cpu->hyperv_reset) {
> -        if (!has_msr_hv_reset) {
> -            fprintf(stderr, "Hyper-V reset MSR "
> -                    "(requested by 'hv-reset' cpu flag) "
> -                    "is not supported by kernel\n");
> -            return -ENOSYS;
> -        }
> -        env->features[FEAT_HYPERV_EAX] |= HV_RESET_AVAILABLE;
> +
> +    if (has_msr_hv_vpindex) {
> +        entry_feat->eax |= HV_VP_INDEX_AVAILABLE;
>      }
> -    if (cpu->hyperv_vpindex) {
> -        if (!has_msr_hv_vpindex) {
> -            fprintf(stderr, "Hyper-V VP_INDEX MSR "
> -                    "(requested by 'hv-vpindex' cpu flag) "
> -                    "is not supported by kernel\n");
> -            return -ENOSYS;
> -        }
> -        env->features[FEAT_HYPERV_EAX] |= HV_VP_INDEX_AVAILABLE;
> +
> +    if (has_msr_hv_runtime) {
> +        entry_feat->eax |= HV_VP_RUNTIME_AVAILABLE;
>      }
> -    if (cpu->hyperv_runtime) {
> -        if (!has_msr_hv_runtime) {
> -            fprintf(stderr, "Hyper-V VP_RUNTIME MSR "
> -                    "(requested by 'hv-runtime' cpu flag) "
> -                    "is not supported by kernel\n");
> -            return -ENOSYS;
> +
> +    if (has_msr_hv_synic) {
> +        unsigned int cap = cpu->hyperv_synic_kvm_only ?
> +            KVM_CAP_HYPERV_SYNIC : KVM_CAP_HYPERV_SYNIC2;
> +
> +        if (kvm_check_extension(cs->kvm_state, cap) > 0) {
> +            entry_feat->eax |= HV_SYNIC_AVAILABLE;
>          }
> -        env->features[FEAT_HYPERV_EAX] |= HV_VP_RUNTIME_AVAILABLE;
>      }
> -    if (cpu->hyperv_synic) {
> -        unsigned int cap = KVM_CAP_HYPERV_SYNIC;
> -        if (!cpu->hyperv_synic_kvm_only) {
> -            if (!cpu->hyperv_vpindex) {
> -                fprintf(stderr, "Hyper-V SynIC "
> -                        "(requested by 'hv-synic' cpu flag) "
> -                        "requires Hyper-V VP_INDEX ('hv-vpindex')\n");
> -            return -ENOSYS;
> -            }
> -            cap = KVM_CAP_HYPERV_SYNIC2;
> -        }
>  
> -        if (!has_msr_hv_synic || !kvm_check_extension(cs->kvm_state, cap)) {
> -            fprintf(stderr, "Hyper-V SynIC (requested by 'hv-synic' cpu flag) "
> -                    "is not supported by kernel\n");
> -            return -ENOSYS;
> -        }
> +    if (has_msr_hv_stimer) {
> +        entry_feat->eax |= HV_SYNTIMERS_AVAILABLE;
> +    }
>  
> -        env->features[FEAT_HYPERV_EAX] |= HV_SYNIC_AVAILABLE;
> +    if (kvm_check_extension(cs->kvm_state,
> +                            KVM_CAP_HYPERV_TLBFLUSH) > 0) {
> +        entry_recomm->eax |= HV_REMOTE_TLB_FLUSH_RECOMMENDED;
> +        entry_recomm->eax |= HV_EX_PROCESSOR_MASKS_RECOMMENDED;
>      }
> -    if (cpu->hyperv_stimer) {
> -        if (!has_msr_hv_stimer) {
> -            fprintf(stderr, "Hyper-V timers aren't supported by kernel\n");
> -            return -ENOSYS;
> -        }
> -        env->features[FEAT_HYPERV_EAX] |= HV_SYNTIMERS_AVAILABLE;
> +
> +    if (kvm_check_extension(cs->kvm_state,
> +                            KVM_CAP_HYPERV_SEND_IPI) > 0) {
> +        entry_recomm->eax |= HV_CLUSTER_IPI_RECOMMENDED;
> +        entry_recomm->eax |= HV_EX_PROCESSOR_MASKS_RECOMMENDED;
>      }
> -    if (cpu->hyperv_relaxed_timing) {
> -        env->features[FEAT_HV_RECOMM_EAX] |= HV_RELAXED_TIMING_RECOMMENDED;
> +
> +    return cpuid;
> +}
> +
> +static int hv_cpuid_get_fw(struct kvm_cpuid2 *cpuid, int fw, uint32_t *r)
> +{
> +    struct kvm_cpuid_entry2 *entry;
> +    uint32_t func;
> +    int reg;
> +
> +    switch (fw) {
> +    case FEAT_HYPERV_EAX:
> +        reg = R_EAX;
> +        func = HV_CPUID_FEATURES;
> +        break;
> +    case FEAT_HYPERV_EDX:
> +        reg = R_EDX;
> +        func = HV_CPUID_FEATURES;
> +        break;
> +    case FEAT_HV_RECOMM_EAX:
> +        reg = R_EAX;
> +        func = HV_CPUID_ENLIGHTMENT_INFO;
> +        break;
> +    default:
> +        return -EINVAL;
>      }
> -    if (cpu->hyperv_vapic) {
> -        env->features[FEAT_HV_RECOMM_EAX] |= HV_APIC_ACCESS_RECOMMENDED;
> -    }
> -    if (cpu->hyperv_tlbflush) {
> -        if (kvm_check_extension(cs->kvm_state,
> -                                KVM_CAP_HYPERV_TLBFLUSH) <= 0) {
> -            fprintf(stderr, "Hyper-V TLB flush support "
> -                    "(requested by 'hv-tlbflush' cpu flag) "
> -                    " is not supported by kernel\n");
> -            return -ENOSYS;
> -        }
> -        env->features[FEAT_HV_RECOMM_EAX] |= HV_REMOTE_TLB_FLUSH_RECOMMENDED;
> -        env->features[FEAT_HV_RECOMM_EAX] |= HV_EX_PROCESSOR_MASKS_RECOMMENDED;
> +
> +    entry = cpuid_find_entry(cpuid, func, 0);
> +    if (!entry) {
> +        return -ENOENT;
>      }
> -    if (cpu->hyperv_ipi) {
> -        if (kvm_check_extension(cs->kvm_state,
> -                                KVM_CAP_HYPERV_SEND_IPI) <= 0) {
> -            fprintf(stderr, "Hyper-V IPI send support "
> -                    "(requested by 'hv-ipi' cpu flag) "
> -                    " is not supported by kernel\n");
> -            return -ENOSYS;
> +
> +    switch (reg) {
> +    case R_EAX:
> +        *r = entry->eax;
> +        break;
> +    case R_EDX:
> +        *r = entry->edx;
> +        break;
> +    default:
> +        return -EINVAL;
> +    }
> +
> +    return 0;
> +}
> +
> +static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
> +                                  const char *name, bool flag)
> +{
> +    X86CPU *cpu = X86_CPU(cs);
> +    CPUX86State *env = &cpu->env;
> +    uint32_t r, fw, bits;;
> +    int i, j;
> +
> +    if (!flag) {
> +        return 0;
> +    }
> +
> +    for (i = 0; i < ARRAY_SIZE(kvm_hyperv_properties); i++) {
> +        if (strcmp(kvm_hyperv_properties[i].name, name)) {
> +            continue;
>          }
> -        env->features[FEAT_HV_RECOMM_EAX] |= HV_CLUSTER_IPI_RECOMMENDED;
> -        env->features[FEAT_HV_RECOMM_EAX] |= HV_EX_PROCESSOR_MASKS_RECOMMENDED;
> +
> +        for (j = 0; j < ARRAY_SIZE(kvm_hyperv_properties[i].flags); j++) {
> +            fw = kvm_hyperv_properties[i].flags[j].fw;
> +            bits = kvm_hyperv_properties[i].flags[j].bits;
> +
> +            if (!fw) {
> +                continue;
> +            }
> +
> +            if (hv_cpuid_get_fw(cpuid, fw, &r) || (r & bits) != bits) {
> +                fprintf(stderr,
> +                        "Hyper-V %s (requested by '%s' cpu flag) "
> +                        "is not supported by kernel\n",
> +                        kvm_hyperv_properties[i].desc,
> +                        kvm_hyperv_properties[i].name);
> +                return 1;
> +            }
> +
> +            env->features[fw] |= bits;
> +        }
> +
> +        return 0;
>      }
> +
> +    /* the requested feature is undefined in kvm_hyperv_properties */
> +    return 1;
> +}
> +
> +static int hyperv_handle_properties(CPUState *cs)
> +{
> +    X86CPU *cpu = X86_CPU(cs);
> +    CPUX86State *env = &cpu->env;
> +    struct kvm_cpuid2 *cpuid;
> +    int r = 0;
> +
>      if (cpu->hyperv_evmcs) {
>          uint16_t evmcs_version;
>  
> @@ -849,7 +1068,45 @@ static int hyperv_handle_properties(CPUState *cs)
>          env->features[FEAT_HV_NESTED_EAX] = evmcs_version;
>      }
>  
> -    return 0;
> +    if (kvm_check_extension(cs->kvm_state, KVM_CAP_HYPERV_CPUID) > 0) {
> +        cpuid = get_supported_hv_cpuid(cs);
> +    } else {
> +        cpuid = get_supported_hv_cpuid_legacy(cs);
> +    }
> +
> +    /* Features */
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-relaxed",
> +                                cpu->hyperv_relaxed_timing);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-vapic", cpu->hyperv_vapic);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-time", cpu->hyperv_time);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-frequencies",
> +                                cpu->hyperv_frequencies);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-crash", cpu->hyperv_crash);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-reenlightenment",
> +                                cpu->hyperv_reenlightenment);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-reset", cpu->hyperv_reset);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-vpindex", cpu->hyperv_vpindex);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-runtime", cpu->hyperv_runtime);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-synic", cpu->hyperv_synic);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-stimer", cpu->hyperv_stimer);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-tlbflush", cpu->hyperv_tlbflush);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-ipi", cpu->hyperv_ipi);

So this duplicates the link between the properties and their names,
originally established by DEFINE_PROP_BOOL(...).  Not good.

The property names are also duplicated in your kvm_hyperv_properties
array.  Not good either.

I'm wondering if a better solution could be to replace the boolean
properties with bit properties within a single 64bit word, and to have
the bit position of property be also the index into the property
descriptor array.  This should allow to DRY when defining the cpu
properties and then validating for their presence.  Besides this will
allow to easily express their interdependencies: every property will
just get an extra field with the mask of prerequisite properties.

I'll see if I can prototype something along these lines early next week
unless you beat me to it.

Anyway this mess that has accumulated around hyperv-related properties
is in need for a spring cleanup since long; thanks a lot for looking
into it!

Roman.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8] i386/kvm: add support for KVM_GET_SUPPORTED_HV_CPUID
@ 2019-04-05 13:05     ` Roman Kagan
  0 siblings, 0 replies; 10+ messages in thread
From: Roman Kagan @ 2019-04-05 13:05 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Eduardo Habkost, Marcelo Tosatti, Dr . David Alan Gilbert,
	qemu-devel, Paolo Bonzini, Richard Henderson

On Fri, Mar 29, 2019 at 03:18:25PM +0100, Vitaly Kuznetsov wrote:
> KVM now supports reporting supported Hyper-V features through CPUID
> (KVM_GET_SUPPORTED_HV_CPUID ioctl). Going forward, this is going to be
> the only way to announce new functionality and this has already happened
> with Direct Mode stimers.
> 
> While we could just support KVM_GET_SUPPORTED_HV_CPUID for new features,
> it seems to be beneficial to use it for all Hyper-V enlightenments when
> possible. This way we can implement 'hv-all' pass-through mode giving the
> guest all supported Hyper-V features even when QEMU knows nothing about
> them.
> 
> Implementation-wise we create a new kvm_hyperv_properties structure
> defining Hyper-V features, get_supported_hv_cpuid()/
> get_supported_hv_cpuid_legacy() returning the supported CPUID set and
> a bit over-engineered hv_cpuid_check_and_set() which we will also be
> used to set cpu->hyperv_* properties for 'hv-all' mode.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  target/i386/kvm.c | 487 +++++++++++++++++++++++++++++++++++-----------
>  1 file changed, 372 insertions(+), 115 deletions(-)
> 
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index 3b29ce5c0d..9abee81998 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -700,141 +700,360 @@ static bool tsc_is_stable_and_known(CPUX86State *env)
>          || env->user_tsc_khz;
>  }
>  
> -static int hyperv_handle_properties(CPUState *cs)
> +static struct {
> +    const char *name;
> +    const char *desc;
> +    struct {
> +        uint32_t fw;
> +        uint32_t bits;
> +    } flags[2];
> +} kvm_hyperv_properties[] = {
> +    {
> +        .name = "hv-relaxed",
> +        .desc = "relaxed timing",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EAX,
> +             .bits = HV_HYPERCALL_AVAILABLE},
> +            {.fw = FEAT_HV_RECOMM_EAX,
> +             .bits = HV_RELAXED_TIMING_RECOMMENDED}
> +        }
> +    },
> +    {
> +        .name = "hv-vapic",
> +        .desc = "virtual APIC",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EAX,
> +             .bits = HV_HYPERCALL_AVAILABLE | HV_APIC_ACCESS_AVAILABLE},
> +            {.fw = FEAT_HV_RECOMM_EAX,
> +             .bits = HV_APIC_ACCESS_RECOMMENDED}
> +        }
> +    },
> +    {
> +        .name = "hv-time",
> +        .desc = "clocksources",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EAX,
> +             .bits = HV_HYPERCALL_AVAILABLE | HV_TIME_REF_COUNT_AVAILABLE |
> +             HV_REFERENCE_TSC_AVAILABLE},
> +            {0}

IIRC explicit zero initializer can be omitted here: all fields that have
no explicit initializers are zeroed.

> +        }
> +    },
> +    {
> +        .name = "hv-frequencies",
> +        .desc = "frequency MSRs",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EAX,
> +             .bits = HV_ACCESS_FREQUENCY_MSRS},
> +            {.fw = FEAT_HYPERV_EDX,
> +             .bits = HV_FREQUENCY_MSRS_AVAILABLE}
> +        }
> +    },
> +    {
> +        .name = "hv-crash",
> +        .desc = "crash MSRs",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EDX,
> +             .bits = HV_GUEST_CRASH_MSR_AVAILABLE},
> +            {0}
> +        }
> +    },
> +    {
> +        .name = "hv-reenlightenment",
> +        .desc = "Reenlightenment MSRs",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EAX,
> +             .bits = HV_ACCESS_REENLIGHTENMENTS_CONTROL},
> +            {0}
> +        }
> +    },
> +    {
> +        .name = "hv-reset",
> +        .desc = "reset MSR",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EAX,
> +             .bits = HV_RESET_AVAILABLE},
> +            {0}
> +        }
> +    },
> +    {
> +        .name = "hv-vpindex",
> +        .desc = "VP_INDEX MSR",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EAX,
> +             .bits = HV_VP_INDEX_AVAILABLE},
> +            {0}
> +        }
> +    },
> +    {
> +        .name = "hv-runtime",
> +        .desc = "VP_RUNTIME MSR",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EAX,
> +             .bits = HV_VP_RUNTIME_AVAILABLE},
> +            {0}
> +        }
> +    },
> +    {
> +        .name = "hv-synic",
> +        .desc = "SynIC",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EAX,
> +             .bits = HV_SYNIC_AVAILABLE},
> +            {0}
> +        }
> +    },
> +    {
> +        .name = "hv-stimer",
> +        .desc = "timers",
> +        .flags = {
> +            {.fw = FEAT_HYPERV_EAX,
> +             .bits = HV_SYNTIMERS_AVAILABLE},
> +            {0}
> +        }
> +    },
> +    {
> +        .name = "hv-tlbflush",
> +        .desc = "TLB flush support",
> +        .flags = {
> +            {.fw = FEAT_HV_RECOMM_EAX,
> +             .bits = HV_REMOTE_TLB_FLUSH_RECOMMENDED |
> +             HV_EX_PROCESSOR_MASKS_RECOMMENDED},
> +            {0}
> +        }
> +    },
> +    {
> +        .name = "hv-ipi",
> +        .desc = "IPI send support",
> +        .flags = {
> +            {.fw = FEAT_HV_RECOMM_EAX,
> +             .bits = HV_CLUSTER_IPI_RECOMMENDED |
> +             HV_EX_PROCESSOR_MASKS_RECOMMENDED},
> +            {0}
> +        }
> +    },
> +};
> +
> +static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max)
> +{
> +    struct kvm_cpuid2 *cpuid;
> +    int r, size;
> +
> +    size = sizeof(*cpuid) + max * sizeof(*cpuid->entries);
> +    cpuid = g_malloc0(size);
> +    cpuid->nent = max;
> +
> +    r = kvm_vcpu_ioctl(cs, KVM_GET_SUPPORTED_HV_CPUID, cpuid);
> +    if (r == 0 && cpuid->nent >= max) {
> +        r = -E2BIG;
> +    }
> +    if (r < 0) {
> +        if (r == -E2BIG) {
> +            g_free(cpuid);
> +            return NULL;
> +        } else {
> +            fprintf(stderr, "KVM_GET_SUPPORTED_HV_CPUID failed: %s\n",
> +                    strerror(-r));
> +            exit(1);
> +        }
> +    }
> +    return cpuid;
> +}
> +
> +/*
> + * Run KVM_GET_SUPPORTED_HV_CPUID ioctl(), allocating a buffer large enough
> + * for all entries.
> + */
> +static struct kvm_cpuid2 *get_supported_hv_cpuid(CPUState *cs)
> +{
> +    struct kvm_cpuid2 *cpuid;
> +    int max = 7; /* 0x40000000..0x40000005, 0x4000000A */
> +
> +    while ((cpuid = try_get_hv_cpuid(cs, max)) == NULL) {
> +        max++;
> +    }

If you didn't drop kernel-provided cpuid->nent on the floor in the
previous function you wouldn't need to iterate more than once.

> +    return cpuid;
> +}
> +
> +/*
> + * When KVM_GET_SUPPORTED_HV_CPUID is not supported we fill CPUID feature
> + * leaves from KVM_CAP_HYPERV* and present MSRs data.
> + */
> +static struct kvm_cpuid2 *get_supported_hv_cpuid_legacy(CPUState *cs)
>  {
>      X86CPU *cpu = X86_CPU(cs);
> -    CPUX86State *env = &cpu->env;
> +    struct kvm_cpuid2 *cpuid;
> +    struct kvm_cpuid_entry2 *entry_feat, *entry_recomm;
> +
> +    /* HV_CPUID_FEATURES, HV_CPUID_ENLIGHTMENT_INFO */
> +    cpuid = g_malloc0(sizeof(*cpuid) + 2 * sizeof(*cpuid->entries));
> +    cpuid->nent = 2;
> +
> +    /* HV_CPUID_VENDOR_AND_MAX_FUNCTIONS */
> +    entry_feat = &cpuid->entries[0];
> +    entry_feat->function = HV_CPUID_FEATURES;
> +
> +    entry_recomm = &cpuid->entries[1];
> +    entry_recomm->function = HV_CPUID_ENLIGHTMENT_INFO;
>  
> -    if (cpu->hyperv_relaxed_timing) {
> -        env->features[FEAT_HYPERV_EAX] |= HV_HYPERCALL_AVAILABLE;
> +    if (kvm_check_extension(cs->kvm_state, KVM_CAP_HYPERV) > 0) {
> +        entry_feat->eax |= HV_HYPERCALL_AVAILABLE;
> +        entry_feat->eax |= HV_APIC_ACCESS_AVAILABLE;
> +        entry_feat->edx |= HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE;
> +        entry_recomm->eax |= HV_RELAXED_TIMING_RECOMMENDED;
> +        entry_recomm->eax |= HV_APIC_ACCESS_RECOMMENDED;
>      }
> -    if (cpu->hyperv_vapic) {
> -        env->features[FEAT_HYPERV_EAX] |= HV_HYPERCALL_AVAILABLE;
> -        env->features[FEAT_HYPERV_EAX] |= HV_APIC_ACCESS_AVAILABLE;
> +
> +    if (kvm_check_extension(cs->kvm_state, KVM_CAP_HYPERV_TIME) > 0) {
> +        entry_feat->eax |= HV_TIME_REF_COUNT_AVAILABLE;
> +        entry_feat->eax |= HV_REFERENCE_TSC_AVAILABLE;
>      }
> -    if (cpu->hyperv_time) {
> -        if (kvm_check_extension(cs->kvm_state, KVM_CAP_HYPERV_TIME) <= 0) {
> -            fprintf(stderr, "Hyper-V clocksources "
> -                    "(requested by 'hv-time' cpu flag) "
> -                    "are not supported by kernel\n");
> -            return -ENOSYS;
> -        }
> -        env->features[FEAT_HYPERV_EAX] |= HV_HYPERCALL_AVAILABLE;
> -        env->features[FEAT_HYPERV_EAX] |= HV_TIME_REF_COUNT_AVAILABLE;
> -        env->features[FEAT_HYPERV_EAX] |= HV_REFERENCE_TSC_AVAILABLE;
> +
> +    if (has_msr_hv_frequencies) {
> +        entry_feat->eax |= HV_ACCESS_FREQUENCY_MSRS;
> +        entry_feat->edx |= HV_FREQUENCY_MSRS_AVAILABLE;
>      }
> -    if (cpu->hyperv_frequencies) {
> -        if (!has_msr_hv_frequencies) {
> -            fprintf(stderr, "Hyper-V frequency MSRs "
> -                    "(requested by 'hv-frequencies' cpu flag) "
> -                    "are not supported by kernel\n");
> -            return -ENOSYS;
> -        }
> -        env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_FREQUENCY_MSRS;
> -        env->features[FEAT_HYPERV_EDX] |= HV_FREQUENCY_MSRS_AVAILABLE;
> +
> +    if (has_msr_hv_crash) {
> +        entry_feat->edx |= HV_GUEST_CRASH_MSR_AVAILABLE;
>      }
> -    if (cpu->hyperv_crash) {
> -        if (!has_msr_hv_crash) {
> -            fprintf(stderr, "Hyper-V crash MSRs "
> -                    "(requested by 'hv-crash' cpu flag) "
> -                    "are not supported by kernel\n");
> -            return -ENOSYS;
> -        }
> -        env->features[FEAT_HYPERV_EDX] |= HV_GUEST_CRASH_MSR_AVAILABLE;
> +
> +    if (has_msr_hv_reenlightenment) {
> +        entry_feat->eax |= HV_ACCESS_REENLIGHTENMENTS_CONTROL;
>      }
> -    if (cpu->hyperv_reenlightenment) {
> -        if (!has_msr_hv_reenlightenment) {
> -            fprintf(stderr,
> -                    "Hyper-V Reenlightenment MSRs "
> -                    "(requested by 'hv-reenlightenment' cpu flag) "
> -                    "are not supported by kernel\n");
> -            return -ENOSYS;
> -        }
> -        env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_REENLIGHTENMENTS_CONTROL;
> +
> +    if (has_msr_hv_reset) {
> +        entry_feat->eax |= HV_RESET_AVAILABLE;
>      }
> -    env->features[FEAT_HYPERV_EDX] |= HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE;
> -    if (cpu->hyperv_reset) {
> -        if (!has_msr_hv_reset) {
> -            fprintf(stderr, "Hyper-V reset MSR "
> -                    "(requested by 'hv-reset' cpu flag) "
> -                    "is not supported by kernel\n");
> -            return -ENOSYS;
> -        }
> -        env->features[FEAT_HYPERV_EAX] |= HV_RESET_AVAILABLE;
> +
> +    if (has_msr_hv_vpindex) {
> +        entry_feat->eax |= HV_VP_INDEX_AVAILABLE;
>      }
> -    if (cpu->hyperv_vpindex) {
> -        if (!has_msr_hv_vpindex) {
> -            fprintf(stderr, "Hyper-V VP_INDEX MSR "
> -                    "(requested by 'hv-vpindex' cpu flag) "
> -                    "is not supported by kernel\n");
> -            return -ENOSYS;
> -        }
> -        env->features[FEAT_HYPERV_EAX] |= HV_VP_INDEX_AVAILABLE;
> +
> +    if (has_msr_hv_runtime) {
> +        entry_feat->eax |= HV_VP_RUNTIME_AVAILABLE;
>      }
> -    if (cpu->hyperv_runtime) {
> -        if (!has_msr_hv_runtime) {
> -            fprintf(stderr, "Hyper-V VP_RUNTIME MSR "
> -                    "(requested by 'hv-runtime' cpu flag) "
> -                    "is not supported by kernel\n");
> -            return -ENOSYS;
> +
> +    if (has_msr_hv_synic) {
> +        unsigned int cap = cpu->hyperv_synic_kvm_only ?
> +            KVM_CAP_HYPERV_SYNIC : KVM_CAP_HYPERV_SYNIC2;
> +
> +        if (kvm_check_extension(cs->kvm_state, cap) > 0) {
> +            entry_feat->eax |= HV_SYNIC_AVAILABLE;
>          }
> -        env->features[FEAT_HYPERV_EAX] |= HV_VP_RUNTIME_AVAILABLE;
>      }
> -    if (cpu->hyperv_synic) {
> -        unsigned int cap = KVM_CAP_HYPERV_SYNIC;
> -        if (!cpu->hyperv_synic_kvm_only) {
> -            if (!cpu->hyperv_vpindex) {
> -                fprintf(stderr, "Hyper-V SynIC "
> -                        "(requested by 'hv-synic' cpu flag) "
> -                        "requires Hyper-V VP_INDEX ('hv-vpindex')\n");
> -            return -ENOSYS;
> -            }
> -            cap = KVM_CAP_HYPERV_SYNIC2;
> -        }
>  
> -        if (!has_msr_hv_synic || !kvm_check_extension(cs->kvm_state, cap)) {
> -            fprintf(stderr, "Hyper-V SynIC (requested by 'hv-synic' cpu flag) "
> -                    "is not supported by kernel\n");
> -            return -ENOSYS;
> -        }
> +    if (has_msr_hv_stimer) {
> +        entry_feat->eax |= HV_SYNTIMERS_AVAILABLE;
> +    }
>  
> -        env->features[FEAT_HYPERV_EAX] |= HV_SYNIC_AVAILABLE;
> +    if (kvm_check_extension(cs->kvm_state,
> +                            KVM_CAP_HYPERV_TLBFLUSH) > 0) {
> +        entry_recomm->eax |= HV_REMOTE_TLB_FLUSH_RECOMMENDED;
> +        entry_recomm->eax |= HV_EX_PROCESSOR_MASKS_RECOMMENDED;
>      }
> -    if (cpu->hyperv_stimer) {
> -        if (!has_msr_hv_stimer) {
> -            fprintf(stderr, "Hyper-V timers aren't supported by kernel\n");
> -            return -ENOSYS;
> -        }
> -        env->features[FEAT_HYPERV_EAX] |= HV_SYNTIMERS_AVAILABLE;
> +
> +    if (kvm_check_extension(cs->kvm_state,
> +                            KVM_CAP_HYPERV_SEND_IPI) > 0) {
> +        entry_recomm->eax |= HV_CLUSTER_IPI_RECOMMENDED;
> +        entry_recomm->eax |= HV_EX_PROCESSOR_MASKS_RECOMMENDED;
>      }
> -    if (cpu->hyperv_relaxed_timing) {
> -        env->features[FEAT_HV_RECOMM_EAX] |= HV_RELAXED_TIMING_RECOMMENDED;
> +
> +    return cpuid;
> +}
> +
> +static int hv_cpuid_get_fw(struct kvm_cpuid2 *cpuid, int fw, uint32_t *r)
> +{
> +    struct kvm_cpuid_entry2 *entry;
> +    uint32_t func;
> +    int reg;
> +
> +    switch (fw) {
> +    case FEAT_HYPERV_EAX:
> +        reg = R_EAX;
> +        func = HV_CPUID_FEATURES;
> +        break;
> +    case FEAT_HYPERV_EDX:
> +        reg = R_EDX;
> +        func = HV_CPUID_FEATURES;
> +        break;
> +    case FEAT_HV_RECOMM_EAX:
> +        reg = R_EAX;
> +        func = HV_CPUID_ENLIGHTMENT_INFO;
> +        break;
> +    default:
> +        return -EINVAL;
>      }
> -    if (cpu->hyperv_vapic) {
> -        env->features[FEAT_HV_RECOMM_EAX] |= HV_APIC_ACCESS_RECOMMENDED;
> -    }
> -    if (cpu->hyperv_tlbflush) {
> -        if (kvm_check_extension(cs->kvm_state,
> -                                KVM_CAP_HYPERV_TLBFLUSH) <= 0) {
> -            fprintf(stderr, "Hyper-V TLB flush support "
> -                    "(requested by 'hv-tlbflush' cpu flag) "
> -                    " is not supported by kernel\n");
> -            return -ENOSYS;
> -        }
> -        env->features[FEAT_HV_RECOMM_EAX] |= HV_REMOTE_TLB_FLUSH_RECOMMENDED;
> -        env->features[FEAT_HV_RECOMM_EAX] |= HV_EX_PROCESSOR_MASKS_RECOMMENDED;
> +
> +    entry = cpuid_find_entry(cpuid, func, 0);
> +    if (!entry) {
> +        return -ENOENT;
>      }
> -    if (cpu->hyperv_ipi) {
> -        if (kvm_check_extension(cs->kvm_state,
> -                                KVM_CAP_HYPERV_SEND_IPI) <= 0) {
> -            fprintf(stderr, "Hyper-V IPI send support "
> -                    "(requested by 'hv-ipi' cpu flag) "
> -                    " is not supported by kernel\n");
> -            return -ENOSYS;
> +
> +    switch (reg) {
> +    case R_EAX:
> +        *r = entry->eax;
> +        break;
> +    case R_EDX:
> +        *r = entry->edx;
> +        break;
> +    default:
> +        return -EINVAL;
> +    }
> +
> +    return 0;
> +}
> +
> +static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
> +                                  const char *name, bool flag)
> +{
> +    X86CPU *cpu = X86_CPU(cs);
> +    CPUX86State *env = &cpu->env;
> +    uint32_t r, fw, bits;;
> +    int i, j;
> +
> +    if (!flag) {
> +        return 0;
> +    }
> +
> +    for (i = 0; i < ARRAY_SIZE(kvm_hyperv_properties); i++) {
> +        if (strcmp(kvm_hyperv_properties[i].name, name)) {
> +            continue;
>          }
> -        env->features[FEAT_HV_RECOMM_EAX] |= HV_CLUSTER_IPI_RECOMMENDED;
> -        env->features[FEAT_HV_RECOMM_EAX] |= HV_EX_PROCESSOR_MASKS_RECOMMENDED;
> +
> +        for (j = 0; j < ARRAY_SIZE(kvm_hyperv_properties[i].flags); j++) {
> +            fw = kvm_hyperv_properties[i].flags[j].fw;
> +            bits = kvm_hyperv_properties[i].flags[j].bits;
> +
> +            if (!fw) {
> +                continue;
> +            }
> +
> +            if (hv_cpuid_get_fw(cpuid, fw, &r) || (r & bits) != bits) {
> +                fprintf(stderr,
> +                        "Hyper-V %s (requested by '%s' cpu flag) "
> +                        "is not supported by kernel\n",
> +                        kvm_hyperv_properties[i].desc,
> +                        kvm_hyperv_properties[i].name);
> +                return 1;
> +            }
> +
> +            env->features[fw] |= bits;
> +        }
> +
> +        return 0;
>      }
> +
> +    /* the requested feature is undefined in kvm_hyperv_properties */
> +    return 1;
> +}
> +
> +static int hyperv_handle_properties(CPUState *cs)
> +{
> +    X86CPU *cpu = X86_CPU(cs);
> +    CPUX86State *env = &cpu->env;
> +    struct kvm_cpuid2 *cpuid;
> +    int r = 0;
> +
>      if (cpu->hyperv_evmcs) {
>          uint16_t evmcs_version;
>  
> @@ -849,7 +1068,45 @@ static int hyperv_handle_properties(CPUState *cs)
>          env->features[FEAT_HV_NESTED_EAX] = evmcs_version;
>      }
>  
> -    return 0;
> +    if (kvm_check_extension(cs->kvm_state, KVM_CAP_HYPERV_CPUID) > 0) {
> +        cpuid = get_supported_hv_cpuid(cs);
> +    } else {
> +        cpuid = get_supported_hv_cpuid_legacy(cs);
> +    }
> +
> +    /* Features */
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-relaxed",
> +                                cpu->hyperv_relaxed_timing);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-vapic", cpu->hyperv_vapic);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-time", cpu->hyperv_time);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-frequencies",
> +                                cpu->hyperv_frequencies);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-crash", cpu->hyperv_crash);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-reenlightenment",
> +                                cpu->hyperv_reenlightenment);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-reset", cpu->hyperv_reset);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-vpindex", cpu->hyperv_vpindex);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-runtime", cpu->hyperv_runtime);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-synic", cpu->hyperv_synic);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-stimer", cpu->hyperv_stimer);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-tlbflush", cpu->hyperv_tlbflush);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-ipi", cpu->hyperv_ipi);

So this duplicates the link between the properties and their names,
originally established by DEFINE_PROP_BOOL(...).  Not good.

The property names are also duplicated in your kvm_hyperv_properties
array.  Not good either.

I'm wondering if a better solution could be to replace the boolean
properties with bit properties within a single 64bit word, and to have
the bit position of property be also the index into the property
descriptor array.  This should allow to DRY when defining the cpu
properties and then validating for their presence.  Besides this will
allow to easily express their interdependencies: every property will
just get an extra field with the mask of prerequisite properties.

I'll see if I can prototype something along these lines early next week
unless you beat me to it.

Anyway this mess that has accumulated around hyperv-related properties
is in need for a spring cleanup since long; thanks a lot for looking
into it!

Roman.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 3/8] i386/kvm: document existing Hyper-V enlightenments
@ 2019-04-05 13:19     ` Roman Kagan
  0 siblings, 0 replies; 10+ messages in thread
From: Roman Kagan @ 2019-04-05 13:19 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: qemu-devel, Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	Marcelo Tosatti, Dr . David Alan Gilbert,
	Daniel P . Berrangé

On Fri, Mar 29, 2019 at 03:18:27PM +0100, Vitaly Kuznetsov wrote:
> Currently, there is no doc describing hv-* CPU flags, people are
> encouraged to get the information from Microsoft Hyper-V Top Level
> Functional specification (TLFS). There is, however, a bit of QEMU
> specifics.

This is appreciated a lot, thanks for doing this!

> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  docs/hyperv.txt | 180 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 180 insertions(+)
>  create mode 100644 docs/hyperv.txt
> 
> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
> new file mode 100644
> index 0000000000..397f2517b8
> --- /dev/null
> +++ b/docs/hyperv.txt
> @@ -0,0 +1,180 @@
> +Hyper-V Enlightenments
> +======================
> +
> +
> +1. Description
> +===============
> +In some cases when implementing a hardware interface in software is slow, KVM
> +implements its own paravirtualized interfaces. This works well for Linux as
> +guest support for such features is added simultaneously with the feature itself.
> +It may, however, be hard-to-impossible to add support for these interfaces to
> +proprietary OSes, namely, Microsoft Windows.
> +
> +KVM on x86 implements Hyper-V Enlightenments for Windows guests. These features
> +make Windows and Hyper-V guests think they're running on top of a Hyper-V
> +compatible hypervisor and use Hyper-V specific features.
> +
> +
> +2. Setup
> +=========
> +No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
> +QEMU, individual enlightenments can be enabled through CPU flags, e.g:
> +
> +  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
> +
> +Sometimes there are dependencies between enlightenments, QEMU is supposed to
> +check that the supplied configuration is sane.
> +
> +When any set of the Hyper-V enlightenments is enabled, QEMU changes hypervisor
> +identification (CPUID 0x40000000..0x4000000A) to Hyper-V. KVM identification
> +and features are kept in leaves 0x40000100..0x40000101.
> +
> +
> +3. Existing enlightenments
> +===========================
> +
> +3.1. hv-relaxed
> +================
> +This feature tells guest OS to disable watchdog timeouts as it is running on a
> +hypervisor. It is known that some Windows versions will do this even when they
> +see 'hypervisor' CPU flag.
> +
> +3.2. hv-vapic
> +==============
> +Provides so-called VP Assist page MSR to guest allowing it to work with APIC
> +more efficiently. In particular, this enlightenment allows paravirtualized
> +(exit-less) EOI processing.
> +
> +3.3. hv-spinlocks=xxx
> +======================
> +Enables paravirtualized spinlocks. The parameter indicates how many times
> +spinlock acquisition should be attempted before indicating the situation to the
> +hypervisor. A special value 0xffffffff indicates "never to retry".
> +
> +3.4. hv-vpindex
> +================
> +Provides HV_X64_MSR_VP_INDEX (0x40000002) MSR to the guest which has Virtual
> +processor index information. This enlightenment makes sense in conjunction with
> +hv-synic, hv-stimer and other enlightenments which require the guest to know its
> +Virtual Processor indices (e.g. when VP index needs to be passed in a
> +hypercall).
> +
> +3.5. hv-runtime
> +================
> +Provides HV_X64_MSR_VP_RUNTIME (0x40000010) MSR to the guest. The MSR keeps the
> +virtual processor run time in 100ns units. This gives guest operating system an
> +idea of how much time was 'stolen' from it (when the virtual CPU was preempted
> +to perform some other work).
> +
> +3.6. hv-crash
> +==============
> +Provides HV_X64_MSR_CRASH_P0..HV_X64_MSR_CRASH_P5 (0x40000100..0x40000105) and
> +HV_X64_MSR_CRASH_CTL (0x40000105) MSRs to the guest. These MSRs are written to
> +by the guest when it crashes, HV_X64_MSR_CRASH_P0..HV_X64_MSR_CRASH_P5 MSRs
> +contain additional crash information. This information is outputted in QEMU log
> +and through QAPI.
> +Note: unlike under genuine Hyper-V, write to HV_X64_MSR_CRASH_CTL causes guest
> +to shutdown. This effectively blocks crash dump generation by Windows.

Hmm, why?

> +
> +3.7. hv-time
> +=============
> +Enables two Hyper-V-specific clocksources available to the guest: MSR-based
> +Hyper-V clocksource (HV_X64_MSR_TIME_REF_COUNT, 0x40000020) and Reference TSC
> +page (enabled via MSR HV_X64_MSR_REFERENCE_TSC, 0x40000021). Both clocksources
> +are per-guest, Reference TSC page clocksource allows for exit-less time stamp
> +readings. Using this enlightenment leads to significant speedup of all timestamp
> +related operations.
> +
> +3.8. hv-synic
> +==============
> +Enables Hyper-V Synthetic interrupt controller - an extension of a local APIC.
> +When enabled, this enlightenment provides additional communication facilities
> +to the guest: SynIC messages and Events. This is a pre-requisite for
> +implementing VMBus devices (not yet in QEMU). Additionally, this enlightenment
> +is needed to enable Hyper-V synthetic timers. SynIC is controlled through MSRs
> +HV_X64_MSR_SCONTROL..HV_X64_MSR_EOM (0x40000080..0x40000084) and
> +HV_X64_MSR_SINT0..HV_X64_MSR_SINT15 (0x40000090..0x4000009F)
> +
> +Requires: hv-vpindex
> +
> +3.9. hv-stimer
> +===============
> +Enables Hyper-V synthetic timers. There are four synthetic timers per virtual
> +CPU controlled through HV_X64_MSR_STIMER0_CONFIG..HV_X64_MSR_STIMER3_COUNT
> +(0x400000B0..0x400000B7) MSRs. These timers can work either in single-shot or
> +periodic mode. It is known that certain Windows versions revert to using RTC
> +extensively when this enlightenment is not provided; this leads to significant
> +CPU consumption, even when virtual CPU is idle.

I think it'll rather use HPET if available.  I'm also not sure the idle
vCPU scenario was one that motivated for implementing paravirtualized
timers.

> +
> +Requires: hv-vpindex, hv-synic, hv-time
> +
> +3.10. hv-tlbflush
> +==================
> +Enables paravirtualized TLB shoot-down mechanism. On x86 architecture, remote
> +TLB flush procedure requires sending IPIs and waiting for other CPUs to perform
> +local TLB flush. In virtualized environment some virtual CPUs may not even be
> +scheduled at the time of the call and may not require flushing (or, flushing
> +may be postponed until the virtual CPU is scheduled). hv-tlbflush enlightenment
> +implements TLB shoot-down through hypervisor enabling the optimization.
> +
> +Requires: hv-vpindex
> +
> +3.11. hv-ipi
> +=============
> +Enables paravirtualized IPI send mechanism. HvCallSendSyntheticClusterIpi
> +hypercall may target more than 64 virtual CPUs simultaneously, doing the same
> +through APIC requires more than one access (and thus exit to the hypervisor).
> +
> +Requires: hv-vpindex
> +
> +3.12. hv-vendor-id=xxx
> +=======================
> +This changes Hyper-V identification in CPUID 0x40000000.EBX-EDX from the default
> +"Microsoft Hv". The parameter should be no longer than 12 characters. According
> +to the specification, guests shouldn't use this information and it is unknown
> +if there is a Windows version which acts differently.
> +Note: hv-vendor-id is not an enlightenment and thus doesn't enable Hyper-V
> +identification when specified without some other enlightenment.
> +
> +3.13. hv-reset
> +===============
> +Provides HV_X64_MSR_RESET (0x40000003) MSR to the guest allowing it to reset
> +itself by writing to it. Even when this MSR is enabled, it is not a recommended
> +way for Windows to perform system reboot and thus it may not be used.
> +
> +3.14. hv-frequencies
> +============================================
> +Provides HV_X64_MSR_TSC_FREQUENCY (0x40000022) and HV_X64_MSR_APIC_FREQUENCY
> +(0x40000023) allowing the guest to get its TSC/APIC frequencies without doing
> +measurements.
> +
> +3.15 hv-reenlightenment
> +========================
> +The enlightenment is nested specific, it targets Hyper-V on KVM guests. When
> +enabled, it provides HV_X64_MSR_REENLIGHTENMENT_CONTROL (0x40000106),
> +HV_X64_MSR_TSC_EMULATION_CONTROL (0x40000107)and HV_X64_MSR_TSC_EMULATION_STATUS
> +(0x40000108) MSRs allowing the guest to get notified when TSC frequency changes
> +(only happens on migration) and keep using old frequency (through emulation in
> +the hypervisor) until it is ready to switch to the new one. This, in conjunction
> +with hv-frequencies, allows Hyper-V on KVM to pass stable clocksource (Reference
> +TSC page) to its own guests.
> +
> +Recommended: hv-frequencies
> +
> +3.16. hv-evmcs
> +===============
> +The enlightenment is nested specific, it targets Hyper-V on KVM guests. When
> +enabled, it provides Enlightened VMCS feature to the guest. The feature
> +implements paravirtualized protocol between L0 (KVM) and L1 (Hyper-V)
> +hypervisors making L2 exits to the hypervisor faster. The feature is Intel-only.
> +Note: some virtualization features (e.g. Posted Interrupts) are disabled when
> +hv-evmcs is enabled. It may make sense to measure your nested workload with and
> +without the feature to find out if enabling it is beneficial.
> +
> +Requires: hv-vapic
> +
> +
> +4. Useful links
> +================
> +Hyper-V Top Level Functional specification and other information:
> +https://github.com/MicrosoftDocs/Virtualization-Documentation
> -- 
> 2.20.1
> 

Thanks,
Roman.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 3/8] i386/kvm: document existing Hyper-V enlightenments
@ 2019-04-05 13:19     ` Roman Kagan
  0 siblings, 0 replies; 10+ messages in thread
From: Roman Kagan @ 2019-04-05 13:19 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Eduardo Habkost, Marcelo Tosatti, Dr . David Alan Gilbert,
	qemu-devel, Paolo Bonzini, Richard Henderson

On Fri, Mar 29, 2019 at 03:18:27PM +0100, Vitaly Kuznetsov wrote:
> Currently, there is no doc describing hv-* CPU flags, people are
> encouraged to get the information from Microsoft Hyper-V Top Level
> Functional specification (TLFS). There is, however, a bit of QEMU
> specifics.

This is appreciated a lot, thanks for doing this!

> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  docs/hyperv.txt | 180 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 180 insertions(+)
>  create mode 100644 docs/hyperv.txt
> 
> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
> new file mode 100644
> index 0000000000..397f2517b8
> --- /dev/null
> +++ b/docs/hyperv.txt
> @@ -0,0 +1,180 @@
> +Hyper-V Enlightenments
> +======================
> +
> +
> +1. Description
> +===============
> +In some cases when implementing a hardware interface in software is slow, KVM
> +implements its own paravirtualized interfaces. This works well for Linux as
> +guest support for such features is added simultaneously with the feature itself.
> +It may, however, be hard-to-impossible to add support for these interfaces to
> +proprietary OSes, namely, Microsoft Windows.
> +
> +KVM on x86 implements Hyper-V Enlightenments for Windows guests. These features
> +make Windows and Hyper-V guests think they're running on top of a Hyper-V
> +compatible hypervisor and use Hyper-V specific features.
> +
> +
> +2. Setup
> +=========
> +No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
> +QEMU, individual enlightenments can be enabled through CPU flags, e.g:
> +
> +  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
> +
> +Sometimes there are dependencies between enlightenments, QEMU is supposed to
> +check that the supplied configuration is sane.
> +
> +When any set of the Hyper-V enlightenments is enabled, QEMU changes hypervisor
> +identification (CPUID 0x40000000..0x4000000A) to Hyper-V. KVM identification
> +and features are kept in leaves 0x40000100..0x40000101.
> +
> +
> +3. Existing enlightenments
> +===========================
> +
> +3.1. hv-relaxed
> +================
> +This feature tells guest OS to disable watchdog timeouts as it is running on a
> +hypervisor. It is known that some Windows versions will do this even when they
> +see 'hypervisor' CPU flag.
> +
> +3.2. hv-vapic
> +==============
> +Provides so-called VP Assist page MSR to guest allowing it to work with APIC
> +more efficiently. In particular, this enlightenment allows paravirtualized
> +(exit-less) EOI processing.
> +
> +3.3. hv-spinlocks=xxx
> +======================
> +Enables paravirtualized spinlocks. The parameter indicates how many times
> +spinlock acquisition should be attempted before indicating the situation to the
> +hypervisor. A special value 0xffffffff indicates "never to retry".
> +
> +3.4. hv-vpindex
> +================
> +Provides HV_X64_MSR_VP_INDEX (0x40000002) MSR to the guest which has Virtual
> +processor index information. This enlightenment makes sense in conjunction with
> +hv-synic, hv-stimer and other enlightenments which require the guest to know its
> +Virtual Processor indices (e.g. when VP index needs to be passed in a
> +hypercall).
> +
> +3.5. hv-runtime
> +================
> +Provides HV_X64_MSR_VP_RUNTIME (0x40000010) MSR to the guest. The MSR keeps the
> +virtual processor run time in 100ns units. This gives guest operating system an
> +idea of how much time was 'stolen' from it (when the virtual CPU was preempted
> +to perform some other work).
> +
> +3.6. hv-crash
> +==============
> +Provides HV_X64_MSR_CRASH_P0..HV_X64_MSR_CRASH_P5 (0x40000100..0x40000105) and
> +HV_X64_MSR_CRASH_CTL (0x40000105) MSRs to the guest. These MSRs are written to
> +by the guest when it crashes, HV_X64_MSR_CRASH_P0..HV_X64_MSR_CRASH_P5 MSRs
> +contain additional crash information. This information is outputted in QEMU log
> +and through QAPI.
> +Note: unlike under genuine Hyper-V, write to HV_X64_MSR_CRASH_CTL causes guest
> +to shutdown. This effectively blocks crash dump generation by Windows.

Hmm, why?

> +
> +3.7. hv-time
> +=============
> +Enables two Hyper-V-specific clocksources available to the guest: MSR-based
> +Hyper-V clocksource (HV_X64_MSR_TIME_REF_COUNT, 0x40000020) and Reference TSC
> +page (enabled via MSR HV_X64_MSR_REFERENCE_TSC, 0x40000021). Both clocksources
> +are per-guest, Reference TSC page clocksource allows for exit-less time stamp
> +readings. Using this enlightenment leads to significant speedup of all timestamp
> +related operations.
> +
> +3.8. hv-synic
> +==============
> +Enables Hyper-V Synthetic interrupt controller - an extension of a local APIC.
> +When enabled, this enlightenment provides additional communication facilities
> +to the guest: SynIC messages and Events. This is a pre-requisite for
> +implementing VMBus devices (not yet in QEMU). Additionally, this enlightenment
> +is needed to enable Hyper-V synthetic timers. SynIC is controlled through MSRs
> +HV_X64_MSR_SCONTROL..HV_X64_MSR_EOM (0x40000080..0x40000084) and
> +HV_X64_MSR_SINT0..HV_X64_MSR_SINT15 (0x40000090..0x4000009F)
> +
> +Requires: hv-vpindex
> +
> +3.9. hv-stimer
> +===============
> +Enables Hyper-V synthetic timers. There are four synthetic timers per virtual
> +CPU controlled through HV_X64_MSR_STIMER0_CONFIG..HV_X64_MSR_STIMER3_COUNT
> +(0x400000B0..0x400000B7) MSRs. These timers can work either in single-shot or
> +periodic mode. It is known that certain Windows versions revert to using RTC
> +extensively when this enlightenment is not provided; this leads to significant
> +CPU consumption, even when virtual CPU is idle.

I think it'll rather use HPET if available.  I'm also not sure the idle
vCPU scenario was one that motivated for implementing paravirtualized
timers.

> +
> +Requires: hv-vpindex, hv-synic, hv-time
> +
> +3.10. hv-tlbflush
> +==================
> +Enables paravirtualized TLB shoot-down mechanism. On x86 architecture, remote
> +TLB flush procedure requires sending IPIs and waiting for other CPUs to perform
> +local TLB flush. In virtualized environment some virtual CPUs may not even be
> +scheduled at the time of the call and may not require flushing (or, flushing
> +may be postponed until the virtual CPU is scheduled). hv-tlbflush enlightenment
> +implements TLB shoot-down through hypervisor enabling the optimization.
> +
> +Requires: hv-vpindex
> +
> +3.11. hv-ipi
> +=============
> +Enables paravirtualized IPI send mechanism. HvCallSendSyntheticClusterIpi
> +hypercall may target more than 64 virtual CPUs simultaneously, doing the same
> +through APIC requires more than one access (and thus exit to the hypervisor).
> +
> +Requires: hv-vpindex
> +
> +3.12. hv-vendor-id=xxx
> +=======================
> +This changes Hyper-V identification in CPUID 0x40000000.EBX-EDX from the default
> +"Microsoft Hv". The parameter should be no longer than 12 characters. According
> +to the specification, guests shouldn't use this information and it is unknown
> +if there is a Windows version which acts differently.
> +Note: hv-vendor-id is not an enlightenment and thus doesn't enable Hyper-V
> +identification when specified without some other enlightenment.
> +
> +3.13. hv-reset
> +===============
> +Provides HV_X64_MSR_RESET (0x40000003) MSR to the guest allowing it to reset
> +itself by writing to it. Even when this MSR is enabled, it is not a recommended
> +way for Windows to perform system reboot and thus it may not be used.
> +
> +3.14. hv-frequencies
> +============================================
> +Provides HV_X64_MSR_TSC_FREQUENCY (0x40000022) and HV_X64_MSR_APIC_FREQUENCY
> +(0x40000023) allowing the guest to get its TSC/APIC frequencies without doing
> +measurements.
> +
> +3.15 hv-reenlightenment
> +========================
> +The enlightenment is nested specific, it targets Hyper-V on KVM guests. When
> +enabled, it provides HV_X64_MSR_REENLIGHTENMENT_CONTROL (0x40000106),
> +HV_X64_MSR_TSC_EMULATION_CONTROL (0x40000107)and HV_X64_MSR_TSC_EMULATION_STATUS
> +(0x40000108) MSRs allowing the guest to get notified when TSC frequency changes
> +(only happens on migration) and keep using old frequency (through emulation in
> +the hypervisor) until it is ready to switch to the new one. This, in conjunction
> +with hv-frequencies, allows Hyper-V on KVM to pass stable clocksource (Reference
> +TSC page) to its own guests.
> +
> +Recommended: hv-frequencies
> +
> +3.16. hv-evmcs
> +===============
> +The enlightenment is nested specific, it targets Hyper-V on KVM guests. When
> +enabled, it provides Enlightened VMCS feature to the guest. The feature
> +implements paravirtualized protocol between L0 (KVM) and L1 (Hyper-V)
> +hypervisors making L2 exits to the hypervisor faster. The feature is Intel-only.
> +Note: some virtualization features (e.g. Posted Interrupts) are disabled when
> +hv-evmcs is enabled. It may make sense to measure your nested workload with and
> +without the feature to find out if enabling it is beneficial.
> +
> +Requires: hv-vapic
> +
> +
> +4. Useful links
> +================
> +Hyper-V Top Level Functional specification and other information:
> +https://github.com/MicrosoftDocs/Virtualization-Documentation
> -- 
> 2.20.1
> 

Thanks,
Roman.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 3/8] i386/kvm: document existing Hyper-V enlightenments
@ 2019-04-05 14:29       ` Vitaly Kuznetsov
  0 siblings, 0 replies; 10+ messages in thread
From: Vitaly Kuznetsov @ 2019-04-05 14:29 UTC (permalink / raw)
  To: Roman Kagan
  Cc: qemu-devel, Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	Marcelo Tosatti, Dr . David Alan Gilbert,
	Daniel P . Berrangé

Roman Kagan <rkagan@virtuozzo.com> writes:

> On Fri, Mar 29, 2019 at 03:18:27PM +0100, Vitaly Kuznetsov wrote:
>> Currently, there is no doc describing hv-* CPU flags, people are
>> encouraged to get the information from Microsoft Hyper-V Top Level
>> Functional specification (TLFS). There is, however, a bit of QEMU
>> specifics.
>
> This is appreciated a lot, thanks for doing this!
>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  docs/hyperv.txt | 180 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 180 insertions(+)
>>  create mode 100644 docs/hyperv.txt
>> 
>> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
>> new file mode 100644
>> index 0000000000..397f2517b8
>> --- /dev/null
>> +++ b/docs/hyperv.txt
>> @@ -0,0 +1,180 @@
>> +Hyper-V Enlightenments
>> +======================
>> +
>> +
>> +1. Description
>> +===============
>> +In some cases when implementing a hardware interface in software is slow, KVM
>> +implements its own paravirtualized interfaces. This works well for Linux as
>> +guest support for such features is added simultaneously with the feature itself.
>> +It may, however, be hard-to-impossible to add support for these interfaces to
>> +proprietary OSes, namely, Microsoft Windows.
>> +
>> +KVM on x86 implements Hyper-V Enlightenments for Windows guests. These features
>> +make Windows and Hyper-V guests think they're running on top of a Hyper-V
>> +compatible hypervisor and use Hyper-V specific features.
>> +
>> +
>> +2. Setup
>> +=========
>> +No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
>> +QEMU, individual enlightenments can be enabled through CPU flags, e.g:
>> +
>> +  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
>> +
>> +Sometimes there are dependencies between enlightenments, QEMU is supposed to
>> +check that the supplied configuration is sane.
>> +
>> +When any set of the Hyper-V enlightenments is enabled, QEMU changes hypervisor
>> +identification (CPUID 0x40000000..0x4000000A) to Hyper-V. KVM identification
>> +and features are kept in leaves 0x40000100..0x40000101.
>> +
>> +
>> +3. Existing enlightenments
>> +===========================
>> +
>> +3.1. hv-relaxed
>> +================
>> +This feature tells guest OS to disable watchdog timeouts as it is running on a
>> +hypervisor. It is known that some Windows versions will do this even when they
>> +see 'hypervisor' CPU flag.
>> +
>> +3.2. hv-vapic
>> +==============
>> +Provides so-called VP Assist page MSR to guest allowing it to work with APIC
>> +more efficiently. In particular, this enlightenment allows paravirtualized
>> +(exit-less) EOI processing.
>> +
>> +3.3. hv-spinlocks=xxx
>> +======================
>> +Enables paravirtualized spinlocks. The parameter indicates how many times
>> +spinlock acquisition should be attempted before indicating the situation to the
>> +hypervisor. A special value 0xffffffff indicates "never to retry".
>> +
>> +3.4. hv-vpindex
>> +================
>> +Provides HV_X64_MSR_VP_INDEX (0x40000002) MSR to the guest which has Virtual
>> +processor index information. This enlightenment makes sense in conjunction with
>> +hv-synic, hv-stimer and other enlightenments which require the guest to know its
>> +Virtual Processor indices (e.g. when VP index needs to be passed in a
>> +hypercall).
>> +
>> +3.5. hv-runtime
>> +================
>> +Provides HV_X64_MSR_VP_RUNTIME (0x40000010) MSR to the guest. The MSR keeps the
>> +virtual processor run time in 100ns units. This gives guest operating system an
>> +idea of how much time was 'stolen' from it (when the virtual CPU was preempted
>> +to perform some other work).
>> +
>> +3.6. hv-crash
>> +==============
>> +Provides HV_X64_MSR_CRASH_P0..HV_X64_MSR_CRASH_P5 (0x40000100..0x40000105) and
>> +HV_X64_MSR_CRASH_CTL (0x40000105) MSRs to the guest. These MSRs are written to
>> +by the guest when it crashes, HV_X64_MSR_CRASH_P0..HV_X64_MSR_CRASH_P5 MSRs
>> +contain additional crash information. This information is outputted in QEMU log
>> +and through QAPI.
>> +Note: unlike under genuine Hyper-V, write to HV_X64_MSR_CRASH_CTL causes guest
>> +to shutdown. This effectively blocks crash dump generation by Windows.
>
> Hmm, why?
>

This was written completely out of top of my head but I was under an
impression that writing to HV_X64_MSR_CRASH_CTL causes Qemu to shutdown
the guest and Windows does this before it creates crash dump. Am I
mistaken? I can be)

>> +
>> +3.7. hv-time
>> +=============
>> +Enables two Hyper-V-specific clocksources available to the guest: MSR-based
>> +Hyper-V clocksource (HV_X64_MSR_TIME_REF_COUNT, 0x40000020) and Reference TSC
>> +page (enabled via MSR HV_X64_MSR_REFERENCE_TSC, 0x40000021). Both clocksources
>> +are per-guest, Reference TSC page clocksource allows for exit-less time stamp
>> +readings. Using this enlightenment leads to significant speedup of all timestamp
>> +related operations.
>> +
>> +3.8. hv-synic
>> +==============
>> +Enables Hyper-V Synthetic interrupt controller - an extension of a local APIC.
>> +When enabled, this enlightenment provides additional communication facilities
>> +to the guest: SynIC messages and Events. This is a pre-requisite for
>> +implementing VMBus devices (not yet in QEMU). Additionally, this enlightenment
>> +is needed to enable Hyper-V synthetic timers. SynIC is controlled through MSRs
>> +HV_X64_MSR_SCONTROL..HV_X64_MSR_EOM (0x40000080..0x40000084) and
>> +HV_X64_MSR_SINT0..HV_X64_MSR_SINT15 (0x40000090..0x4000009F)
>> +
>> +Requires: hv-vpindex
>> +
>> +3.9. hv-stimer
>> +===============
>> +Enables Hyper-V synthetic timers. There are four synthetic timers per virtual
>> +CPU controlled through HV_X64_MSR_STIMER0_CONFIG..HV_X64_MSR_STIMER3_COUNT
>> +(0x400000B0..0x400000B7) MSRs. These timers can work either in single-shot or
>> +periodic mode. It is known that certain Windows versions revert to using RTC
>> +extensively when this enlightenment is not provided; this leads to significant
>> +CPU consumption, even when virtual CPU is idle.
>
> I think it'll rather use HPET if available.  I'm also not sure the idle
> vCPU scenario was one that motivated for implementing paravirtualized
> timers.

Right but there was an "improvement" from MS in one of their Win10
updates which made things significantly worse so I thought it would be a
good idea to document this user-visible property.

>
>> +
>> +Requires: hv-vpindex, hv-synic, hv-time
>> +
>> +3.10. hv-tlbflush
>> +==================
>> +Enables paravirtualized TLB shoot-down mechanism. On x86 architecture, remote
>> +TLB flush procedure requires sending IPIs and waiting for other CPUs to perform
>> +local TLB flush. In virtualized environment some virtual CPUs may not even be
>> +scheduled at the time of the call and may not require flushing (or, flushing
>> +may be postponed until the virtual CPU is scheduled). hv-tlbflush enlightenment
>> +implements TLB shoot-down through hypervisor enabling the optimization.
>> +
>> +Requires: hv-vpindex
>> +
>> +3.11. hv-ipi
>> +=============
>> +Enables paravirtualized IPI send mechanism. HvCallSendSyntheticClusterIpi
>> +hypercall may target more than 64 virtual CPUs simultaneously, doing the same
>> +through APIC requires more than one access (and thus exit to the hypervisor).
>> +
>> +Requires: hv-vpindex
>> +
>> +3.12. hv-vendor-id=xxx
>> +=======================
>> +This changes Hyper-V identification in CPUID 0x40000000.EBX-EDX from the default
>> +"Microsoft Hv". The parameter should be no longer than 12 characters. According
>> +to the specification, guests shouldn't use this information and it is unknown
>> +if there is a Windows version which acts differently.
>> +Note: hv-vendor-id is not an enlightenment and thus doesn't enable Hyper-V
>> +identification when specified without some other enlightenment.
>> +
>> +3.13. hv-reset
>> +===============
>> +Provides HV_X64_MSR_RESET (0x40000003) MSR to the guest allowing it to reset
>> +itself by writing to it. Even when this MSR is enabled, it is not a recommended
>> +way for Windows to perform system reboot and thus it may not be used.
>> +
>> +3.14. hv-frequencies
>> +============================================
>> +Provides HV_X64_MSR_TSC_FREQUENCY (0x40000022) and HV_X64_MSR_APIC_FREQUENCY
>> +(0x40000023) allowing the guest to get its TSC/APIC frequencies without doing
>> +measurements.
>> +
>> +3.15 hv-reenlightenment
>> +========================
>> +The enlightenment is nested specific, it targets Hyper-V on KVM guests. When
>> +enabled, it provides HV_X64_MSR_REENLIGHTENMENT_CONTROL (0x40000106),
>> +HV_X64_MSR_TSC_EMULATION_CONTROL (0x40000107)and HV_X64_MSR_TSC_EMULATION_STATUS
>> +(0x40000108) MSRs allowing the guest to get notified when TSC frequency changes
>> +(only happens on migration) and keep using old frequency (through emulation in
>> +the hypervisor) until it is ready to switch to the new one. This, in conjunction
>> +with hv-frequencies, allows Hyper-V on KVM to pass stable clocksource (Reference
>> +TSC page) to its own guests.
>> +
>> +Recommended: hv-frequencies
>> +
>> +3.16. hv-evmcs
>> +===============
>> +The enlightenment is nested specific, it targets Hyper-V on KVM guests. When
>> +enabled, it provides Enlightened VMCS feature to the guest. The feature
>> +implements paravirtualized protocol between L0 (KVM) and L1 (Hyper-V)
>> +hypervisors making L2 exits to the hypervisor faster. The feature is Intel-only.
>> +Note: some virtualization features (e.g. Posted Interrupts) are disabled when
>> +hv-evmcs is enabled. It may make sense to measure your nested workload with and
>> +without the feature to find out if enabling it is beneficial.
>> +
>> +Requires: hv-vapic
>> +
>> +
>> +4. Useful links
>> +================
>> +Hyper-V Top Level Functional specification and other information:
>> +https://github.com/MicrosoftDocs/Virtualization-Documentation
>> -- 
>> 2.20.1
>> 
>
> Thanks,
> Roman.

-- 
Vitaly

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 3/8] i386/kvm: document existing Hyper-V enlightenments
@ 2019-04-05 14:29       ` Vitaly Kuznetsov
  0 siblings, 0 replies; 10+ messages in thread
From: Vitaly Kuznetsov @ 2019-04-05 14:29 UTC (permalink / raw)
  To: Roman Kagan
  Cc: Eduardo Habkost, Marcelo Tosatti, Dr . David Alan Gilbert,
	qemu-devel, Paolo Bonzini, Richard Henderson

Roman Kagan <rkagan@virtuozzo.com> writes:

> On Fri, Mar 29, 2019 at 03:18:27PM +0100, Vitaly Kuznetsov wrote:
>> Currently, there is no doc describing hv-* CPU flags, people are
>> encouraged to get the information from Microsoft Hyper-V Top Level
>> Functional specification (TLFS). There is, however, a bit of QEMU
>> specifics.
>
> This is appreciated a lot, thanks for doing this!
>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  docs/hyperv.txt | 180 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 180 insertions(+)
>>  create mode 100644 docs/hyperv.txt
>> 
>> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
>> new file mode 100644
>> index 0000000000..397f2517b8
>> --- /dev/null
>> +++ b/docs/hyperv.txt
>> @@ -0,0 +1,180 @@
>> +Hyper-V Enlightenments
>> +======================
>> +
>> +
>> +1. Description
>> +===============
>> +In some cases when implementing a hardware interface in software is slow, KVM
>> +implements its own paravirtualized interfaces. This works well for Linux as
>> +guest support for such features is added simultaneously with the feature itself.
>> +It may, however, be hard-to-impossible to add support for these interfaces to
>> +proprietary OSes, namely, Microsoft Windows.
>> +
>> +KVM on x86 implements Hyper-V Enlightenments for Windows guests. These features
>> +make Windows and Hyper-V guests think they're running on top of a Hyper-V
>> +compatible hypervisor and use Hyper-V specific features.
>> +
>> +
>> +2. Setup
>> +=========
>> +No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
>> +QEMU, individual enlightenments can be enabled through CPU flags, e.g:
>> +
>> +  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
>> +
>> +Sometimes there are dependencies between enlightenments, QEMU is supposed to
>> +check that the supplied configuration is sane.
>> +
>> +When any set of the Hyper-V enlightenments is enabled, QEMU changes hypervisor
>> +identification (CPUID 0x40000000..0x4000000A) to Hyper-V. KVM identification
>> +and features are kept in leaves 0x40000100..0x40000101.
>> +
>> +
>> +3. Existing enlightenments
>> +===========================
>> +
>> +3.1. hv-relaxed
>> +================
>> +This feature tells guest OS to disable watchdog timeouts as it is running on a
>> +hypervisor. It is known that some Windows versions will do this even when they
>> +see 'hypervisor' CPU flag.
>> +
>> +3.2. hv-vapic
>> +==============
>> +Provides so-called VP Assist page MSR to guest allowing it to work with APIC
>> +more efficiently. In particular, this enlightenment allows paravirtualized
>> +(exit-less) EOI processing.
>> +
>> +3.3. hv-spinlocks=xxx
>> +======================
>> +Enables paravirtualized spinlocks. The parameter indicates how many times
>> +spinlock acquisition should be attempted before indicating the situation to the
>> +hypervisor. A special value 0xffffffff indicates "never to retry".
>> +
>> +3.4. hv-vpindex
>> +================
>> +Provides HV_X64_MSR_VP_INDEX (0x40000002) MSR to the guest which has Virtual
>> +processor index information. This enlightenment makes sense in conjunction with
>> +hv-synic, hv-stimer and other enlightenments which require the guest to know its
>> +Virtual Processor indices (e.g. when VP index needs to be passed in a
>> +hypercall).
>> +
>> +3.5. hv-runtime
>> +================
>> +Provides HV_X64_MSR_VP_RUNTIME (0x40000010) MSR to the guest. The MSR keeps the
>> +virtual processor run time in 100ns units. This gives guest operating system an
>> +idea of how much time was 'stolen' from it (when the virtual CPU was preempted
>> +to perform some other work).
>> +
>> +3.6. hv-crash
>> +==============
>> +Provides HV_X64_MSR_CRASH_P0..HV_X64_MSR_CRASH_P5 (0x40000100..0x40000105) and
>> +HV_X64_MSR_CRASH_CTL (0x40000105) MSRs to the guest. These MSRs are written to
>> +by the guest when it crashes, HV_X64_MSR_CRASH_P0..HV_X64_MSR_CRASH_P5 MSRs
>> +contain additional crash information. This information is outputted in QEMU log
>> +and through QAPI.
>> +Note: unlike under genuine Hyper-V, write to HV_X64_MSR_CRASH_CTL causes guest
>> +to shutdown. This effectively blocks crash dump generation by Windows.
>
> Hmm, why?
>

This was written completely out of top of my head but I was under an
impression that writing to HV_X64_MSR_CRASH_CTL causes Qemu to shutdown
the guest and Windows does this before it creates crash dump. Am I
mistaken? I can be)

>> +
>> +3.7. hv-time
>> +=============
>> +Enables two Hyper-V-specific clocksources available to the guest: MSR-based
>> +Hyper-V clocksource (HV_X64_MSR_TIME_REF_COUNT, 0x40000020) and Reference TSC
>> +page (enabled via MSR HV_X64_MSR_REFERENCE_TSC, 0x40000021). Both clocksources
>> +are per-guest, Reference TSC page clocksource allows for exit-less time stamp
>> +readings. Using this enlightenment leads to significant speedup of all timestamp
>> +related operations.
>> +
>> +3.8. hv-synic
>> +==============
>> +Enables Hyper-V Synthetic interrupt controller - an extension of a local APIC.
>> +When enabled, this enlightenment provides additional communication facilities
>> +to the guest: SynIC messages and Events. This is a pre-requisite for
>> +implementing VMBus devices (not yet in QEMU). Additionally, this enlightenment
>> +is needed to enable Hyper-V synthetic timers. SynIC is controlled through MSRs
>> +HV_X64_MSR_SCONTROL..HV_X64_MSR_EOM (0x40000080..0x40000084) and
>> +HV_X64_MSR_SINT0..HV_X64_MSR_SINT15 (0x40000090..0x4000009F)
>> +
>> +Requires: hv-vpindex
>> +
>> +3.9. hv-stimer
>> +===============
>> +Enables Hyper-V synthetic timers. There are four synthetic timers per virtual
>> +CPU controlled through HV_X64_MSR_STIMER0_CONFIG..HV_X64_MSR_STIMER3_COUNT
>> +(0x400000B0..0x400000B7) MSRs. These timers can work either in single-shot or
>> +periodic mode. It is known that certain Windows versions revert to using RTC
>> +extensively when this enlightenment is not provided; this leads to significant
>> +CPU consumption, even when virtual CPU is idle.
>
> I think it'll rather use HPET if available.  I'm also not sure the idle
> vCPU scenario was one that motivated for implementing paravirtualized
> timers.

Right but there was an "improvement" from MS in one of their Win10
updates which made things significantly worse so I thought it would be a
good idea to document this user-visible property.

>
>> +
>> +Requires: hv-vpindex, hv-synic, hv-time
>> +
>> +3.10. hv-tlbflush
>> +==================
>> +Enables paravirtualized TLB shoot-down mechanism. On x86 architecture, remote
>> +TLB flush procedure requires sending IPIs and waiting for other CPUs to perform
>> +local TLB flush. In virtualized environment some virtual CPUs may not even be
>> +scheduled at the time of the call and may not require flushing (or, flushing
>> +may be postponed until the virtual CPU is scheduled). hv-tlbflush enlightenment
>> +implements TLB shoot-down through hypervisor enabling the optimization.
>> +
>> +Requires: hv-vpindex
>> +
>> +3.11. hv-ipi
>> +=============
>> +Enables paravirtualized IPI send mechanism. HvCallSendSyntheticClusterIpi
>> +hypercall may target more than 64 virtual CPUs simultaneously, doing the same
>> +through APIC requires more than one access (and thus exit to the hypervisor).
>> +
>> +Requires: hv-vpindex
>> +
>> +3.12. hv-vendor-id=xxx
>> +=======================
>> +This changes Hyper-V identification in CPUID 0x40000000.EBX-EDX from the default
>> +"Microsoft Hv". The parameter should be no longer than 12 characters. According
>> +to the specification, guests shouldn't use this information and it is unknown
>> +if there is a Windows version which acts differently.
>> +Note: hv-vendor-id is not an enlightenment and thus doesn't enable Hyper-V
>> +identification when specified without some other enlightenment.
>> +
>> +3.13. hv-reset
>> +===============
>> +Provides HV_X64_MSR_RESET (0x40000003) MSR to the guest allowing it to reset
>> +itself by writing to it. Even when this MSR is enabled, it is not a recommended
>> +way for Windows to perform system reboot and thus it may not be used.
>> +
>> +3.14. hv-frequencies
>> +============================================
>> +Provides HV_X64_MSR_TSC_FREQUENCY (0x40000022) and HV_X64_MSR_APIC_FREQUENCY
>> +(0x40000023) allowing the guest to get its TSC/APIC frequencies without doing
>> +measurements.
>> +
>> +3.15 hv-reenlightenment
>> +========================
>> +The enlightenment is nested specific, it targets Hyper-V on KVM guests. When
>> +enabled, it provides HV_X64_MSR_REENLIGHTENMENT_CONTROL (0x40000106),
>> +HV_X64_MSR_TSC_EMULATION_CONTROL (0x40000107)and HV_X64_MSR_TSC_EMULATION_STATUS
>> +(0x40000108) MSRs allowing the guest to get notified when TSC frequency changes
>> +(only happens on migration) and keep using old frequency (through emulation in
>> +the hypervisor) until it is ready to switch to the new one. This, in conjunction
>> +with hv-frequencies, allows Hyper-V on KVM to pass stable clocksource (Reference
>> +TSC page) to its own guests.
>> +
>> +Recommended: hv-frequencies
>> +
>> +3.16. hv-evmcs
>> +===============
>> +The enlightenment is nested specific, it targets Hyper-V on KVM guests. When
>> +enabled, it provides Enlightened VMCS feature to the guest. The feature
>> +implements paravirtualized protocol between L0 (KVM) and L1 (Hyper-V)
>> +hypervisors making L2 exits to the hypervisor faster. The feature is Intel-only.
>> +Note: some virtualization features (e.g. Posted Interrupts) are disabled when
>> +hv-evmcs is enabled. It may make sense to measure your nested workload with and
>> +without the feature to find out if enabling it is beneficial.
>> +
>> +Requires: hv-vapic
>> +
>> +
>> +4. Useful links
>> +================
>> +Hyper-V Top Level Functional specification and other information:
>> +https://github.com/MicrosoftDocs/Virtualization-Documentation
>> -- 
>> 2.20.1
>> 
>
> Thanks,
> Roman.

-- 
Vitaly


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 4/8] i386/kvm: implement 'hv-all' pass-through mode
@ 2019-04-05 15:07     ` Roman Kagan
  0 siblings, 0 replies; 10+ messages in thread
From: Roman Kagan @ 2019-04-05 15:07 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: qemu-devel, Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	Marcelo Tosatti, Dr . David Alan Gilbert,
	Daniel P . Berrangé

On Fri, Mar 29, 2019 at 03:18:28PM +0100, Vitaly Kuznetsov wrote:
> In many case we just want to give Windows guests all currently supported
> Hyper-V enlightenments and that's where this new mode may come handy. We
> pass through what was returned by KVM_GET_SUPPORTED_HV_CPUID.

The only one out of those "many cases" I can think of is when you've
developed a new hyperv feature in the kernel and you want to test it
with a version of QEMU that's not aware of it.  Are there any others?

> 
> hv_cpuid_check_and_set() is modified to also set cpu->hyperv_* flags as
> we may want to check them later (and we actually do for hv_runtime,
> hv_synic,...).
> 
> 'hv-all' is a development only feature, a migration blocker is added to
> prevent issues while migrating between hosts with different feature sets.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  docs/hyperv.txt   |  10 ++++
>  target/i386/cpu.c |   1 +
>  target/i386/cpu.h |   1 +
>  target/i386/kvm.c | 148 +++++++++++++++++++++++++++++++++++++---------
>  4 files changed, 132 insertions(+), 28 deletions(-)
> 
> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
> index 397f2517b8..d1299aba81 100644
> --- a/docs/hyperv.txt
> +++ b/docs/hyperv.txt
> @@ -174,6 +174,16 @@ without the feature to find out if enabling it is beneficial.
>  Requires: hv-vapic
>  
>  
> +4. Development features
> +========================
> +In some cases (e.g. during development) it may make sense to use QEMU in
> +'pass-through' mode and give Windows guests all enlightenments currently
> +supported by KVM. This pass-through mode is enabled by "hv-all" CPU flag.
> +Note: enabling this flag effectively prevents migration as supported features
> +may differ between target and destination.

I find 'hv-passthrough' a more adequate name for this.

> +Note: "hv-all" doesn't include 'hv-evmcs', it needs to be enabled explicitly.

This is extremely confusing, when some features are more equal than
others.  I think it'd make more sense instead to support filtering out
some features, like in "hv-passthrough,hv-evmcs=off".

Thanks,
Roman.

> +
> +
>  4. Useful links
>  ================
>  Hyper-V Top Level Functional specification and other information:
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index d6bb57d210..4e01ad076e 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -5785,6 +5785,7 @@ static Property x86_cpu_properties[] = {
>      DEFINE_PROP_BOOL("hv-tlbflush", X86CPU, hyperv_tlbflush, false),
>      DEFINE_PROP_BOOL("hv-evmcs", X86CPU, hyperv_evmcs, false),
>      DEFINE_PROP_BOOL("hv-ipi", X86CPU, hyperv_ipi, false),
> +    DEFINE_PROP_BOOL("hv-all", X86CPU, hyperv_all, false),
>      DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),
>      DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
>      DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 83fb522554..9cd3a8bc2f 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1396,6 +1396,7 @@ struct X86CPU {
>      bool hyperv_tlbflush;
>      bool hyperv_evmcs;
>      bool hyperv_ipi;
> +    bool hyperv_all;
>      bool check_cpuid;
>      bool enforce_cpuid;
>      bool expose_kvm;
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index 63031358ae..af45241adb 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -656,7 +656,8 @@ static bool hyperv_enabled(X86CPU *cpu)
>              cpu->hyperv_stimer ||
>              cpu->hyperv_reenlightenment ||
>              cpu->hyperv_tlbflush ||
> -            cpu->hyperv_ipi);
> +            cpu->hyperv_ipi ||
> +            cpu->hyperv_all);
>  }
>  
>  static int kvm_arch_set_tsc_khz(CPUState *cs)
> @@ -1004,14 +1005,15 @@ static int hv_cpuid_get_fw(struct kvm_cpuid2 *cpuid, int fw, uint32_t *r)
>  }
>  
>  static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
> -                                  const char *name, bool flag)
> +                                  const char *name, bool *flag)
>  {
>      X86CPU *cpu = X86_CPU(cs);
>      CPUX86State *env = &cpu->env;
>      uint32_t r, fw, bits;;
>      int i, j;
> +    bool present;
>  
> -    if (!flag) {
> +    if (!*flag && !cpu->hyperv_all) {
>          return 0;
>      }
>  
> @@ -1020,6 +1022,7 @@ static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
>              continue;
>          }
>  
> +        present = true;
>          for (j = 0; j < ARRAY_SIZE(kvm_hyperv_properties[i].flags); j++) {
>              fw = kvm_hyperv_properties[i].flags[j].fw;
>              bits = kvm_hyperv_properties[i].flags[j].bits;
> @@ -1029,17 +1032,26 @@ static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
>              }
>  
>              if (hv_cpuid_get_fw(cpuid, fw, &r) || (r & bits) != bits) {
> -                fprintf(stderr,
> -                        "Hyper-V %s (requested by '%s' cpu flag) "
> -                        "is not supported by kernel\n",
> -                        kvm_hyperv_properties[i].desc,
> -                        kvm_hyperv_properties[i].name);
> -                return 1;
> +                if (*flag) {
> +                    fprintf(stderr,
> +                            "Hyper-V %s (requested by '%s' cpu flag) "
> +                            "is not supported by kernel\n",
> +                            kvm_hyperv_properties[i].desc,
> +                            kvm_hyperv_properties[i].name);
> +                    return 1;
> +                } else {
> +                    present = false;
> +                    break;
> +                }
>              }
>  
>              env->features[fw] |= bits;
>          }
>  
> +        if (cpu->hyperv_all && present) {
> +            *flag = true;
> +        }
> +
>          return 0;
>      }
>  
> @@ -1047,6 +1059,43 @@ static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
>      return 1;
>  }
>  
> +static int hv_report_missing_dep(X86CPU *cpu, const char *name,
> +                                 const char *dep_name)
> +{
> +    int i, j, nprops = sizeof(kvm_hyperv_properties);
> +
> +    for (i = 0; i < nprops; i++) {
> +        if (!strcmp(kvm_hyperv_properties[i].name, name)) {
> +            break;
> +        }
> +    }
> +    for (j = 0; j < nprops; j++) {
> +        if (!strcmp(kvm_hyperv_properties[j].name, dep_name)) {
> +            break;
> +        }
> +    }
> +
> +    /*
> +     * Internal error: either feature or its dependency is not in
> +     * kvm_hyperv_properties!
> +     */
> +    if (i == nprops || j == nprops) {
> +        return 1;
> +    }
> +
> +    if (cpu->hyperv_all) {
> +        fprintf(stderr, "Hyper-V %s (requested by 'hv-all' cpu flag) "
> +                "requires %s (is not supported by kernel)\n",
> +                kvm_hyperv_properties[i].desc, kvm_hyperv_properties[j].desc);
> +    } else {
> +        fprintf(stderr, "Hyper-V %s (requested by '%s' cpu flag) "
> +                "requires %s ('%s')\n", kvm_hyperv_properties[i].desc,
> +                name, kvm_hyperv_properties[j].desc, dep_name);
> +    }
> +
> +    return 1;
> +}
> +
>  /*
>   * Fill in Hyper-V CPUIDs. Returns the number of entries filled in cpuid_ent in
>   * case of success, errno < 0 in case of failure and 0 when no Hyper-V
> @@ -1086,32 +1135,54 @@ static int hyperv_handle_properties(CPUState *cs,
>          cpuid = get_supported_hv_cpuid_legacy(cs);
>      }
>  
> +    if (cpu->hyperv_all) {
> +        memcpy(cpuid_ent, &cpuid->entries[0],
> +               cpuid->nent * sizeof(cpuid->entries[0]));
> +
> +        c = cpuid_find_entry(cpuid, HV_CPUID_FEATURES, 0);
> +        if (c) {
> +            env->features[FEAT_HYPERV_EAX] = c->eax;
> +            env->features[FEAT_HYPERV_EBX] = c->ebx;
> +            env->features[FEAT_HYPERV_EDX] = c->eax;
> +        }
> +        c = cpuid_find_entry(cpuid, HV_CPUID_ENLIGHTMENT_INFO, 0);
> +        if (c) {
> +            env->features[FEAT_HV_RECOMM_EAX] = c->eax;
> +
> +            /* hv-spinlocks may have been overriden */
> +            if (cpu->hyperv_spinlock_attempts != HYPERV_SPINLOCK_NEVER_RETRY) {
> +                c->ebx = cpu->hyperv_spinlock_attempts;
> +            }
> +        }
> +        c = cpuid_find_entry(cpuid, HV_CPUID_NESTED_FEATURES, 0);
> +        if (c) {
> +            env->features[FEAT_HV_NESTED_EAX] = c->eax;
> +        }
> +    }
> +
>      /* Features */
>      r |= hv_cpuid_check_and_set(cs, cpuid, "hv-relaxed",
> -                                cpu->hyperv_relaxed_timing);
> -    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-vapic", cpu->hyperv_vapic);
> -    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-time", cpu->hyperv_time);
> +                                &cpu->hyperv_relaxed_timing);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-vapic", &cpu->hyperv_vapic);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-time", &cpu->hyperv_time);
>      r |= hv_cpuid_check_and_set(cs, cpuid, "hv-frequencies",
> -                                cpu->hyperv_frequencies);
> -    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-crash", cpu->hyperv_crash);
> +                                &cpu->hyperv_frequencies);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-crash", &cpu->hyperv_crash);
>      r |= hv_cpuid_check_and_set(cs, cpuid, "hv-reenlightenment",
> -                                cpu->hyperv_reenlightenment);
> -    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-reset", cpu->hyperv_reset);
> -    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-vpindex", cpu->hyperv_vpindex);
> -    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-runtime", cpu->hyperv_runtime);
> -    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-synic", cpu->hyperv_synic);
> -    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-stimer", cpu->hyperv_stimer);
> -    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-tlbflush", cpu->hyperv_tlbflush);
> -    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-ipi", cpu->hyperv_ipi);
> +                                &cpu->hyperv_reenlightenment);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-reset", &cpu->hyperv_reset);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-vpindex", &cpu->hyperv_vpindex);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-runtime", &cpu->hyperv_runtime);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-synic", &cpu->hyperv_synic);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-stimer", &cpu->hyperv_stimer);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-tlbflush",
> +                                &cpu->hyperv_tlbflush);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-ipi", &cpu->hyperv_ipi);
>  
>      /* Dependencies */
>      if (cpu->hyperv_synic && !cpu->hyperv_synic_kvm_only &&
> -        !cpu->hyperv_vpindex) {
> -        fprintf(stderr, "Hyper-V SynIC "
> -                "(requested by 'hv-synic' cpu flag) "
> -                "requires Hyper-V VP_INDEX ('hv-vpindex')\n");
> -        r |= 1;
> -    }
> +        !cpu->hyperv_vpindex)
> +        r |= hv_report_missing_dep(cpu, "hv-synic", "hv-vpindex");
>  
>      /* Not exposed by KVM but needed to make CPU hotplug in Windows work */
>      env->features[FEAT_HYPERV_EDX] |= HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE;
> @@ -1121,6 +1192,12 @@ static int hyperv_handle_properties(CPUState *cs,
>          goto free;
>      }
>  
> +    if (cpu->hyperv_all) {
> +        /* We already copied all feature words from KVM as is */
> +        r = cpuid->nent;
> +        goto free;
> +    }
> +
>      c = &cpuid_ent[cpuid_i++];
>      c->function = HV_CPUID_VENDOR_AND_MAX_FUNCTIONS;
>      if (!cpu->hyperv_vendor_id) {
> @@ -1192,11 +1269,26 @@ free:
>      return r;
>  }
>  
> +static Error *hv_all_mig_blocker;
> +
>  static int hyperv_init_vcpu(X86CPU *cpu)
>  {
>      CPUState *cs = CPU(cpu);
> +    Error *local_err = NULL;
>      int ret;
>  
> +    if (cpu->hyperv_all && hv_all_mig_blocker == NULL) {
> +        error_setg(&hv_all_mig_blocker,
> +                   "'hv-all' CPU flag prevents migration, use explicit set of "
> +                   "hv-* flags instead");
> +        ret = migrate_add_blocker(hv_all_mig_blocker, &local_err);
> +        if (local_err) {
> +            error_report_err(local_err);
> +            error_free(hv_all_mig_blocker);
> +            return ret;
> +        }
> +    }
> +
>      if (cpu->hyperv_vpindex && !hv_vpindex_settable) {
>          /*
>           * the kernel doesn't support setting vp_index; assert that its value

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 4/8] i386/kvm: implement 'hv-all' pass-through mode
@ 2019-04-05 15:07     ` Roman Kagan
  0 siblings, 0 replies; 10+ messages in thread
From: Roman Kagan @ 2019-04-05 15:07 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Eduardo Habkost, Marcelo Tosatti, Dr . David Alan Gilbert,
	qemu-devel, Paolo Bonzini, Richard Henderson

On Fri, Mar 29, 2019 at 03:18:28PM +0100, Vitaly Kuznetsov wrote:
> In many case we just want to give Windows guests all currently supported
> Hyper-V enlightenments and that's where this new mode may come handy. We
> pass through what was returned by KVM_GET_SUPPORTED_HV_CPUID.

The only one out of those "many cases" I can think of is when you've
developed a new hyperv feature in the kernel and you want to test it
with a version of QEMU that's not aware of it.  Are there any others?

> 
> hv_cpuid_check_and_set() is modified to also set cpu->hyperv_* flags as
> we may want to check them later (and we actually do for hv_runtime,
> hv_synic,...).
> 
> 'hv-all' is a development only feature, a migration blocker is added to
> prevent issues while migrating between hosts with different feature sets.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  docs/hyperv.txt   |  10 ++++
>  target/i386/cpu.c |   1 +
>  target/i386/cpu.h |   1 +
>  target/i386/kvm.c | 148 +++++++++++++++++++++++++++++++++++++---------
>  4 files changed, 132 insertions(+), 28 deletions(-)
> 
> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
> index 397f2517b8..d1299aba81 100644
> --- a/docs/hyperv.txt
> +++ b/docs/hyperv.txt
> @@ -174,6 +174,16 @@ without the feature to find out if enabling it is beneficial.
>  Requires: hv-vapic
>  
>  
> +4. Development features
> +========================
> +In some cases (e.g. during development) it may make sense to use QEMU in
> +'pass-through' mode and give Windows guests all enlightenments currently
> +supported by KVM. This pass-through mode is enabled by "hv-all" CPU flag.
> +Note: enabling this flag effectively prevents migration as supported features
> +may differ between target and destination.

I find 'hv-passthrough' a more adequate name for this.

> +Note: "hv-all" doesn't include 'hv-evmcs', it needs to be enabled explicitly.

This is extremely confusing, when some features are more equal than
others.  I think it'd make more sense instead to support filtering out
some features, like in "hv-passthrough,hv-evmcs=off".

Thanks,
Roman.

> +
> +
>  4. Useful links
>  ================
>  Hyper-V Top Level Functional specification and other information:
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index d6bb57d210..4e01ad076e 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -5785,6 +5785,7 @@ static Property x86_cpu_properties[] = {
>      DEFINE_PROP_BOOL("hv-tlbflush", X86CPU, hyperv_tlbflush, false),
>      DEFINE_PROP_BOOL("hv-evmcs", X86CPU, hyperv_evmcs, false),
>      DEFINE_PROP_BOOL("hv-ipi", X86CPU, hyperv_ipi, false),
> +    DEFINE_PROP_BOOL("hv-all", X86CPU, hyperv_all, false),
>      DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),
>      DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
>      DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 83fb522554..9cd3a8bc2f 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1396,6 +1396,7 @@ struct X86CPU {
>      bool hyperv_tlbflush;
>      bool hyperv_evmcs;
>      bool hyperv_ipi;
> +    bool hyperv_all;
>      bool check_cpuid;
>      bool enforce_cpuid;
>      bool expose_kvm;
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index 63031358ae..af45241adb 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -656,7 +656,8 @@ static bool hyperv_enabled(X86CPU *cpu)
>              cpu->hyperv_stimer ||
>              cpu->hyperv_reenlightenment ||
>              cpu->hyperv_tlbflush ||
> -            cpu->hyperv_ipi);
> +            cpu->hyperv_ipi ||
> +            cpu->hyperv_all);
>  }
>  
>  static int kvm_arch_set_tsc_khz(CPUState *cs)
> @@ -1004,14 +1005,15 @@ static int hv_cpuid_get_fw(struct kvm_cpuid2 *cpuid, int fw, uint32_t *r)
>  }
>  
>  static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
> -                                  const char *name, bool flag)
> +                                  const char *name, bool *flag)
>  {
>      X86CPU *cpu = X86_CPU(cs);
>      CPUX86State *env = &cpu->env;
>      uint32_t r, fw, bits;;
>      int i, j;
> +    bool present;
>  
> -    if (!flag) {
> +    if (!*flag && !cpu->hyperv_all) {
>          return 0;
>      }
>  
> @@ -1020,6 +1022,7 @@ static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
>              continue;
>          }
>  
> +        present = true;
>          for (j = 0; j < ARRAY_SIZE(kvm_hyperv_properties[i].flags); j++) {
>              fw = kvm_hyperv_properties[i].flags[j].fw;
>              bits = kvm_hyperv_properties[i].flags[j].bits;
> @@ -1029,17 +1032,26 @@ static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
>              }
>  
>              if (hv_cpuid_get_fw(cpuid, fw, &r) || (r & bits) != bits) {
> -                fprintf(stderr,
> -                        "Hyper-V %s (requested by '%s' cpu flag) "
> -                        "is not supported by kernel\n",
> -                        kvm_hyperv_properties[i].desc,
> -                        kvm_hyperv_properties[i].name);
> -                return 1;
> +                if (*flag) {
> +                    fprintf(stderr,
> +                            "Hyper-V %s (requested by '%s' cpu flag) "
> +                            "is not supported by kernel\n",
> +                            kvm_hyperv_properties[i].desc,
> +                            kvm_hyperv_properties[i].name);
> +                    return 1;
> +                } else {
> +                    present = false;
> +                    break;
> +                }
>              }
>  
>              env->features[fw] |= bits;
>          }
>  
> +        if (cpu->hyperv_all && present) {
> +            *flag = true;
> +        }
> +
>          return 0;
>      }
>  
> @@ -1047,6 +1059,43 @@ static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
>      return 1;
>  }
>  
> +static int hv_report_missing_dep(X86CPU *cpu, const char *name,
> +                                 const char *dep_name)
> +{
> +    int i, j, nprops = sizeof(kvm_hyperv_properties);
> +
> +    for (i = 0; i < nprops; i++) {
> +        if (!strcmp(kvm_hyperv_properties[i].name, name)) {
> +            break;
> +        }
> +    }
> +    for (j = 0; j < nprops; j++) {
> +        if (!strcmp(kvm_hyperv_properties[j].name, dep_name)) {
> +            break;
> +        }
> +    }
> +
> +    /*
> +     * Internal error: either feature or its dependency is not in
> +     * kvm_hyperv_properties!
> +     */
> +    if (i == nprops || j == nprops) {
> +        return 1;
> +    }
> +
> +    if (cpu->hyperv_all) {
> +        fprintf(stderr, "Hyper-V %s (requested by 'hv-all' cpu flag) "
> +                "requires %s (is not supported by kernel)\n",
> +                kvm_hyperv_properties[i].desc, kvm_hyperv_properties[j].desc);
> +    } else {
> +        fprintf(stderr, "Hyper-V %s (requested by '%s' cpu flag) "
> +                "requires %s ('%s')\n", kvm_hyperv_properties[i].desc,
> +                name, kvm_hyperv_properties[j].desc, dep_name);
> +    }
> +
> +    return 1;
> +}
> +
>  /*
>   * Fill in Hyper-V CPUIDs. Returns the number of entries filled in cpuid_ent in
>   * case of success, errno < 0 in case of failure and 0 when no Hyper-V
> @@ -1086,32 +1135,54 @@ static int hyperv_handle_properties(CPUState *cs,
>          cpuid = get_supported_hv_cpuid_legacy(cs);
>      }
>  
> +    if (cpu->hyperv_all) {
> +        memcpy(cpuid_ent, &cpuid->entries[0],
> +               cpuid->nent * sizeof(cpuid->entries[0]));
> +
> +        c = cpuid_find_entry(cpuid, HV_CPUID_FEATURES, 0);
> +        if (c) {
> +            env->features[FEAT_HYPERV_EAX] = c->eax;
> +            env->features[FEAT_HYPERV_EBX] = c->ebx;
> +            env->features[FEAT_HYPERV_EDX] = c->eax;
> +        }
> +        c = cpuid_find_entry(cpuid, HV_CPUID_ENLIGHTMENT_INFO, 0);
> +        if (c) {
> +            env->features[FEAT_HV_RECOMM_EAX] = c->eax;
> +
> +            /* hv-spinlocks may have been overriden */
> +            if (cpu->hyperv_spinlock_attempts != HYPERV_SPINLOCK_NEVER_RETRY) {
> +                c->ebx = cpu->hyperv_spinlock_attempts;
> +            }
> +        }
> +        c = cpuid_find_entry(cpuid, HV_CPUID_NESTED_FEATURES, 0);
> +        if (c) {
> +            env->features[FEAT_HV_NESTED_EAX] = c->eax;
> +        }
> +    }
> +
>      /* Features */
>      r |= hv_cpuid_check_and_set(cs, cpuid, "hv-relaxed",
> -                                cpu->hyperv_relaxed_timing);
> -    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-vapic", cpu->hyperv_vapic);
> -    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-time", cpu->hyperv_time);
> +                                &cpu->hyperv_relaxed_timing);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-vapic", &cpu->hyperv_vapic);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-time", &cpu->hyperv_time);
>      r |= hv_cpuid_check_and_set(cs, cpuid, "hv-frequencies",
> -                                cpu->hyperv_frequencies);
> -    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-crash", cpu->hyperv_crash);
> +                                &cpu->hyperv_frequencies);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-crash", &cpu->hyperv_crash);
>      r |= hv_cpuid_check_and_set(cs, cpuid, "hv-reenlightenment",
> -                                cpu->hyperv_reenlightenment);
> -    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-reset", cpu->hyperv_reset);
> -    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-vpindex", cpu->hyperv_vpindex);
> -    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-runtime", cpu->hyperv_runtime);
> -    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-synic", cpu->hyperv_synic);
> -    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-stimer", cpu->hyperv_stimer);
> -    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-tlbflush", cpu->hyperv_tlbflush);
> -    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-ipi", cpu->hyperv_ipi);
> +                                &cpu->hyperv_reenlightenment);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-reset", &cpu->hyperv_reset);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-vpindex", &cpu->hyperv_vpindex);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-runtime", &cpu->hyperv_runtime);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-synic", &cpu->hyperv_synic);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-stimer", &cpu->hyperv_stimer);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-tlbflush",
> +                                &cpu->hyperv_tlbflush);
> +    r |= hv_cpuid_check_and_set(cs, cpuid, "hv-ipi", &cpu->hyperv_ipi);
>  
>      /* Dependencies */
>      if (cpu->hyperv_synic && !cpu->hyperv_synic_kvm_only &&
> -        !cpu->hyperv_vpindex) {
> -        fprintf(stderr, "Hyper-V SynIC "
> -                "(requested by 'hv-synic' cpu flag) "
> -                "requires Hyper-V VP_INDEX ('hv-vpindex')\n");
> -        r |= 1;
> -    }
> +        !cpu->hyperv_vpindex)
> +        r |= hv_report_missing_dep(cpu, "hv-synic", "hv-vpindex");
>  
>      /* Not exposed by KVM but needed to make CPU hotplug in Windows work */
>      env->features[FEAT_HYPERV_EDX] |= HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE;
> @@ -1121,6 +1192,12 @@ static int hyperv_handle_properties(CPUState *cs,
>          goto free;
>      }
>  
> +    if (cpu->hyperv_all) {
> +        /* We already copied all feature words from KVM as is */
> +        r = cpuid->nent;
> +        goto free;
> +    }
> +
>      c = &cpuid_ent[cpuid_i++];
>      c->function = HV_CPUID_VENDOR_AND_MAX_FUNCTIONS;
>      if (!cpu->hyperv_vendor_id) {
> @@ -1192,11 +1269,26 @@ free:
>      return r;
>  }
>  
> +static Error *hv_all_mig_blocker;
> +
>  static int hyperv_init_vcpu(X86CPU *cpu)
>  {
>      CPUState *cs = CPU(cpu);
> +    Error *local_err = NULL;
>      int ret;
>  
> +    if (cpu->hyperv_all && hv_all_mig_blocker == NULL) {
> +        error_setg(&hv_all_mig_blocker,
> +                   "'hv-all' CPU flag prevents migration, use explicit set of "
> +                   "hv-* flags instead");
> +        ret = migrate_add_blocker(hv_all_mig_blocker, &local_err);
> +        if (local_err) {
> +            error_report_err(local_err);
> +            error_free(hv_all_mig_blocker);
> +            return ret;
> +        }
> +    }
> +
>      if (cpu->hyperv_vpindex && !hv_vpindex_settable) {
>          /*
>           * the kernel doesn't support setting vp_index; assert that its value


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 4/8] i386/kvm: implement 'hv-all' pass-through mode
@ 2019-04-05 17:00       ` Vitaly Kuznetsov
  0 siblings, 0 replies; 10+ messages in thread
From: Vitaly Kuznetsov @ 2019-04-05 17:00 UTC (permalink / raw)
  To: Roman Kagan
  Cc: qemu-devel, Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	Marcelo Tosatti, Dr . David Alan Gilbert,
	Daniel P . Berrangé

Roman Kagan <rkagan@virtuozzo.com> writes:

> On Fri, Mar 29, 2019 at 03:18:28PM +0100, Vitaly Kuznetsov wrote:
>> In many case we just want to give Windows guests all currently supported
>> Hyper-V enlightenments and that's where this new mode may come handy. We
>> pass through what was returned by KVM_GET_SUPPORTED_HV_CPUID.
>
> The only one out of those "many cases" I can think of is when you've
> developed a new hyperv feature in the kernel and you want to test it
> with a version of QEMU that's not aware of it.  Are there any others?
>

I can recall the following case I had: benchmark Windows guest
performance with different kernels like try to get the best number. As
these kernels were supporting different set of hv-* enlightenments I had
to do non-trivial work to figure out what's supported and adjust QEMU
command line accordingly.

Would've been much easier with 'hv-all'

>> 
>> hv_cpuid_check_and_set() is modified to also set cpu->hyperv_* flags as
>> we may want to check them later (and we actually do for hv_runtime,
>> hv_synic,...).
>> 
>> 'hv-all' is a development only feature, a migration blocker is added to
>> prevent issues while migrating between hosts with different feature sets.
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  docs/hyperv.txt   |  10 ++++
>>  target/i386/cpu.c |   1 +
>>  target/i386/cpu.h |   1 +
>>  target/i386/kvm.c | 148 +++++++++++++++++++++++++++++++++++++---------
>>  4 files changed, 132 insertions(+), 28 deletions(-)
>> 
>> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
>> index 397f2517b8..d1299aba81 100644
>> --- a/docs/hyperv.txt
>> +++ b/docs/hyperv.txt
>> @@ -174,6 +174,16 @@ without the feature to find out if enabling it is beneficial.
>>  Requires: hv-vapic
>>  
>>  
>> +4. Development features
>> +========================
>> +In some cases (e.g. during development) it may make sense to use QEMU in
>> +'pass-through' mode and give Windows guests all enlightenments currently
>> +supported by KVM. This pass-through mode is enabled by "hv-all" CPU flag.
>> +Note: enabling this flag effectively prevents migration as supported features
>> +may differ between target and destination.
>
> I find 'hv-passthrough' a more adequate name for this.

Sure, will adjust.

>
>> +Note: "hv-all" doesn't include 'hv-evmcs', it needs to be enabled explicitly.
>
> This is extremely confusing, when some features are more equal than
> others.  I think it'd make more sense instead to support filtering out
> some features, like in "hv-passthrough,hv-evmcs=off".

hv-evmcs is probably the only enlightenment which is not an obvious
'win': when enabled, some features (e.g. posted interrupts) are getting
disabled. But as 'hv-all' is now a developer-only feature I see no
problem with enabling evmcs too.

-- 
Vitaly

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 4/8] i386/kvm: implement 'hv-all' pass-through mode
@ 2019-04-05 17:00       ` Vitaly Kuznetsov
  0 siblings, 0 replies; 10+ messages in thread
From: Vitaly Kuznetsov @ 2019-04-05 17:00 UTC (permalink / raw)
  To: Roman Kagan
  Cc: Eduardo Habkost, Marcelo Tosatti, Dr . David Alan Gilbert,
	qemu-devel, Paolo Bonzini, Richard Henderson

Roman Kagan <rkagan@virtuozzo.com> writes:

> On Fri, Mar 29, 2019 at 03:18:28PM +0100, Vitaly Kuznetsov wrote:
>> In many case we just want to give Windows guests all currently supported
>> Hyper-V enlightenments and that's where this new mode may come handy. We
>> pass through what was returned by KVM_GET_SUPPORTED_HV_CPUID.
>
> The only one out of those "many cases" I can think of is when you've
> developed a new hyperv feature in the kernel and you want to test it
> with a version of QEMU that's not aware of it.  Are there any others?
>

I can recall the following case I had: benchmark Windows guest
performance with different kernels like try to get the best number. As
these kernels were supporting different set of hv-* enlightenments I had
to do non-trivial work to figure out what's supported and adjust QEMU
command line accordingly.

Would've been much easier with 'hv-all'

>> 
>> hv_cpuid_check_and_set() is modified to also set cpu->hyperv_* flags as
>> we may want to check them later (and we actually do for hv_runtime,
>> hv_synic,...).
>> 
>> 'hv-all' is a development only feature, a migration blocker is added to
>> prevent issues while migrating between hosts with different feature sets.
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  docs/hyperv.txt   |  10 ++++
>>  target/i386/cpu.c |   1 +
>>  target/i386/cpu.h |   1 +
>>  target/i386/kvm.c | 148 +++++++++++++++++++++++++++++++++++++---------
>>  4 files changed, 132 insertions(+), 28 deletions(-)
>> 
>> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
>> index 397f2517b8..d1299aba81 100644
>> --- a/docs/hyperv.txt
>> +++ b/docs/hyperv.txt
>> @@ -174,6 +174,16 @@ without the feature to find out if enabling it is beneficial.
>>  Requires: hv-vapic
>>  
>>  
>> +4. Development features
>> +========================
>> +In some cases (e.g. during development) it may make sense to use QEMU in
>> +'pass-through' mode and give Windows guests all enlightenments currently
>> +supported by KVM. This pass-through mode is enabled by "hv-all" CPU flag.
>> +Note: enabling this flag effectively prevents migration as supported features
>> +may differ between target and destination.
>
> I find 'hv-passthrough' a more adequate name for this.

Sure, will adjust.

>
>> +Note: "hv-all" doesn't include 'hv-evmcs', it needs to be enabled explicitly.
>
> This is extremely confusing, when some features are more equal than
> others.  I think it'd make more sense instead to support filtering out
> some features, like in "hv-passthrough,hv-evmcs=off".

hv-evmcs is probably the only enlightenment which is not an obvious
'win': when enabled, some features (e.g. posted interrupts) are getting
disabled. But as 'hv-all' is now a developer-only feature I see no
problem with enabling evmcs too.

-- 
Vitaly


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-04-05 17:09 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20190329141832.22882-1-vkuznets@redhat.com>
     [not found] ` <20190329141832.22882-2-vkuznets@redhat.com>
2019-04-05 13:05   ` [Qemu-devel] [PATCH 1/8] i386/kvm: add support for KVM_GET_SUPPORTED_HV_CPUID Roman Kagan
2019-04-05 13:05     ` Roman Kagan
     [not found] ` <20190329141832.22882-4-vkuznets@redhat.com>
2019-04-05 13:19   ` [Qemu-devel] [PATCH 3/8] i386/kvm: document existing Hyper-V enlightenments Roman Kagan
2019-04-05 13:19     ` Roman Kagan
2019-04-05 14:29     ` Vitaly Kuznetsov
2019-04-05 14:29       ` Vitaly Kuznetsov
     [not found] ` <20190329141832.22882-5-vkuznets@redhat.com>
2019-04-05 15:07   ` [Qemu-devel] [PATCH 4/8] i386/kvm: implement 'hv-all' pass-through mode Roman Kagan
2019-04-05 15:07     ` Roman Kagan
2019-04-05 17:00     ` Vitaly Kuznetsov
2019-04-05 17:00       ` Vitaly Kuznetsov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.