* [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option
@ 2021-01-07 15:06 Vitaly Kuznetsov
  2021-01-07 15:06 ` [PATCH v3 01/19] linux-headers: update against 5.11-rc2 Vitaly Kuznetsov
                   ` (18 more replies)
  0 siblings, 19 replies; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-07 15:06 UTC
  To: qemu-devel; +Cc: Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost, Igor Mammedov

This series is a successor of "[PATCH RFC v3 00/23] i386: KVM: expand
Hyper-V features early" and "[PATCH v2 2/2] i386: provide simple 'hyperv=on'
option to x86 machine types".

Changes:
- Make 'hv-default' a CPU option and not a machine type option [Igor].
- Introduce a simple qtest [Eduardo].
- Update linux headers from 5.11-rc2.

Description:

Upper-layer tools like libvirt want to figure out which Hyper-V features are
supported by the underlying stack (QEMU/KVM), but currently they are unable to
do so. We have a nice 'hv_passthrough' CPU flag supported by QEMU, but it has
no effect on e.g. QMP's

query-cpu-model-expansion type=full model={"name":"host","props":{"hv-passthrough":true}}

command, as we parse Hyper-V features after creating KVM vCPUs and not at
feature expansion time. To support the use case, we first need to make the
KVM_GET_SUPPORTED_HV_CPUID ioctl a system-wide ioctl, as the existing
vCPU version can't be used that early. This is what the KVM part does. With
that done, we can expand Hyper-V features early (this series).

In addition, provide a simple 'hv-default' option which enables (and
requires from KVM) all currently supported Hyper-V enlightenments.
Unlike 'hv_passthrough' mode, this is going to be migratable.
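
To illustrate the difference between the two modes described above, here is
a toy model in C. It is only a sketch: the feature bits, the fixed default
set and all hv_demo_* names are hypothetical and are not QEMU's real
encoding or API. It assumes that 'hv-default' stands for a fixed set and
fails when the host lacks any of it, while 'hv_passthrough' takes whatever
the host reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
enum {
    HV_DEMO_RELAXED = 1u << 0,
    HV_DEMO_VAPIC   = 1u << 1,
    HV_DEMO_SYNIC   = 1u << 2,
};

/* Hypothetical fixed set that an 'hv-default'-like option would stand for. */
#define HV_DEMO_DEFAULT_SET (HV_DEMO_RELAXED | HV_DEMO_VAPIC | HV_DEMO_SYNIC)

/*
 * 'hv_passthrough'-like expansion: enable whatever the host reports.
 * The result differs from host to host.
 */
uint32_t hv_demo_expand_passthrough(uint32_t host_supported)
{
    return host_supported;
}

/*
 * 'hv-default'-like expansion: request a fixed set and fail if the host
 * lacks any part of it. On success the result is the same on every host.
 */
bool hv_demo_expand_default(uint32_t host_supported, uint32_t *enabled)
{
    uint32_t missing = HV_DEMO_DEFAULT_SET & ~host_supported;

    assert(enabled != NULL);
    if (missing) {
        return false;   /* host does not provide a required feature */
    }
    *enabled = HV_DEMO_DEFAULT_SET;
    return true;
}
```

In this model, two hosts with different supported sets either both end up
with the same enabled set or one of them fails outright, which is one way
to read why the fixed-set mode can be migratable.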

Vitaly Kuznetsov (19):
  linux-headers: update against 5.11-rc2
  i386: introduce kvm_hv_evmcs_available()
  i386: keep hyperv_vendor string up-to-date
  i386: invert hyperv_spinlock_attempts setting logic with
    hv_passthrough
  i386: always fill Hyper-V CPUID feature leaves from X86CPU data
  i386: stop using env->features[] for filling Hyper-V CPUIDs
  i386: introduce hyperv_feature_supported()
  i386: introduce hv_cpuid_get_host()
  i386: drop FEAT_HYPERV feature leaves
  i386: introduce hv_cpuid_cache
  i386: split hyperv_handle_properties() into
    hyperv_expand_features()/hyperv_fill_cpuids()
  i386: move eVMCS enablement to hyperv_init_vcpu()
  i386: switch hyperv_expand_features() to using error_setg()
  i386: adjust the expected KVM_GET_SUPPORTED_HV_CPUID array size
  i386: prefer system KVM_GET_SUPPORTED_HV_CPUID ioctl over vCPU's one
  i386: use global kvm_state in hyperv_enabled() check
  i386: expand Hyper-V features during CPU feature expansion time
  i386: provide simple 'hv-default=on' option
  qtest/hyperv: Introduce a simple hyper-v test

 MAINTAINERS                                   |   1 +
 docs/hyperv.txt                               |  16 +-
 .../infiniband/hw/vmw_pvrdma/pvrdma_verbs.h   |   2 +-
 include/standard-headers/drm/drm_fourcc.h     | 175 +++++-
 include/standard-headers/linux/const.h        |  36 ++
 include/standard-headers/linux/ethtool.h      |   2 +-
 include/standard-headers/linux/fuse.h         |  30 +-
 include/standard-headers/linux/kernel.h       |   9 +-
 include/standard-headers/linux/pci_regs.h     |  16 +
 include/standard-headers/linux/vhost_types.h  |   9 +
 include/standard-headers/linux/virtio_gpu.h   |  82 +++
 include/standard-headers/linux/virtio_ids.h   |  44 +-
 linux-headers/asm-arm64/kvm.h                 |   3 -
 linux-headers/asm-generic/unistd.h            |   6 +-
 linux-headers/asm-mips/unistd_n32.h           |   1 +
 linux-headers/asm-mips/unistd_n64.h           |   1 +
 linux-headers/asm-mips/unistd_o32.h           |   1 +
 linux-headers/asm-powerpc/unistd_32.h         |   1 +
 linux-headers/asm-powerpc/unistd_64.h         |   1 +
 linux-headers/asm-s390/unistd_32.h            |   1 +
 linux-headers/asm-s390/unistd_64.h            |   1 +
 linux-headers/asm-x86/kvm.h                   |   1 +
 linux-headers/asm-x86/unistd_32.h             |   1 +
 linux-headers/asm-x86/unistd_64.h             |   1 +
 linux-headers/asm-x86/unistd_x32.h            |   1 +
 linux-headers/linux/kvm.h                     |  56 +-
 linux-headers/linux/userfaultfd.h             |   9 +
 linux-headers/linux/vfio.h                    |   1 +
 linux-headers/linux/vhost.h                   |   4 +
 scripts/update-linux-headers.sh               |   5 +-
 target/i386/cpu.c                             | 151 ++---
 target/i386/cpu.h                             |  11 +-
 target/i386/kvm/kvm-stub.c                    |  10 +
 target/i386/kvm/kvm.c                         | 524 ++++++++++--------
 target/i386/kvm/kvm_i386.h                    |   2 +
 tests/qtest/hyperv-test.c                     | 238 ++++++++
 tests/qtest/meson.build                       |   3 +-
 37 files changed, 1074 insertions(+), 382 deletions(-)
 create mode 100644 include/standard-headers/linux/const.h
 create mode 100644 tests/qtest/hyperv-test.c

-- 
2.29.2




* [PATCH v3 01/19] linux-headers: update against 5.11-rc2
  2021-01-07 15:06 [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
@ 2021-01-07 15:06 ` Vitaly Kuznetsov
  2021-01-07 15:06 ` [PATCH v3 02/19] i386: introduce kvm_hv_evmcs_available() Vitaly Kuznetsov
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-07 15:06 UTC
  To: qemu-devel; +Cc: Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost, Igor Mammedov

commit e71ba9452f0b5b2e8dc8aa5445198cd9214a6a62

<linux/const.h> needs to be included as some constants were
moved from <linux/kernel.h>.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
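For readers unfamiliar with the helpers that move from <linux/kernel.h> to
<linux/const.h>, here is a small sketch of what they compute. The macro
bodies are local copies of the ones in the const.h hunk of this patch,
renamed with a DEMO_ prefix so they cannot clash with system headers;
__typeof__ replaces typeof so the sketch also builds in strict ISO modes on
GCC/Clang. The demo_* functions are hypothetical wrappers.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the macro bodies from the const.h hunk (renamed). */
#define DEMO_ALIGN_KERNEL_MASK(x, mask)  (((x) + (mask)) & ~(mask))
#define DEMO_ALIGN_KERNEL(x, a) \
    DEMO_ALIGN_KERNEL_MASK(x, (__typeof__(x))(a) - 1)
#define DEMO_KERNEL_DIV_ROUND_UP(n, d)   (((n) + (d) - 1) / (d))
#define DEMO_BITUL(x)                    (1UL << (x))

/* Round len up to the next multiple of align (align must be a power of 2). */
uint64_t demo_align_up(uint64_t len, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return DEMO_ALIGN_KERNEL(len, align);
}

/* Number of block-sized chunks needed to hold len bytes. */
uint64_t demo_blocks_needed(uint64_t len, uint64_t block)
{
    assert(block != 0);
    return DEMO_KERNEL_DIV_ROUND_UP(len, block);
}
```

For example, demo_align_up(13, 8) rounds 13 up to 16, and
demo_blocks_needed(10, 4) yields 3 chunks.
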
 .../infiniband/hw/vmw_pvrdma/pvrdma_verbs.h   |   2 +-
 include/standard-headers/drm/drm_fourcc.h     | 175 +++++++++++++++++-
 include/standard-headers/linux/const.h        |  36 ++++
 include/standard-headers/linux/ethtool.h      |   2 +-
 include/standard-headers/linux/fuse.h         |  30 ++-
 include/standard-headers/linux/kernel.h       |   9 +-
 include/standard-headers/linux/pci_regs.h     |  16 ++
 include/standard-headers/linux/vhost_types.h  |   9 +
 include/standard-headers/linux/virtio_gpu.h   |  82 ++++++++
 include/standard-headers/linux/virtio_ids.h   |  44 +++--
 linux-headers/asm-arm64/kvm.h                 |   3 -
 linux-headers/asm-generic/unistd.h            |   6 +-
 linux-headers/asm-mips/unistd_n32.h           |   1 +
 linux-headers/asm-mips/unistd_n64.h           |   1 +
 linux-headers/asm-mips/unistd_o32.h           |   1 +
 linux-headers/asm-powerpc/unistd_32.h         |   1 +
 linux-headers/asm-powerpc/unistd_64.h         |   1 +
 linux-headers/asm-s390/unistd_32.h            |   1 +
 linux-headers/asm-s390/unistd_64.h            |   1 +
 linux-headers/asm-x86/kvm.h                   |   1 +
 linux-headers/asm-x86/unistd_32.h             |   1 +
 linux-headers/asm-x86/unistd_64.h             |   1 +
 linux-headers/asm-x86/unistd_x32.h            |   1 +
 linux-headers/linux/kvm.h                     |  56 +++++-
 linux-headers/linux/userfaultfd.h             |   9 +
 linux-headers/linux/vfio.h                    |   1 +
 linux-headers/linux/vhost.h                   |   4 +
 scripts/update-linux-headers.sh               |   5 +-
 28 files changed, 458 insertions(+), 42 deletions(-)
 create mode 100644 include/standard-headers/linux/const.h
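
The AMD modifier bit layout added to drm_fourcc.h in this update can be
exercised with a short sketch. The SHIFT/MASK values and the SET/GET/CLEAR
macros are copied from the hunk; AMD_FMT_MOD is written out by hand on the
assumption that the vendor ID sits in the top 8 bits, as IS_AMD_FMT_MOD()
implies. The demo_* function and the chosen field combination are
hypothetical and not taken from any driver.

```c
#include <stdint.h>

/* Vendor ID from the hunk; placed in bits 63:56 (assumed layout). */
#define DRM_FORMAT_MOD_VENDOR_AMD 0x02
#define AMD_FMT_MOD ((uint64_t)DRM_FORMAT_MOD_VENDOR_AMD << 56)

/* Field positions, copied from the hunk. */
#define AMD_FMT_MOD_TILE_VERSION_SHIFT 0
#define AMD_FMT_MOD_TILE_VERSION_MASK  0xFF
#define AMD_FMT_MOD_TILE_SHIFT         8
#define AMD_FMT_MOD_TILE_MASK          0x1F
#define AMD_FMT_MOD_DCC_SHIFT          13
#define AMD_FMT_MOD_DCC_MASK           0x1
#define AMD_FMT_MOD_DCC_RETILE_SHIFT   14
#define AMD_FMT_MOD_DCC_RETILE_MASK    0x1

#define AMD_FMT_MOD_TILE_VER_GFX9      1
#define AMD_FMT_MOD_TILE_GFX9_64K_S    9

/* Accessor macros, copied from the hunk. */
#define AMD_FMT_MOD_SET(field, value) \
    ((uint64_t)(value) << AMD_FMT_MOD_##field##_SHIFT)
#define AMD_FMT_MOD_GET(field, value) \
    (((value) >> AMD_FMT_MOD_##field##_SHIFT) & AMD_FMT_MOD_##field##_MASK)
#define AMD_FMT_MOD_CLEAR(field) \
    (~((uint64_t)AMD_FMT_MOD_##field##_MASK << AMD_FMT_MOD_##field##_SHIFT))

/* Pack an arbitrary example modifier: GFX9 tiling, 64K_S tile, DCC bit set. */
uint64_t demo_make_modifier(void)
{
    return AMD_FMT_MOD |
           AMD_FMT_MOD_SET(TILE_VERSION, AMD_FMT_MOD_TILE_VER_GFX9) |
           AMD_FMT_MOD_SET(TILE, AMD_FMT_MOD_TILE_GFX9_64K_S) |
           AMD_FMT_MOD_SET(DCC, 1);
}

/* Return the same modifier with the DCC bit cleared. */
uint64_t demo_without_dcc(uint64_t modifier)
{
    return modifier & AMD_FMT_MOD_CLEAR(DCC);
}
```

AMD_FMT_MOD_GET(TILE, demo_make_modifier()) gives back 9, and after
demo_without_dcc() the DCC field reads 0 while the other fields are
unchanged.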

diff --git a/include/standard-headers/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h b/include/standard-headers/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h
index 0a8c7c931199..1677208a411f 100644
--- a/include/standard-headers/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h
+++ b/include/standard-headers/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h
@@ -176,7 +176,7 @@ struct pvrdma_port_attr {
 	uint8_t			subnet_timeout;
 	uint8_t			init_type_reply;
 	uint8_t			active_width;
-	uint16_t			active_speed;
+	uint8_t			active_speed;
 	uint8_t			phys_state;
 	uint8_t			reserved[2];
 };
diff --git a/include/standard-headers/drm/drm_fourcc.h b/include/standard-headers/drm/drm_fourcc.h
index 0de1a552cab2..c47e19810c05 100644
--- a/include/standard-headers/drm/drm_fourcc.h
+++ b/include/standard-headers/drm/drm_fourcc.h
@@ -57,6 +57,30 @@ extern "C" {
  * may preserve meaning - such as number of planes - from the fourcc code,
  * whereas others may not.
  *
+ * Modifiers must uniquely encode buffer layout. In other words, a buffer must
+ * match only a single modifier. A modifier must not be a subset of layouts of
+ * another modifier. For instance, it's incorrect to encode pitch alignment in
+ * a modifier: a buffer may match a 64-pixel aligned modifier and a 32-pixel
+ * aligned modifier. That said, modifiers can have implicit minimal
+ * requirements.
+ *
+ * For modifiers where the combination of fourcc code and modifier can alias,
+ * a canonical pair needs to be defined and used by all drivers. Preferred
+ * combinations are also encouraged where all combinations might lead to
+ * confusion and unnecessarily reduced interoperability. An example for the
+ * latter is AFBC, where the ABGR layouts are preferred over ARGB layouts.
+ *
+ * There are two kinds of modifier users:
+ *
+ * - Kernel and user-space drivers: for drivers it's important that modifiers
+ *   don't alias, otherwise two drivers might support the same format but use
+ *   different aliases, preventing them from sharing buffers in an efficient
+ *   format.
+ * - Higher-level programs interfacing with KMS/GBM/EGL/Vulkan/etc: these users
+ *   see modifiers as opaque tokens they can check for equality and intersect.
+ *   These users musn't need to know to reason about the modifier value
+ *   (i.e. they are not expected to extract information out of the modifier).
+ *
  * Vendors should document their modifier usage in as much detail as
  * possible, to ensure maximum compatibility across devices, drivers and
  * applications.
@@ -154,6 +178,12 @@ extern "C" {
 #define DRM_FORMAT_ARGB16161616F fourcc_code('A', 'R', '4', 'H') /* [63:0] A:R:G:B 16:16:16:16 little endian */
 #define DRM_FORMAT_ABGR16161616F fourcc_code('A', 'B', '4', 'H') /* [63:0] A:B:G:R 16:16:16:16 little endian */
 
+/*
+ * RGBA format with 10-bit components packed in 64-bit per pixel, with 6 bits
+ * of unused padding per component:
+ */
+#define DRM_FORMAT_AXBXGXRX106106106106 fourcc_code('A', 'B', '1', '0') /* [63:0] A:x:B:x:G:x:R:x 10:6:10:6:10:6:10:6 little endian */
+
 /* packed YCbCr */
 #define DRM_FORMAT_YUYV		fourcc_code('Y', 'U', 'Y', 'V') /* [31:0] Cr0:Y1:Cb0:Y0 8:8:8:8 little endian */
 #define DRM_FORMAT_YVYU		fourcc_code('Y', 'V', 'Y', 'U') /* [31:0] Cb0:Y1:Cr0:Y0 8:8:8:8 little endian */
@@ -319,7 +349,6 @@ extern "C" {
  */
 
 /* Vendor Ids: */
-#define DRM_FORMAT_MOD_NONE           0
 #define DRM_FORMAT_MOD_VENDOR_NONE    0
 #define DRM_FORMAT_MOD_VENDOR_INTEL   0x01
 #define DRM_FORMAT_MOD_VENDOR_AMD     0x02
@@ -391,6 +420,16 @@ extern "C" {
  */
 #define DRM_FORMAT_MOD_LINEAR	fourcc_mod_code(NONE, 0)
 
+/*
+ * Deprecated: use DRM_FORMAT_MOD_LINEAR instead
+ *
+ * The "none" format modifier doesn't actually mean that the modifier is
+ * implicit, instead it means that the layout is linear. Whether modifiers are
+ * used is out-of-band information carried in an API-specific way (e.g. in a
+ * flag for drm_mode_fb_cmd2).
+ */
+#define DRM_FORMAT_MOD_NONE	0
+
 /* Intel framebuffer modifiers */
 
 /*
@@ -1055,6 +1094,140 @@ drm_fourcc_canonicalize_nvidia_format_mod(uint64_t modifier)
  */
 #define AMLOGIC_FBC_OPTION_MEM_SAVING		(1ULL << 0)
 
+/*
+ * AMD modifiers
+ *
+ * Memory layout:
+ *
+ * without DCC:
+ *   - main surface
+ *
+ * with DCC & without DCC_RETILE:
+ *   - main surface in plane 0
+ *   - DCC surface in plane 1 (RB-aligned, pipe-aligned if DCC_PIPE_ALIGN is set)
+ *
+ * with DCC & DCC_RETILE:
+ *   - main surface in plane 0
+ *   - displayable DCC surface in plane 1 (not RB-aligned & not pipe-aligned)
+ *   - pipe-aligned DCC surface in plane 2 (RB-aligned & pipe-aligned)
+ *
+ * For multi-plane formats the above surfaces get merged into one plane for
+ * each format plane, based on the required alignment only.
+ *
+ * Bits  Parameter                Notes
+ * ----- ------------------------ ---------------------------------------------
+ *
+ *   7:0 TILE_VERSION             Values are AMD_FMT_MOD_TILE_VER_*
+ *  12:8 TILE                     Values are AMD_FMT_MOD_TILE_<version>_*
+ *    13 DCC
+ *    14 DCC_RETILE
+ *    15 DCC_PIPE_ALIGN
+ *    16 DCC_INDEPENDENT_64B
+ *    17 DCC_INDEPENDENT_128B
+ * 19:18 DCC_MAX_COMPRESSED_BLOCK Values are AMD_FMT_MOD_DCC_BLOCK_*
+ *    20 DCC_CONSTANT_ENCODE
+ * 23:21 PIPE_XOR_BITS            Only for some chips
+ * 26:24 BANK_XOR_BITS            Only for some chips
+ * 29:27 PACKERS                  Only for some chips
+ * 32:30 RB                       Only for some chips
+ * 35:33 PIPE                     Only for some chips
+ * 55:36 -                        Reserved for future use, must be zero
+ */
+#define AMD_FMT_MOD fourcc_mod_code(AMD, 0)
+
+#define IS_AMD_FMT_MOD(val) (((val) >> 56) == DRM_FORMAT_MOD_VENDOR_AMD)
+
+/* Reserve 0 for GFX8 and older */
+#define AMD_FMT_MOD_TILE_VER_GFX9 1
+#define AMD_FMT_MOD_TILE_VER_GFX10 2
+#define AMD_FMT_MOD_TILE_VER_GFX10_RBPLUS 3
+
+/*
+ * 64K_S is the same for GFX9/GFX10/GFX10_RBPLUS and hence has GFX9 as canonical
+ * version.
+ */
+#define AMD_FMT_MOD_TILE_GFX9_64K_S 9
+
+/*
+ * 64K_D for non-32 bpp is the same for GFX9/GFX10/GFX10_RBPLUS and hence has
+ * GFX9 as canonical version.
+ */
+#define AMD_FMT_MOD_TILE_GFX9_64K_D 10
+#define AMD_FMT_MOD_TILE_GFX9_64K_S_X 25
+#define AMD_FMT_MOD_TILE_GFX9_64K_D_X 26
+#define AMD_FMT_MOD_TILE_GFX9_64K_R_X 27
+
+#define AMD_FMT_MOD_DCC_BLOCK_64B 0
+#define AMD_FMT_MOD_DCC_BLOCK_128B 1
+#define AMD_FMT_MOD_DCC_BLOCK_256B 2
+
+#define AMD_FMT_MOD_TILE_VERSION_SHIFT 0
+#define AMD_FMT_MOD_TILE_VERSION_MASK 0xFF
+#define AMD_FMT_MOD_TILE_SHIFT 8
+#define AMD_FMT_MOD_TILE_MASK 0x1F
+
+/* Whether DCC compression is enabled. */
+#define AMD_FMT_MOD_DCC_SHIFT 13
+#define AMD_FMT_MOD_DCC_MASK 0x1
+
+/*
+ * Whether to include two DCC surfaces, one which is rb & pipe aligned, and
+ * one which is not-aligned.
+ */
+#define AMD_FMT_MOD_DCC_RETILE_SHIFT 14
+#define AMD_FMT_MOD_DCC_RETILE_MASK 0x1
+
+/* Only set if DCC_RETILE = false */
+#define AMD_FMT_MOD_DCC_PIPE_ALIGN_SHIFT 15
+#define AMD_FMT_MOD_DCC_PIPE_ALIGN_MASK 0x1
+
+#define AMD_FMT_MOD_DCC_INDEPENDENT_64B_SHIFT 16
+#define AMD_FMT_MOD_DCC_INDEPENDENT_64B_MASK 0x1
+#define AMD_FMT_MOD_DCC_INDEPENDENT_128B_SHIFT 17
+#define AMD_FMT_MOD_DCC_INDEPENDENT_128B_MASK 0x1
+#define AMD_FMT_MOD_DCC_MAX_COMPRESSED_BLOCK_SHIFT 18
+#define AMD_FMT_MOD_DCC_MAX_COMPRESSED_BLOCK_MASK 0x3
+
+/*
+ * DCC supports embedding some clear colors directly in the DCC surface.
+ * However, on older GPUs the rendering HW ignores the embedded clear color
+ * and prefers the driver provided color. This necessitates doing a fastclear
+ * eliminate operation before a process transfers control.
+ *
+ * If this bit is set that means the fastclear eliminate is not needed for these
+ * embeddable colors.
+ */
+#define AMD_FMT_MOD_DCC_CONSTANT_ENCODE_SHIFT 20
+#define AMD_FMT_MOD_DCC_CONSTANT_ENCODE_MASK 0x1
+
+/*
+ * The below fields are for accounting for per GPU differences. These are only
+ * relevant for GFX9 and later and if the tile field is *_X/_T.
+ *
+ * PIPE_XOR_BITS = always needed
+ * BANK_XOR_BITS = only for TILE_VER_GFX9
+ * PACKERS = only for TILE_VER_GFX10_RBPLUS
+ * RB = only for TILE_VER_GFX9 & DCC
+ * PIPE = only for TILE_VER_GFX9 & DCC & (DCC_RETILE | DCC_PIPE_ALIGN)
+ */
+#define AMD_FMT_MOD_PIPE_XOR_BITS_SHIFT 21
+#define AMD_FMT_MOD_PIPE_XOR_BITS_MASK 0x7
+#define AMD_FMT_MOD_BANK_XOR_BITS_SHIFT 24
+#define AMD_FMT_MOD_BANK_XOR_BITS_MASK 0x7
+#define AMD_FMT_MOD_PACKERS_SHIFT 27
+#define AMD_FMT_MOD_PACKERS_MASK 0x7
+#define AMD_FMT_MOD_RB_SHIFT 30
+#define AMD_FMT_MOD_RB_MASK 0x7
+#define AMD_FMT_MOD_PIPE_SHIFT 33
+#define AMD_FMT_MOD_PIPE_MASK 0x7
+
+#define AMD_FMT_MOD_SET(field, value) \
+	((uint64_t)(value) << AMD_FMT_MOD_##field##_SHIFT)
+#define AMD_FMT_MOD_GET(field, value) \
+	(((value) >> AMD_FMT_MOD_##field##_SHIFT) & AMD_FMT_MOD_##field##_MASK)
+#define AMD_FMT_MOD_CLEAR(field) \
+	(~((uint64_t)AMD_FMT_MOD_##field##_MASK << AMD_FMT_MOD_##field##_SHIFT))
+
 #if defined(__cplusplus)
 }
 #endif
diff --git a/include/standard-headers/linux/const.h b/include/standard-headers/linux/const.h
new file mode 100644
index 000000000000..5e4898725168
--- /dev/null
+++ b/include/standard-headers/linux/const.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/* const.h: Macros for dealing with constants.  */
+
+#ifndef _LINUX_CONST_H
+#define _LINUX_CONST_H
+
+/* Some constant macros are used in both assembler and
+ * C code.  Therefore we cannot annotate them always with
+ * 'UL' and other type specifiers unilaterally.  We
+ * use the following macros to deal with this.
+ *
+ * Similarly, _AT() will cast an expression with a type in C, but
+ * leave it unchanged in asm.
+ */
+
+#ifdef __ASSEMBLY__
+#define _AC(X,Y)	X
+#define _AT(T,X)	X
+#else
+#define __AC(X,Y)	(X##Y)
+#define _AC(X,Y)	__AC(X,Y)
+#define _AT(T,X)	((T)(X))
+#endif
+
+#define _UL(x)		(_AC(x, UL))
+#define _ULL(x)		(_AC(x, ULL))
+
+#define _BITUL(x)	(_UL(1) << (x))
+#define _BITULL(x)	(_ULL(1) << (x))
+
+#define __ALIGN_KERNEL(x, a)		__ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1)
+#define __ALIGN_KERNEL_MASK(x, mask)	(((x) + (mask)) & ~(mask))
+
+#define __KERNEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
+
+#endif /* _LINUX_CONST_H */
diff --git a/include/standard-headers/linux/ethtool.h b/include/standard-headers/linux/ethtool.h
index 0df22f7538e3..8bfd01d230da 100644
--- a/include/standard-headers/linux/ethtool.h
+++ b/include/standard-headers/linux/ethtool.h
@@ -16,7 +16,7 @@
 
 #include "net/eth.h"
 
-#include "standard-headers/linux/kernel.h"
+#include "standard-headers/linux/const.h"
 #include "standard-headers/linux/types.h"
 #include "standard-headers/linux/if_ether.h"
 
diff --git a/include/standard-headers/linux/fuse.h b/include/standard-headers/linux/fuse.h
index 82c0a38b591e..950d7edb7ef6 100644
--- a/include/standard-headers/linux/fuse.h
+++ b/include/standard-headers/linux/fuse.h
@@ -175,6 +175,10 @@
  *
  *  7.32
  *  - add flags to fuse_attr, add FUSE_ATTR_SUBMOUNT, add FUSE_SUBMOUNTS
+ *
+ *  7.33
+ *  - add FUSE_HANDLE_KILLPRIV_V2, FUSE_WRITE_KILL_SUIDGID, FATTR_KILL_SUIDGID
+ *  - add FUSE_OPEN_KILL_SUIDGID
  */
 
 #ifndef _LINUX_FUSE_H
@@ -206,7 +210,7 @@
 #define FUSE_KERNEL_VERSION 7
 
 /** Minor version number of this interface */
-#define FUSE_KERNEL_MINOR_VERSION 32
+#define FUSE_KERNEL_MINOR_VERSION 33
 
 /** The node ID of the root inode */
 #define FUSE_ROOT_ID 1
@@ -267,6 +271,7 @@ struct fuse_file_lock {
 #define FATTR_MTIME_NOW	(1 << 8)
 #define FATTR_LOCKOWNER	(1 << 9)
 #define FATTR_CTIME	(1 << 10)
+#define FATTR_KILL_SUIDGID	(1 << 11)
 
 /**
  * Flags returned by the OPEN request
@@ -316,6 +321,11 @@ struct fuse_file_lock {
  *		       foffset and moffset fields in struct
  *		       fuse_setupmapping_out and fuse_removemapping_one.
  * FUSE_SUBMOUNTS: kernel supports auto-mounting directory submounts
+ * FUSE_HANDLE_KILLPRIV_V2: fs kills suid/sgid/cap on write/chown/trunc.
+ *			Upon write/truncate suid/sgid is only killed if caller
+ *			does not have CAP_FSETID. Additionally upon
+ *			write/truncate sgid is killed only if file has group
+ *			execute permission. (Same as Linux VFS behavior).
  */
 #define FUSE_ASYNC_READ		(1 << 0)
 #define FUSE_POSIX_LOCKS	(1 << 1)
@@ -345,6 +355,7 @@ struct fuse_file_lock {
 #define FUSE_EXPLICIT_INVAL_DATA (1 << 25)
 #define FUSE_MAP_ALIGNMENT	(1 << 26)
 #define FUSE_SUBMOUNTS		(1 << 27)
+#define FUSE_HANDLE_KILLPRIV_V2	(1 << 28)
 
 /**
  * CUSE INIT request/reply flags
@@ -374,11 +385,14 @@ struct fuse_file_lock {
  *
  * FUSE_WRITE_CACHE: delayed write from page cache, file handle is guessed
  * FUSE_WRITE_LOCKOWNER: lock_owner field is valid
- * FUSE_WRITE_KILL_PRIV: kill suid and sgid bits
+ * FUSE_WRITE_KILL_SUIDGID: kill suid and sgid bits
  */
 #define FUSE_WRITE_CACHE	(1 << 0)
 #define FUSE_WRITE_LOCKOWNER	(1 << 1)
-#define FUSE_WRITE_KILL_PRIV	(1 << 2)
+#define FUSE_WRITE_KILL_SUIDGID (1 << 2)
+
+/* Obsolete alias; this flag implies killing suid/sgid only. */
+#define FUSE_WRITE_KILL_PRIV	FUSE_WRITE_KILL_SUIDGID
 
 /**
  * Read flags
@@ -427,6 +441,12 @@ struct fuse_file_lock {
  */
 #define FUSE_ATTR_SUBMOUNT      (1 << 0)
 
+/**
+ * Open flags
+ * FUSE_OPEN_KILL_SUIDGID: Kill suid and sgid if executable
+ */
+#define FUSE_OPEN_KILL_SUIDGID	(1 << 0)
+
 enum fuse_opcode {
 	FUSE_LOOKUP		= 1,
 	FUSE_FORGET		= 2,  /* no reply */
@@ -588,14 +608,14 @@ struct fuse_setattr_in {
 
 struct fuse_open_in {
 	uint32_t	flags;
-	uint32_t	unused;
+	uint32_t	open_flags;	/* FUSE_OPEN_... */
 };
 
 struct fuse_create_in {
 	uint32_t	flags;
 	uint32_t	mode;
 	uint32_t	umask;
-	uint32_t	padding;
+	uint32_t	open_flags;	/* FUSE_OPEN_... */
 };
 
 struct fuse_open_out {
diff --git a/include/standard-headers/linux/kernel.h b/include/standard-headers/linux/kernel.h
index 1eeba2ef9242..7848c5ae25e2 100644
--- a/include/standard-headers/linux/kernel.h
+++ b/include/standard-headers/linux/kernel.h
@@ -3,13 +3,6 @@
 #define _LINUX_KERNEL_H
 
 #include "standard-headers/linux/sysinfo.h"
-
-/*
- * 'kernel.h' contains some often-used function prototypes etc
- */
-#define __ALIGN_KERNEL(x, a)		__ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1)
-#define __ALIGN_KERNEL_MASK(x, mask)	(((x) + (mask)) & ~(mask))
-
-#define __KERNEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
+#include "standard-headers/linux/const.h"
 
 #endif /* _LINUX_KERNEL_H */
diff --git a/include/standard-headers/linux/pci_regs.h b/include/standard-headers/linux/pci_regs.h
index a95d55f9f257..e709ae8235e7 100644
--- a/include/standard-headers/linux/pci_regs.h
+++ b/include/standard-headers/linux/pci_regs.h
@@ -531,6 +531,7 @@
 #define  PCI_EXP_LNKCAP_SLS_8_0GB 0x00000003 /* LNKCAP2 SLS Vector bit 2 */
 #define  PCI_EXP_LNKCAP_SLS_16_0GB 0x00000004 /* LNKCAP2 SLS Vector bit 3 */
 #define  PCI_EXP_LNKCAP_SLS_32_0GB 0x00000005 /* LNKCAP2 SLS Vector bit 4 */
+#define  PCI_EXP_LNKCAP_SLS_64_0GB 0x00000006 /* LNKCAP2 SLS Vector bit 5 */
 #define  PCI_EXP_LNKCAP_MLW	0x000003f0 /* Maximum Link Width */
 #define  PCI_EXP_LNKCAP_ASPMS	0x00000c00 /* ASPM Support */
 #define  PCI_EXP_LNKCAP_ASPM_L0S 0x00000400 /* ASPM L0s Support */
@@ -562,6 +563,7 @@
 #define  PCI_EXP_LNKSTA_CLS_8_0GB 0x0003 /* Current Link Speed 8.0GT/s */
 #define  PCI_EXP_LNKSTA_CLS_16_0GB 0x0004 /* Current Link Speed 16.0GT/s */
 #define  PCI_EXP_LNKSTA_CLS_32_0GB 0x0005 /* Current Link Speed 32.0GT/s */
+#define  PCI_EXP_LNKSTA_CLS_64_0GB 0x0006 /* Current Link Speed 64.0GT/s */
 #define  PCI_EXP_LNKSTA_NLW	0x03f0	/* Negotiated Link Width */
 #define  PCI_EXP_LNKSTA_NLW_X1	0x0010	/* Current Link Width x1 */
 #define  PCI_EXP_LNKSTA_NLW_X2	0x0020	/* Current Link Width x2 */
@@ -670,6 +672,7 @@
 #define  PCI_EXP_LNKCAP2_SLS_8_0GB	0x00000008 /* Supported Speed 8GT/s */
 #define  PCI_EXP_LNKCAP2_SLS_16_0GB	0x00000010 /* Supported Speed 16GT/s */
 #define  PCI_EXP_LNKCAP2_SLS_32_0GB	0x00000020 /* Supported Speed 32GT/s */
+#define  PCI_EXP_LNKCAP2_SLS_64_0GB	0x00000040 /* Supported Speed 64GT/s */
 #define  PCI_EXP_LNKCAP2_CROSSLINK	0x00000100 /* Crosslink supported */
 #define PCI_EXP_LNKCTL2		48	/* Link Control 2 */
 #define  PCI_EXP_LNKCTL2_TLS		0x000f
@@ -678,6 +681,7 @@
 #define  PCI_EXP_LNKCTL2_TLS_8_0GT	0x0003 /* Supported Speed 8GT/s */
 #define  PCI_EXP_LNKCTL2_TLS_16_0GT	0x0004 /* Supported Speed 16GT/s */
 #define  PCI_EXP_LNKCTL2_TLS_32_0GT	0x0005 /* Supported Speed 32GT/s */
+#define  PCI_EXP_LNKCTL2_TLS_64_0GT	0x0006 /* Supported Speed 64GT/s */
 #define  PCI_EXP_LNKCTL2_ENTER_COMP	0x0010 /* Enter Compliance */
 #define  PCI_EXP_LNKCTL2_TX_MARGIN	0x0380 /* Transmit Margin */
 #define  PCI_EXP_LNKCTL2_HASD		0x0020 /* HW Autonomous Speed Disable */
@@ -723,6 +727,7 @@
 #define PCI_EXT_CAP_ID_DPC	0x1D	/* Downstream Port Containment */
 #define PCI_EXT_CAP_ID_L1SS	0x1E	/* L1 PM Substates */
 #define PCI_EXT_CAP_ID_PTM	0x1F	/* Precision Time Measurement */
+#define PCI_EXT_CAP_ID_DVSEC	0x23	/* Designated Vendor-Specific */
 #define PCI_EXT_CAP_ID_DLF	0x25	/* Data Link Feature */
 #define PCI_EXT_CAP_ID_PL_16GT	0x26	/* Physical Layer 16.0 GT/s */
 #define PCI_EXT_CAP_ID_MAX	PCI_EXT_CAP_ID_PL_16GT
@@ -831,6 +836,13 @@
 #define  PCI_PWR_CAP_BUDGET(x)	((x) & 1)	/* Included in system budget */
 #define PCI_EXT_CAP_PWR_SIZEOF	16
 
+/* Root Complex Event Collector Endpoint Association  */
+#define PCI_RCEC_RCIEP_BITMAP	4	/* Associated Bitmap for RCiEPs */
+#define PCI_RCEC_BUSN		8	/* RCEC Associated Bus Numbers */
+#define  PCI_RCEC_BUSN_REG_VER	0x02	/* Least version with BUSN present */
+#define  PCI_RCEC_BUSN_NEXT(x)	(((x) >> 8) & 0xff)
+#define  PCI_RCEC_BUSN_LAST(x)	(((x) >> 16) & 0xff)
+
 /* Vendor-Specific (VSEC, PCI_EXT_CAP_ID_VNDR) */
 #define PCI_VNDR_HEADER		4	/* Vendor-Specific Header */
 #define  PCI_VNDR_HEADER_ID(x)	((x) & 0xffff)
@@ -1066,6 +1078,10 @@
 #define  PCI_L1SS_CTL1_LTR_L12_TH_SCALE	0xe0000000  /* LTR_L1.2_THRESHOLD_Scale */
 #define PCI_L1SS_CTL2		0x0c	/* Control 2 Register */
 
+/* Designated Vendor-Specific (DVSEC, PCI_EXT_CAP_ID_DVSEC) */
+#define PCI_DVSEC_HEADER1		0x4 /* Designated Vendor-Specific Header1 */
+#define PCI_DVSEC_HEADER2		0x8 /* Designated Vendor-Specific Header2 */
+
 /* Data Link Feature */
 #define PCI_DLF_CAP		0x04	/* Capabilities Register */
 #define  PCI_DLF_EXCHANGE_ENABLE	0x80000000  /* Data Link Feature Exchange Enable */
diff --git a/include/standard-headers/linux/vhost_types.h b/include/standard-headers/linux/vhost_types.h
index 486630b33287..0bd2684a2ae4 100644
--- a/include/standard-headers/linux/vhost_types.h
+++ b/include/standard-headers/linux/vhost_types.h
@@ -138,6 +138,15 @@ struct vhost_vdpa_config {
 	uint8_t buf[0];
 };
 
+/* vhost vdpa IOVA range
+ * @first: First address that can be mapped by vhost-vDPA
+ * @last: Last address that can be mapped by vhost-vDPA
+ */
+struct vhost_vdpa_iova_range {
+	uint64_t first;
+	uint64_t last;
+};
+
 /* Feature bits */
 /* Log all write descriptors. Can be changed while device is active. */
 #define VHOST_F_LOG_ALL 26
diff --git a/include/standard-headers/linux/virtio_gpu.h b/include/standard-headers/linux/virtio_gpu.h
index 4183cdc74b33..1357e4774ea6 100644
--- a/include/standard-headers/linux/virtio_gpu.h
+++ b/include/standard-headers/linux/virtio_gpu.h
@@ -55,6 +55,11 @@
  */
 #define VIRTIO_GPU_F_RESOURCE_UUID       2
 
+/*
+ * VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB
+ */
+#define VIRTIO_GPU_F_RESOURCE_BLOB       3
+
 enum virtio_gpu_ctrl_type {
 	VIRTIO_GPU_UNDEFINED = 0,
 
@@ -71,6 +76,8 @@ enum virtio_gpu_ctrl_type {
 	VIRTIO_GPU_CMD_GET_CAPSET,
 	VIRTIO_GPU_CMD_GET_EDID,
 	VIRTIO_GPU_CMD_RESOURCE_ASSIGN_UUID,
+	VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB,
+	VIRTIO_GPU_CMD_SET_SCANOUT_BLOB,
 
 	/* 3d commands */
 	VIRTIO_GPU_CMD_CTX_CREATE = 0x0200,
@@ -81,6 +88,8 @@ enum virtio_gpu_ctrl_type {
 	VIRTIO_GPU_CMD_TRANSFER_TO_HOST_3D,
 	VIRTIO_GPU_CMD_TRANSFER_FROM_HOST_3D,
 	VIRTIO_GPU_CMD_SUBMIT_3D,
+	VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB,
+	VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB,
 
 	/* cursor commands */
 	VIRTIO_GPU_CMD_UPDATE_CURSOR = 0x0300,
@@ -93,6 +102,7 @@ enum virtio_gpu_ctrl_type {
 	VIRTIO_GPU_RESP_OK_CAPSET,
 	VIRTIO_GPU_RESP_OK_EDID,
 	VIRTIO_GPU_RESP_OK_RESOURCE_UUID,
+	VIRTIO_GPU_RESP_OK_MAP_INFO,
 
 	/* error responses */
 	VIRTIO_GPU_RESP_ERR_UNSPEC = 0x1200,
@@ -103,6 +113,15 @@ enum virtio_gpu_ctrl_type {
 	VIRTIO_GPU_RESP_ERR_INVALID_PARAMETER,
 };
 
+enum virtio_gpu_shm_id {
+	VIRTIO_GPU_SHM_ID_UNDEFINED = 0,
+	/*
+	 * VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB
+	 * VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB
+	 */
+	VIRTIO_GPU_SHM_ID_HOST_VISIBLE = 1
+};
+
 #define VIRTIO_GPU_FLAG_FENCE (1 << 0)
 
 struct virtio_gpu_ctrl_hdr {
@@ -359,4 +378,67 @@ struct virtio_gpu_resp_resource_uuid {
 	uint8_t uuid[16];
 };
 
+/* VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB */
+struct virtio_gpu_resource_create_blob {
+	struct virtio_gpu_ctrl_hdr hdr;
+	uint32_t resource_id;
+#define VIRTIO_GPU_BLOB_MEM_GUEST             0x0001
+#define VIRTIO_GPU_BLOB_MEM_HOST3D            0x0002
+#define VIRTIO_GPU_BLOB_MEM_HOST3D_GUEST      0x0003
+
+#define VIRTIO_GPU_BLOB_FLAG_USE_MAPPABLE     0x0001
+#define VIRTIO_GPU_BLOB_FLAG_USE_SHAREABLE    0x0002
+#define VIRTIO_GPU_BLOB_FLAG_USE_CROSS_DEVICE 0x0004
+	/* zero is invalid blob mem */
+	uint32_t blob_mem;
+	uint32_t blob_flags;
+	uint32_t nr_entries;
+	uint64_t blob_id;
+	uint64_t size;
+	/*
+	 * sizeof(nr_entries * virtio_gpu_mem_entry) bytes follow
+	 */
+};
+
+/* VIRTIO_GPU_CMD_SET_SCANOUT_BLOB */
+struct virtio_gpu_set_scanout_blob {
+	struct virtio_gpu_ctrl_hdr hdr;
+	struct virtio_gpu_rect r;
+	uint32_t scanout_id;
+	uint32_t resource_id;
+	uint32_t width;
+	uint32_t height;
+	uint32_t format;
+	uint32_t padding;
+	uint32_t strides[4];
+	uint32_t offsets[4];
+};
+
+/* VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB */
+struct virtio_gpu_resource_map_blob {
+	struct virtio_gpu_ctrl_hdr hdr;
+	uint32_t resource_id;
+	uint32_t padding;
+	uint64_t offset;
+};
+
+/* VIRTIO_GPU_RESP_OK_MAP_INFO */
+#define VIRTIO_GPU_MAP_CACHE_MASK     0x0f
+#define VIRTIO_GPU_MAP_CACHE_NONE     0x00
+#define VIRTIO_GPU_MAP_CACHE_CACHED   0x01
+#define VIRTIO_GPU_MAP_CACHE_UNCACHED 0x02
+#define VIRTIO_GPU_MAP_CACHE_WC       0x03
+struct virtio_gpu_resp_map_info {
+	struct virtio_gpu_ctrl_hdr hdr;
+	uint32_t map_info;
+	uint32_t padding;
+};
+
+/* VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB */
+struct virtio_gpu_resource_unmap_blob {
+	struct virtio_gpu_ctrl_hdr hdr;
+	uint32_t resource_id;
+	uint32_t padding;
+};
+
 #endif
diff --git a/include/standard-headers/linux/virtio_ids.h b/include/standard-headers/linux/virtio_ids.h
index b052355ac7a3..bc1c0621f5ed 100644
--- a/include/standard-headers/linux/virtio_ids.h
+++ b/include/standard-headers/linux/virtio_ids.h
@@ -29,24 +29,30 @@
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE. */
 
-#define VIRTIO_ID_NET		1 /* virtio net */
-#define VIRTIO_ID_BLOCK		2 /* virtio block */
-#define VIRTIO_ID_CONSOLE	3 /* virtio console */
-#define VIRTIO_ID_RNG		4 /* virtio rng */
-#define VIRTIO_ID_BALLOON	5 /* virtio balloon */
-#define VIRTIO_ID_RPMSG		7 /* virtio remote processor messaging */
-#define VIRTIO_ID_SCSI		8 /* virtio scsi */
-#define VIRTIO_ID_9P		9 /* 9p virtio console */
-#define VIRTIO_ID_RPROC_SERIAL 11 /* virtio remoteproc serial link */
-#define VIRTIO_ID_CAIF	       12 /* Virtio caif */
-#define VIRTIO_ID_GPU          16 /* virtio GPU */
-#define VIRTIO_ID_INPUT        18 /* virtio input */
-#define VIRTIO_ID_VSOCK        19 /* virtio vsock transport */
-#define VIRTIO_ID_CRYPTO       20 /* virtio crypto */
-#define VIRTIO_ID_IOMMU        23 /* virtio IOMMU */
-#define VIRTIO_ID_MEM          24 /* virtio mem */
-#define VIRTIO_ID_FS           26 /* virtio filesystem */
-#define VIRTIO_ID_PMEM         27 /* virtio pmem */
-#define VIRTIO_ID_MAC80211_HWSIM 29 /* virtio mac80211-hwsim */
+#define VIRTIO_ID_NET			1 /* virtio net */
+#define VIRTIO_ID_BLOCK			2 /* virtio block */
+#define VIRTIO_ID_CONSOLE		3 /* virtio console */
+#define VIRTIO_ID_RNG			4 /* virtio rng */
+#define VIRTIO_ID_BALLOON		5 /* virtio balloon */
+#define VIRTIO_ID_IOMEM			6 /* virtio ioMemory */
+#define VIRTIO_ID_RPMSG			7 /* virtio remote processor messaging */
+#define VIRTIO_ID_SCSI			8 /* virtio scsi */
+#define VIRTIO_ID_9P			9 /* 9p virtio console */
+#define VIRTIO_ID_MAC80211_WLAN		10 /* virtio WLAN MAC */
+#define VIRTIO_ID_RPROC_SERIAL		11 /* virtio remoteproc serial link */
+#define VIRTIO_ID_CAIF			12 /* Virtio caif */
+#define VIRTIO_ID_MEMORY_BALLOON	13 /* virtio memory balloon */
+#define VIRTIO_ID_GPU			16 /* virtio GPU */
+#define VIRTIO_ID_CLOCK			17 /* virtio clock/timer */
+#define VIRTIO_ID_INPUT			18 /* virtio input */
+#define VIRTIO_ID_VSOCK			19 /* virtio vsock transport */
+#define VIRTIO_ID_CRYPTO		20 /* virtio crypto */
+#define VIRTIO_ID_SIGNAL_DIST		21 /* virtio signal distribution device */
+#define VIRTIO_ID_PSTORE		22 /* virtio pstore device */
+#define VIRTIO_ID_IOMMU			23 /* virtio IOMMU */
+#define VIRTIO_ID_MEM			24 /* virtio mem */
+#define VIRTIO_ID_FS			26 /* virtio filesystem */
+#define VIRTIO_ID_PMEM			27 /* virtio pmem */
+#define VIRTIO_ID_MAC80211_HWSIM	29 /* virtio mac80211-hwsim */
 
 #endif /* _LINUX_VIRTIO_IDS_H */
diff --git a/linux-headers/asm-arm64/kvm.h b/linux-headers/asm-arm64/kvm.h
index a72de1ae4cb5..b6a0eaa32ae6 100644
--- a/linux-headers/asm-arm64/kvm.h
+++ b/linux-headers/asm-arm64/kvm.h
@@ -156,9 +156,6 @@ struct kvm_sync_regs {
 	__u64 device_irq_level;
 };
 
-struct kvm_arch_memory_slot {
-};
-
 /*
  * PMU filter structure. Describe a range of events with a particular
  * action. To be used with KVM_ARM_VCPU_PMU_V3_FILTER.
diff --git a/linux-headers/asm-generic/unistd.h b/linux-headers/asm-generic/unistd.h
index 2056318988f7..728752917785 100644
--- a/linux-headers/asm-generic/unistd.h
+++ b/linux-headers/asm-generic/unistd.h
@@ -517,7 +517,7 @@ __SC_COMP(__NR_settimeofday, sys_settimeofday, compat_sys_settimeofday)
 __SC_3264(__NR_adjtimex, sys_adjtimex_time32, sys_adjtimex)
 #endif
 
-/* kernel/timer.c */
+/* kernel/sys.c */
 #define __NR_getpid 172
 __SYSCALL(__NR_getpid, sys_getpid)
 #define __NR_getppid 173
@@ -859,9 +859,11 @@ __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
 __SYSCALL(__NR_faccessat2, sys_faccessat2)
 #define __NR_process_madvise 440
 __SYSCALL(__NR_process_madvise, sys_process_madvise)
+#define __NR_epoll_pwait2 441
+__SC_COMP(__NR_epoll_pwait2, sys_epoll_pwait2, compat_sys_epoll_pwait2)
 
 #undef __NR_syscalls
-#define __NR_syscalls 441
+#define __NR_syscalls 442
 
 /*
  * 32 bit systems traditionally used different
diff --git a/linux-headers/asm-mips/unistd_n32.h b/linux-headers/asm-mips/unistd_n32.h
index aba284d190a0..59e53b6e076f 100644
--- a/linux-headers/asm-mips/unistd_n32.h
+++ b/linux-headers/asm-mips/unistd_n32.h
@@ -370,6 +370,7 @@
 #define __NR_pidfd_getfd	(__NR_Linux + 438)
 #define __NR_faccessat2	(__NR_Linux + 439)
 #define __NR_process_madvise	(__NR_Linux + 440)
+#define __NR_epoll_pwait2	(__NR_Linux + 441)
 
 
 #endif /* _ASM_MIPS_UNISTD_N32_H */
diff --git a/linux-headers/asm-mips/unistd_n64.h b/linux-headers/asm-mips/unistd_n64.h
index 0465ab94db89..683558a7f8ad 100644
--- a/linux-headers/asm-mips/unistd_n64.h
+++ b/linux-headers/asm-mips/unistd_n64.h
@@ -346,6 +346,7 @@
 #define __NR_pidfd_getfd	(__NR_Linux + 438)
 #define __NR_faccessat2	(__NR_Linux + 439)
 #define __NR_process_madvise	(__NR_Linux + 440)
+#define __NR_epoll_pwait2	(__NR_Linux + 441)
 
 
 #endif /* _ASM_MIPS_UNISTD_N64_H */
diff --git a/linux-headers/asm-mips/unistd_o32.h b/linux-headers/asm-mips/unistd_o32.h
index 5222a0dd50e1..ca6a7e5c0b91 100644
--- a/linux-headers/asm-mips/unistd_o32.h
+++ b/linux-headers/asm-mips/unistd_o32.h
@@ -416,6 +416,7 @@
 #define __NR_pidfd_getfd	(__NR_Linux + 438)
 #define __NR_faccessat2	(__NR_Linux + 439)
 #define __NR_process_madvise	(__NR_Linux + 440)
+#define __NR_epoll_pwait2	(__NR_Linux + 441)
 
 
 #endif /* _ASM_MIPS_UNISTD_O32_H */
diff --git a/linux-headers/asm-powerpc/unistd_32.h b/linux-headers/asm-powerpc/unistd_32.h
index 21066a3d5f4a..4624c9004368 100644
--- a/linux-headers/asm-powerpc/unistd_32.h
+++ b/linux-headers/asm-powerpc/unistd_32.h
@@ -423,6 +423,7 @@
 #define __NR_pidfd_getfd	438
 #define __NR_faccessat2	439
 #define __NR_process_madvise	440
+#define __NR_epoll_pwait2	441
 
 
 #endif /* _ASM_POWERPC_UNISTD_32_H */
diff --git a/linux-headers/asm-powerpc/unistd_64.h b/linux-headers/asm-powerpc/unistd_64.h
index c153da29f236..7e851b30bb13 100644
--- a/linux-headers/asm-powerpc/unistd_64.h
+++ b/linux-headers/asm-powerpc/unistd_64.h
@@ -395,6 +395,7 @@
 #define __NR_pidfd_getfd	438
 #define __NR_faccessat2	439
 #define __NR_process_madvise	440
+#define __NR_epoll_pwait2	441
 
 
 #endif /* _ASM_POWERPC_UNISTD_64_H */
diff --git a/linux-headers/asm-s390/unistd_32.h b/linux-headers/asm-s390/unistd_32.h
index 3b4f2dda6049..c94d2c3a22d6 100644
--- a/linux-headers/asm-s390/unistd_32.h
+++ b/linux-headers/asm-s390/unistd_32.h
@@ -413,5 +413,6 @@
 #define __NR_pidfd_getfd 438
 #define __NR_faccessat2 439
 #define __NR_process_madvise 440
+#define __NR_epoll_pwait2 441
 
 #endif /* _ASM_S390_UNISTD_32_H */
diff --git a/linux-headers/asm-s390/unistd_64.h b/linux-headers/asm-s390/unistd_64.h
index 030a51fa3828..984a06b7ebe4 100644
--- a/linux-headers/asm-s390/unistd_64.h
+++ b/linux-headers/asm-s390/unistd_64.h
@@ -361,5 +361,6 @@
 #define __NR_pidfd_getfd 438
 #define __NR_faccessat2 439
 #define __NR_process_madvise 440
+#define __NR_epoll_pwait2 441
 
 #endif /* _ASM_S390_UNISTD_64_H */
diff --git a/linux-headers/asm-x86/kvm.h b/linux-headers/asm-x86/kvm.h
index 89e5f3d1bba8..8e76d3701db3 100644
--- a/linux-headers/asm-x86/kvm.h
+++ b/linux-headers/asm-x86/kvm.h
@@ -12,6 +12,7 @@
 
 #define KVM_PIO_PAGE_OFFSET 1
 #define KVM_COALESCED_MMIO_PAGE_OFFSET 2
+#define KVM_DIRTY_LOG_PAGE_OFFSET 64
 
 #define DE_VECTOR 0
 #define DB_VECTOR 1
diff --git a/linux-headers/asm-x86/unistd_32.h b/linux-headers/asm-x86/unistd_32.h
index cfba368f9dff..18fb99dfa287 100644
--- a/linux-headers/asm-x86/unistd_32.h
+++ b/linux-headers/asm-x86/unistd_32.h
@@ -431,6 +431,7 @@
 #define __NR_pidfd_getfd 438
 #define __NR_faccessat2 439
 #define __NR_process_madvise 440
+#define __NR_epoll_pwait2 441
 
 
 #endif /* _ASM_X86_UNISTD_32_H */
diff --git a/linux-headers/asm-x86/unistd_64.h b/linux-headers/asm-x86/unistd_64.h
index 61af7250955f..bde959328d65 100644
--- a/linux-headers/asm-x86/unistd_64.h
+++ b/linux-headers/asm-x86/unistd_64.h
@@ -353,6 +353,7 @@
 #define __NR_pidfd_getfd 438
 #define __NR_faccessat2 439
 #define __NR_process_madvise 440
+#define __NR_epoll_pwait2 441
 
 
 #endif /* _ASM_X86_UNISTD_64_H */
diff --git a/linux-headers/asm-x86/unistd_x32.h b/linux-headers/asm-x86/unistd_x32.h
index a6890cb1f5b5..4ff6b17d3bb4 100644
--- a/linux-headers/asm-x86/unistd_x32.h
+++ b/linux-headers/asm-x86/unistd_x32.h
@@ -306,6 +306,7 @@
 #define __NR_pidfd_getfd (__X32_SYSCALL_BIT + 438)
 #define __NR_faccessat2 (__X32_SYSCALL_BIT + 439)
 #define __NR_process_madvise (__X32_SYSCALL_BIT + 440)
+#define __NR_epoll_pwait2 (__X32_SYSCALL_BIT + 441)
 #define __NR_rt_sigaction (__X32_SYSCALL_BIT + 512)
 #define __NR_rt_sigreturn (__X32_SYSCALL_BIT + 513)
 #define __NR_ioctl (__X32_SYSCALL_BIT + 514)
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 56ce14ad209f..020b62a619a7 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -250,6 +250,7 @@ struct kvm_hyperv_exit {
 #define KVM_EXIT_ARM_NISV         28
 #define KVM_EXIT_X86_RDMSR        29
 #define KVM_EXIT_X86_WRMSR        30
+#define KVM_EXIT_DIRTY_RING_FULL  31
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -1053,6 +1054,8 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_X86_USER_SPACE_MSR 188
 #define KVM_CAP_X86_MSR_FILTER 189
 #define KVM_CAP_ENFORCE_PV_FEATURE_CPUID 190
+#define KVM_CAP_SYS_HYPERV_CPUID 191
+#define KVM_CAP_DIRTY_LOG_RING 192
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -1511,7 +1514,7 @@ struct kvm_enc_region {
 /* Available with KVM_CAP_MANUAL_DIRTY_LOG_PROTECT_2 */
 #define KVM_CLEAR_DIRTY_LOG          _IOWR(KVMIO, 0xc0, struct kvm_clear_dirty_log)
 
-/* Available with KVM_CAP_HYPERV_CPUID */
+/* Available with KVM_CAP_HYPERV_CPUID (vcpu) / KVM_CAP_SYS_HYPERV_CPUID (system) */
 #define KVM_GET_SUPPORTED_HV_CPUID _IOWR(KVMIO, 0xc1, struct kvm_cpuid2)
 
 /* Available with KVM_CAP_ARM_SVE */
@@ -1557,6 +1560,9 @@ struct kvm_pv_cmd {
 /* Available with KVM_CAP_X86_MSR_FILTER */
 #define KVM_X86_SET_MSR_FILTER	_IOW(KVMIO,  0xc6, struct kvm_msr_filter)
 
+/* Available with KVM_CAP_DIRTY_LOG_RING */
+#define KVM_RESET_DIRTY_RINGS		_IO(KVMIO, 0xc7)
+
 /* Secure Encrypted Virtualization command */
 enum sev_cmd_id {
 	/* Guest initialization commands */
@@ -1710,4 +1716,52 @@ struct kvm_hyperv_eventfd {
 #define KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE    (1 << 0)
 #define KVM_DIRTY_LOG_INITIALLY_SET            (1 << 1)
 
+/*
+ * Arch needs to define the macro after implementing the dirty ring
+ * feature.  KVM_DIRTY_LOG_PAGE_OFFSET should be defined as the
+ * starting page offset of the dirty ring structures.
+ */
+#ifndef KVM_DIRTY_LOG_PAGE_OFFSET
+#define KVM_DIRTY_LOG_PAGE_OFFSET 0
+#endif
+
+/*
+ * KVM dirty GFN flags, defined as:
+ *
+ * |---------------+---------------+--------------|
+ * | bit 1 (reset) | bit 0 (dirty) | Status       |
+ * |---------------+---------------+--------------|
+ * |             0 |             0 | Invalid GFN  |
+ * |             0 |             1 | Dirty GFN    |
+ * |             1 |             X | GFN to reset |
+ * |---------------+---------------+--------------|
+ *
+ * Lifecycle of a dirty GFN goes like:
+ *
+ *      dirtied         harvested        reset
+ * 00 -----------> 01 -------------> 1X -------+
+ *  ^                                          |
+ *  |                                          |
+ *  +------------------------------------------+
+ *
+ * The userspace program is only responsible for the 01->1X state
+ * conversion after harvesting an entry.  Also, it must not skip any
+ * dirty bits, so that dirty bits are always harvested in sequence.
+ */
+#define KVM_DIRTY_GFN_F_DIRTY           BIT(0)
+#define KVM_DIRTY_GFN_F_RESET           BIT(1)
+#define KVM_DIRTY_GFN_F_MASK            0x3
+
+/*
+ * KVM dirty rings should be mapped at KVM_DIRTY_LOG_PAGE_OFFSET of
+ * per-vcpu mmaped regions as an array of struct kvm_dirty_gfn.  The
+ * size of the gfn buffer is decided by the first argument when
+ * enabling KVM_CAP_DIRTY_LOG_RING.
+ */
+struct kvm_dirty_gfn {
+	__u32 flags;
+	__u32 slot;
+	__u64 offset;
+};
+
 #endif /* __LINUX_KVM_H */
diff --git a/linux-headers/linux/userfaultfd.h b/linux-headers/linux/userfaultfd.h
index 8d3996eb8285..1ba9a9feeb83 100644
--- a/linux-headers/linux/userfaultfd.h
+++ b/linux-headers/linux/userfaultfd.h
@@ -257,4 +257,13 @@ struct uffdio_writeprotect {
 	__u64 mode;
 };
 
+/*
+ * Flags for the userfaultfd(2) system call itself.
+ */
+
+/*
+ * Create a userfaultfd that can handle page faults only in user mode.
+ */
+#define UFFD_USER_MODE_ONLY 1
+
 #endif /* _LINUX_USERFAULTFD_H */
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index b92dcc4dafd5..609099e455cd 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -820,6 +820,7 @@ enum {
 enum {
 	VFIO_CCW_IO_IRQ_INDEX,
 	VFIO_CCW_CRW_IRQ_INDEX,
+	VFIO_CCW_REQ_IRQ_INDEX,
 	VFIO_CCW_NUM_IRQS
 };
 
diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
index 75232185324a..c998860d7bbc 100644
--- a/linux-headers/linux/vhost.h
+++ b/linux-headers/linux/vhost.h
@@ -146,4 +146,8 @@
 
 /* Set event fd for config interrupt*/
 #define VHOST_VDPA_SET_CONFIG_CALL	_IOW(VHOST_VIRTIO, 0x77, int)
+
+/* Get the valid iova range */
+#define VHOST_VDPA_GET_IOVA_RANGE	_IOR(VHOST_VIRTIO, 0x78, \
+					     struct vhost_vdpa_iova_range)
 #endif
diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index 9efbaf2f84b3..1dc11af241b6 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -42,6 +42,7 @@ cp_portable() {
                                      -e 'drm.h' \
                                      -e 'limits' \
                                      -e 'linux/kernel' \
+                                     -e 'linux/const' \
                                      -e 'linux/sysinfo' \
                                      -e 'asm-generic/kvm_para' \
                                      > /dev/null
@@ -190,7 +191,9 @@ for i in "$tmpdir"/include/linux/*virtio*.h \
          "$tmpdir/include/linux/input.h" \
          "$tmpdir/include/linux/input-event-codes.h" \
          "$tmpdir/include/linux/pci_regs.h" \
-         "$tmpdir/include/linux/ethtool.h" "$tmpdir/include/linux/kernel.h" \
+         "$tmpdir/include/linux/ethtool.h" \
+         "$tmpdir/include/linux/kernel.h" \
+         "$tmpdir/include/linux/const.h" \
          "$tmpdir/include/linux/vhost_types.h" \
          "$tmpdir/include/linux/sysinfo.h"; do
     cp_portable "$i" "$output/include/standard-headers/linux"
-- 
2.29.2




* [PATCH v3 02/19] i386: introduce kvm_hv_evmcs_available()
  2021-01-07 15:06 [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
  2021-01-07 15:06 ` [PATCH v3 01/19] linux-headers: update against 5.11-rc2 Vitaly Kuznetsov
@ 2021-01-07 15:06 ` Vitaly Kuznetsov
  2021-01-07 15:06 ` [PATCH v3 03/19] i386: keep hyperv_vendor string up-to-date Vitaly Kuznetsov
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-07 15:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost, Igor Mammedov

The Enlightened VMCS feature is hardware specific: it is only supported on
Intel CPUs. Introduce a simple kvm_hv_evmcs_available() helper; it will
be used to filter out 'hv_evmcs' when the 'hyperv=on' option is added to
X86MachineClass.
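
For readers unfamiliar with the pattern, here is a minimal standalone
sketch of "probe the capability once at init, expose it through a cheap
accessor". Everything prefixed demo_/DEMO_ is hypothetical and only
stands in for kvm_check_extension() and the real capability number; it
is not the QEMU code itself.

```c
#include <stdbool.h>

/* Hypothetical stand-in for kvm_check_extension(): a caller-supplied
 * probe, so the sketch stays self-contained. */
typedef bool (*demo_cap_probe_fn)(int cap);

enum { DEMO_CAP_EVMCS = 1 };        /* hypothetical capability number */

static bool demo_evmcs_available;   /* cached at init time */

/* Called once during accelerator init: probe and cache the result. */
static void demo_arch_init(demo_cap_probe_fn probe)
{
    demo_evmcs_available = probe(DEMO_CAP_EVMCS);
}

/* Cheap accessor for callers that cannot run the probe themselves. */
static bool demo_hv_evmcs_available(void)
{
    return demo_evmcs_available;
}

/* Two hypothetical hosts: one reports the capability, one does not. */
static bool demo_probe_intel(int cap) { return cap == DEMO_CAP_EVMCS; }
static bool demo_probe_other(int cap) { (void)cap; return false; }
```

A build without KVM would then provide only the accessor, hardwired to
return false, which is what the kvm-stub.c hunk below does.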

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm/kvm-stub.c | 5 +++++
 target/i386/kvm/kvm.c      | 8 ++++++++
 target/i386/kvm/kvm_i386.h | 1 +
 3 files changed, 14 insertions(+)

diff --git a/target/i386/kvm/kvm-stub.c b/target/i386/kvm/kvm-stub.c
index 92f49121b8fa..0a163ae207c5 100644
--- a/target/i386/kvm/kvm-stub.c
+++ b/target/i386/kvm/kvm-stub.c
@@ -39,3 +39,8 @@ bool kvm_hv_vpindex_settable(void)
 {
     return false;
 }
+
+bool kvm_hv_evmcs_available(void)
+{
+    return false;
+}
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 6dc1ee052d5f..edaaed56c6e2 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -95,6 +95,7 @@ static bool has_msr_hv_crash;
 static bool has_msr_hv_reset;
 static bool has_msr_hv_vpindex;
 static bool hv_vpindex_settable;
+static bool hv_evmcs_available;
 static bool has_msr_hv_runtime;
 static bool has_msr_hv_synic;
 static bool has_msr_hv_stimer;
@@ -192,6 +193,11 @@ bool kvm_hv_vpindex_settable(void)
     return hv_vpindex_settable;
 }
 
+bool kvm_hv_evmcs_available(void)
+{
+    return hv_evmcs_available;
+}
+
 static int kvm_get_tsc(CPUState *cs)
 {
     X86CPU *cpu = X86_CPU(cs);
@@ -2146,6 +2152,8 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
     has_pit_state2 = kvm_check_extension(s, KVM_CAP_PIT_STATE2);
 
     hv_vpindex_settable = kvm_check_extension(s, KVM_CAP_HYPERV_VP_INDEX);
+    hv_evmcs_available =
+        kvm_check_extension(s, KVM_CAP_HYPERV_ENLIGHTENED_VMCS);
 
     has_exception_payload = kvm_check_extension(s, KVM_CAP_EXCEPTION_PAYLOAD);
     if (has_exception_payload) {
diff --git a/target/i386/kvm/kvm_i386.h b/target/i386/kvm/kvm_i386.h
index dc725083891c..08968cfb33f1 100644
--- a/target/i386/kvm/kvm_i386.h
+++ b/target/i386/kvm/kvm_i386.h
@@ -47,6 +47,7 @@ bool kvm_has_x2apic_api(void);
 bool kvm_has_waitpkg(void);
 
 bool kvm_hv_vpindex_settable(void);
+bool kvm_hv_evmcs_available(void);
 
 uint64_t kvm_swizzle_msi_ext_dest_id(uint64_t address);
 
-- 
2.29.2




* [PATCH v3 03/19] i386: keep hyperv_vendor string up-to-date
  2021-01-07 15:06 [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
  2021-01-07 15:06 ` [PATCH v3 01/19] linux-headers: update against 5.11-rc2 Vitaly Kuznetsov
  2021-01-07 15:06 ` [PATCH v3 02/19] i386: introduce kvm_hv_evmcs_available() Vitaly Kuznetsov
@ 2021-01-07 15:06 ` Vitaly Kuznetsov
  2021-01-07 15:06 ` [PATCH v3 04/19] i386: invert hyperv_spinlock_attempts setting logic with hv_passthrough Vitaly Kuznetsov
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-07 15:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost, Igor Mammedov

When cpu->hyperv_vendor is not set manually, we default to "Microsoft Hv",
and in 'hv_passthrough' mode we get the information from the host. This
information is stored in the cpu->hyperv_vendor_id[] array, but we don't
update the cpu->hyperv_vendor string, so e.g. QMP's
query-cpu-model-expansion output is incorrect. Keep the string in sync
with the array in both cases.
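
To illustrate the two representations being kept in sync, here is a
small standalone sketch of converting between a vendor string and the
three 32-bit registers (EBX, ECX, EDX, in that order). The demo_ names
are hypothetical. The sketch uses shifts so that the first character
lands in the low byte of EBX, which is the layout a memcpy() produces
on a little-endian host.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_VENDOR_LEN 12  /* three 32-bit registers: EBX, ECX, EDX */

/* Pack up to 12 characters into three registers, zero-padding the rest.
 * Returns the number of characters actually stored. */
static size_t demo_vendor_to_regs(const char *vendor, uint32_t regs[3])
{
    size_t len = strlen(vendor);
    size_t i;

    if (len > DEMO_VENDOR_LEN) {
        len = DEMO_VENDOR_LEN;  /* longer strings are truncated */
    }
    regs[0] = regs[1] = regs[2] = 0;
    for (i = 0; i < len; i++) {
        regs[i / 4] |= (uint32_t)(unsigned char)vendor[i] << (8 * (i % 4));
    }
    return len;
}

/* Unpack the registers into a string; out must hold 13 bytes. */
static void demo_regs_to_vendor(const uint32_t regs[3],
                                char out[DEMO_VENDOR_LEN + 1])
{
    size_t i;

    for (i = 0; i < DEMO_VENDOR_LEN; i++) {
        out[i] = (char)((regs[i / 4] >> (8 * (i % 4))) & 0xff);
    }
    out[DEMO_VENDOR_LEN] = '\0';  /* the raw registers carry no terminator */
}
```

Note that the register form has no room for a terminator, so whichever
side builds the C string has to add the trailing NUL explicitly.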

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/cpu.c     | 19 +++++++++----------
 target/i386/kvm/kvm.c |  4 ++++
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 35459a38bb1c..606474e5c9ca 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6549,17 +6549,16 @@ static void x86_cpu_hyperv_realize(X86CPU *cpu)
 
     /* Hyper-V vendor id */
     if (!cpu->hyperv_vendor) {
-        memcpy(cpu->hyperv_vendor_id, "Microsoft Hv", 12);
-    } else {
-        len = strlen(cpu->hyperv_vendor);
-
-        if (len > 12) {
-            warn_report("hv-vendor-id truncated to 12 characters");
-            len = 12;
-        }
-        memset(cpu->hyperv_vendor_id, 0, 12);
-        memcpy(cpu->hyperv_vendor_id, cpu->hyperv_vendor, len);
+        object_property_set_str(OBJECT(cpu), "hv-vendor-id", "Microsoft Hv",
+                                &error_abort);
+    }
+    len = strlen(cpu->hyperv_vendor);
+    if (len > 12) {
+        warn_report("hv-vendor-id truncated to 12 characters");
+        len = 12;
     }
+    memset(cpu->hyperv_vendor_id, 0, 12);
+    memcpy(cpu->hyperv_vendor_id, cpu->hyperv_vendor, len);
 
     /* 'Hv#1' interface identification*/
     cpu->hyperv_interface_id[0] = 0x31237648;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index edaaed56c6e2..07a3729b0dee 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1218,6 +1218,10 @@ static int hyperv_handle_properties(CPUState *cs,
             cpu->hyperv_vendor_id[0] = c->ebx;
             cpu->hyperv_vendor_id[1] = c->ecx;
             cpu->hyperv_vendor_id[2] = c->edx;
+            cpu->hyperv_vendor = g_realloc(cpu->hyperv_vendor,
+                                           sizeof(cpu->hyperv_vendor_id) + 1);
+            memcpy(cpu->hyperv_vendor, cpu->hyperv_vendor_id,
+                   sizeof(cpu->hyperv_vendor_id));
         }
 
         c = cpuid_find_entry(cpuid, HV_CPUID_INTERFACE, 0);
-- 
2.29.2




* [PATCH v3 04/19] i386: invert hyperv_spinlock_attempts setting logic with hv_passthrough
  2021-01-07 15:06 [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (2 preceding siblings ...)
  2021-01-07 15:06 ` [PATCH v3 03/19] i386: keep hyperv_vendor string up-to-date Vitaly Kuznetsov
@ 2021-01-07 15:06 ` Vitaly Kuznetsov
  2021-01-07 15:06 ` [PATCH v3 05/19] i386: always fill Hyper-V CPUID feature leaves from X86CPU data Vitaly Kuznetsov
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-07 15:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost, Igor Mammedov

There is no need to have this special case: as with all other Hyper-V
enlightenments, we can just use the kernel-supplied value in
hv_passthrough mode.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm/kvm.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 07a3729b0dee..e50b9cac2494 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1258,11 +1258,7 @@ static int hyperv_handle_properties(CPUState *cs,
         c = cpuid_find_entry(cpuid, HV_CPUID_ENLIGHTMENT_INFO, 0);
         if (c) {
             env->features[FEAT_HV_RECOMM_EAX] = c->eax;
-
-            /* hv-spinlocks may have been overriden */
-            if (cpu->hyperv_spinlock_attempts != HYPERV_SPINLOCK_NEVER_NOTIFY) {
-                c->ebx = cpu->hyperv_spinlock_attempts;
-            }
+            cpu->hyperv_spinlock_attempts = c->ebx;
         }
         c = cpuid_find_entry(cpuid, HV_CPUID_NESTED_FEATURES, 0);
         if (c) {
-- 
2.29.2




* [PATCH v3 05/19] i386: always fill Hyper-V CPUID feature leaves from X86CPU data
  2021-01-07 15:06 [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (3 preceding siblings ...)
  2021-01-07 15:06 ` [PATCH v3 04/19] i386: invert hyperv_spinlock_attempts setting logic with hv_passthrough Vitaly Kuznetsov
@ 2021-01-07 15:06 ` Vitaly Kuznetsov
  2021-01-07 15:06 ` [PATCH v3 06/19] i386: stop using env->features[] for filling Hyper-V CPUIDs Vitaly Kuznetsov
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-07 15:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost, Igor Mammedov

We already have all the required data in X86CPU, and as we are about to
split hyperv_handle_properties() into hyperv_expand_features()/
hyperv_fill_cpuids(), we can remove the blind copy. The functional change
is that QEMU won't pass CPUID leaves it doesn't currently know about
to the guest, which is arguably a good change.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm/kvm.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index e50b9cac2494..4a85d62bdaad 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1210,9 +1210,6 @@ static int hyperv_handle_properties(CPUState *cs,
     }
 
     if (cpu->hyperv_passthrough) {
-        memcpy(cpuid_ent, &cpuid->entries[0],
-               cpuid->nent * sizeof(cpuid->entries[0]));
-
         c = cpuid_find_entry(cpuid, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, 0);
         if (c) {
             cpu->hyperv_vendor_id[0] = c->ebx;
@@ -1311,12 +1308,6 @@ static int hyperv_handle_properties(CPUState *cs,
         goto free;
     }
 
-    if (cpu->hyperv_passthrough) {
-        /* We already copied all feature words from KVM as is */
-        r = cpuid->nent;
-        goto free;
-    }
-
     c = &cpuid_ent[cpuid_i++];
     c->function = HV_CPUID_VENDOR_AND_MAX_FUNCTIONS;
     c->eax = hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS) ?
-- 
2.29.2




* [PATCH v3 06/19] i386: stop using env->features[] for filling Hyper-V CPUIDs
  2021-01-07 15:06 [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (4 preceding siblings ...)
  2021-01-07 15:06 ` [PATCH v3 05/19] i386: always fill Hyper-V CPUID feature leaves from X86CPU data Vitaly Kuznetsov
@ 2021-01-07 15:06 ` Vitaly Kuznetsov
  2021-01-07 15:06 ` [PATCH v3 07/19] i386: introduce hyperv_feature_supported() Vitaly Kuznetsov
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-07 15:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost, Igor Mammedov

As a preparatory patch for dropping Hyper-V CPUID leaves from
feature_word_info[], stop using env->features[] as temporary storage
for Hyper-V CPUIDs and build Hyper-V CPUID leaves directly from
kvm_hyperv_properties[] data.
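
The idea can be sketched outside of QEMU as follows. The table contents,
the feature-word identifiers and the 'enabled' bitmask are all
hypothetical; only the shape (each feature contributes bits to one or
more feature words, and a leaf is the OR of all enabled contributions)
mirrors what hv_build_cpuid_leaf() below does.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature-word identifiers; 0 means "slot unused". */
enum { DEMO_FW_NONE = 0, DEMO_FW_EAX, DEMO_FW_EDX };

struct demo_hv_flag {
    uint32_t fw;    /* which feature word the bits belong to */
    uint32_t bits;  /* bits contributed to that word */
};

struct demo_hv_property {
    struct demo_hv_flag flags[2];
};

/* Hypothetical table: feature 1 touches two words, the others one. */
static const struct demo_hv_property demo_props[] = {
    { .flags = { { DEMO_FW_EAX, 0x01 } } },
    { .flags = { { DEMO_FW_EAX, 0x06 }, { DEMO_FW_EDX, 0x10 } } },
    { .flags = { { DEMO_FW_EDX, 0x20 } } },
};

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* OR together the bits of every enabled feature that target word fw.
 * 'enabled' is a bitmask indexed by feature number. */
static uint32_t demo_build_leaf(uint64_t enabled, uint32_t fw)
{
    uint32_t r = 0;
    size_t i, j;

    for (i = 0; i < DEMO_ARRAY_SIZE(demo_props); i++) {
        if (!(enabled & (1ull << i))) {
            continue;
        }
        for (j = 0; j < DEMO_ARRAY_SIZE(demo_props[i].flags); j++) {
            if (demo_props[i].flags[j].fw == fw) {
                r |= demo_props[i].flags[j].bits;
            }
        }
    }
    return r;
}
```

With features 0 and 1 enabled (mask 0x3), the EAX word comes out as
0x07 and the EDX word as 0x10; no intermediate per-word storage is
needed.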

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/cpu.h     |  1 +
 target/i386/kvm/kvm.c | 80 +++++++++++++++++++++++--------------------
 2 files changed, 43 insertions(+), 38 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index af130512e220..a9fab5adbdfb 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1667,6 +1667,7 @@ struct X86CPU {
     uint32_t hyperv_interface_id[4];
     uint32_t hyperv_version_id[4];
     uint32_t hyperv_limits[3];
+    uint32_t hyperv_nested[4];
 
     bool check_cpuid;
     bool enforce_cpuid;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 4a85d62bdaad..768e08fa5e8f 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1114,7 +1114,6 @@ static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
                                   int feature)
 {
     X86CPU *cpu = X86_CPU(cs);
-    CPUX86State *env = &cpu->env;
     uint32_t r, fw, bits;
     uint64_t deps;
     int i, dep_feat;
@@ -1154,8 +1153,6 @@ static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
                 return 0;
             }
         }
-
-        env->features[fw] |= bits;
     }
 
     if (cpu->hyperv_passthrough) {
@@ -1165,6 +1162,29 @@ static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
     return 0;
 }
 
+static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t fw)
+{
+    X86CPU *cpu = X86_CPU(cs);
+    uint32_t r = 0;
+    int i, j;
+
+    for (i = 0; i < ARRAY_SIZE(kvm_hyperv_properties); i++) {
+        if (!hyperv_feat_enabled(cpu, i)) {
+            continue;
+        }
+
+        for (j = 0; j < ARRAY_SIZE(kvm_hyperv_properties[i].flags); j++) {
+            if (kvm_hyperv_properties[i].flags[j].fw != fw) {
+                continue;
+            }
+
+            r |= kvm_hyperv_properties[i].flags[j].bits;
+        }
+    }
+
+    return r;
+}
+
 /*
  * Fill in Hyper-V CPUIDs. Returns the number of entries filled in cpuid_ent in
  * case of success, errno < 0 in case of failure and 0 when no Hyper-V
@@ -1174,9 +1194,8 @@ static int hyperv_handle_properties(CPUState *cs,
                                     struct kvm_cpuid_entry2 *cpuid_ent)
 {
     X86CPU *cpu = X86_CPU(cs);
-    CPUX86State *env = &cpu->env;
     struct kvm_cpuid2 *cpuid;
-    struct kvm_cpuid_entry2 *c;
+    struct kvm_cpuid_entry2 *c, *c2;
     uint32_t cpuid_i = 0;
     int r;
 
@@ -1197,9 +1216,7 @@ static int hyperv_handle_properties(CPUState *cs,
         }
 
         if (!r) {
-            env->features[FEAT_HV_RECOMM_EAX] |=
-                HV_ENLIGHTENED_VMCS_RECOMMENDED;
-            env->features[FEAT_HV_NESTED_EAX] = evmcs_version;
+            cpu->hyperv_nested[0] = evmcs_version;
         }
     }
 
@@ -1237,13 +1254,6 @@ static int hyperv_handle_properties(CPUState *cs,
             cpu->hyperv_version_id[3] = c->edx;
         }
 
-        c = cpuid_find_entry(cpuid, HV_CPUID_FEATURES, 0);
-        if (c) {
-            env->features[FEAT_HYPERV_EAX] = c->eax;
-            env->features[FEAT_HYPERV_EBX] = c->ebx;
-            env->features[FEAT_HYPERV_EDX] = c->edx;
-        }
-
         c = cpuid_find_entry(cpuid, HV_CPUID_IMPLEMENT_LIMITS, 0);
         if (c) {
             cpu->hv_max_vps = c->eax;
@@ -1254,23 +1264,8 @@ static int hyperv_handle_properties(CPUState *cs,
 
         c = cpuid_find_entry(cpuid, HV_CPUID_ENLIGHTMENT_INFO, 0);
         if (c) {
-            env->features[FEAT_HV_RECOMM_EAX] = c->eax;
             cpu->hyperv_spinlock_attempts = c->ebx;
         }
-        c = cpuid_find_entry(cpuid, HV_CPUID_NESTED_FEATURES, 0);
-        if (c) {
-            env->features[FEAT_HV_NESTED_EAX] = c->eax;
-        }
-    }
-
-    if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_ON) {
-        env->features[FEAT_HV_RECOMM_EAX] |= HV_NO_NONARCH_CORESHARING;
-    } else if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_AUTO) {
-        c = cpuid_find_entry(cpuid, HV_CPUID_ENLIGHTMENT_INFO, 0);
-        if (c) {
-            env->features[FEAT_HV_RECOMM_EAX] |=
-                c->eax & HV_NO_NONARCH_CORESHARING;
-        }
     }
 
     /* Features */
@@ -1300,9 +1295,6 @@ static int hyperv_handle_properties(CPUState *cs,
         r |= 1;
     }
 
-    /* Not exposed by KVM but needed to make CPU hotplug in Windows work */
-    env->features[FEAT_HYPERV_EDX] |= HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE;
-
     if (r) {
         r = -ENOSYS;
         goto free;
@@ -1332,15 +1324,27 @@ static int hyperv_handle_properties(CPUState *cs,
 
     c = &cpuid_ent[cpuid_i++];
     c->function = HV_CPUID_FEATURES;
-    c->eax = env->features[FEAT_HYPERV_EAX];
-    c->ebx = env->features[FEAT_HYPERV_EBX];
-    c->edx = env->features[FEAT_HYPERV_EDX];
+    c->eax = hv_build_cpuid_leaf(cs, FEAT_HYPERV_EAX);
+    c->ebx = hv_build_cpuid_leaf(cs, FEAT_HYPERV_EBX);
+    c->edx = hv_build_cpuid_leaf(cs, FEAT_HYPERV_EDX);
+
+    /* Not exposed by KVM but needed to make CPU hotplug in Windows work */
+    c->edx |= HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE;
 
     c = &cpuid_ent[cpuid_i++];
     c->function = HV_CPUID_ENLIGHTMENT_INFO;
-    c->eax = env->features[FEAT_HV_RECOMM_EAX];
+    c->eax = hv_build_cpuid_leaf(cs, FEAT_HV_RECOMM_EAX);
     c->ebx = cpu->hyperv_spinlock_attempts;
 
+    if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_ON) {
+        c->eax |= HV_NO_NONARCH_CORESHARING;
+    } else if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_AUTO) {
+        c2 = cpuid_find_entry(cpuid, HV_CPUID_ENLIGHTMENT_INFO, 0);
+        if (c2) {
+            c->eax |= c2->eax & HV_NO_NONARCH_CORESHARING;
+        }
+    }
+
     c = &cpuid_ent[cpuid_i++];
     c->function = HV_CPUID_IMPLEMENT_LIMITS;
     c->eax = cpu->hv_max_vps;
@@ -1360,7 +1364,7 @@ static int hyperv_handle_properties(CPUState *cs,
 
         c = &cpuid_ent[cpuid_i++];
         c->function = HV_CPUID_NESTED_FEATURES;
-        c->eax = env->features[FEAT_HV_NESTED_EAX];
+        c->eax = cpu->hyperv_nested[0];
     }
     r = cpuid_i;
 
-- 
2.29.2




* [PATCH v3 07/19] i386: introduce hyperv_feature_supported()
  2021-01-07 15:06 [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (5 preceding siblings ...)
  2021-01-07 15:06 ` [PATCH v3 06/19] i386: stop using env->features[] for filling Hyper-V CPUIDs Vitaly Kuznetsov
@ 2021-01-07 15:06 ` Vitaly Kuznetsov
  2021-01-07 15:06 ` [PATCH v3 08/19] i386: introduce hv_cpuid_get_host() Vitaly Kuznetsov
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-07 15:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost, Igor Mammedov

Clean up hv_cpuid_check_and_set() by splitting hyperv_feature_supported()
out of it. No functional change intended.
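
A standalone sketch of the check being factored out, under the
assumption that the host's feature words are available as a plain array
indexed by a (hypothetical) feature-word id: a feature counts as
supported only if every bit it needs is present in every word it
touches.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature-word identifiers; 0 means "slot unused". */
enum { DEMO_FW_NONE = 0, DEMO_FW_EAX, DEMO_FW_EDX, DEMO_FW_COUNT };

struct demo_hv_flag {
    uint32_t fw;    /* feature word the bits live in */
    uint32_t bits;  /* bits the feature requires in that word */
};

/* host[fw] holds what the host reports for feature word fw. */
static bool demo_feature_supported(const struct demo_hv_flag *flags,
                                   size_t nflags,
                                   const uint32_t host[DEMO_FW_COUNT])
{
    size_t i;

    for (i = 0; i < nflags; i++) {
        if (flags[i].fw == DEMO_FW_NONE) {
            continue;                   /* unused slot */
        }
        if (flags[i].fw >= DEMO_FW_COUNT ||
            (host[flags[i].fw] & flags[i].bits) != flags[i].bits) {
            return false;               /* unknown word or missing bit */
        }
    }
    return true;
}
```

Keeping this a pure predicate leaves the policy (error out if the user
asked for the feature, silently skip otherwise) in the caller.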

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm/kvm.c | 49 ++++++++++++++++++++++++++-----------------
 1 file changed, 30 insertions(+), 19 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 768e08fa5e8f..5472d78f5d73 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1110,13 +1110,33 @@ static int hv_cpuid_get_fw(struct kvm_cpuid2 *cpuid, int fw, uint32_t *r)
     return 0;
 }
 
+static bool hyperv_feature_supported(struct kvm_cpuid2 *cpuid, int feature)
+{
+    uint32_t r, fw, bits;
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(kvm_hyperv_properties[feature].flags); i++) {
+        fw = kvm_hyperv_properties[feature].flags[i].fw;
+        bits = kvm_hyperv_properties[feature].flags[i].bits;
+
+        if (!fw) {
+            continue;
+        }
+
+        if (hv_cpuid_get_fw(cpuid, fw, &r) || (r & bits) != bits) {
+            return false;
+        }
+    }
+
+    return true;
+}
+
 static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
                                   int feature)
 {
     X86CPU *cpu = X86_CPU(cs);
-    uint32_t r, fw, bits;
     uint64_t deps;
-    int i, dep_feat;
+    int dep_feat;
 
     if (!hyperv_feat_enabled(cpu, feature) && !cpu->hyperv_passthrough) {
         return 0;
@@ -1135,23 +1155,14 @@ static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
         deps &= ~(1ull << dep_feat);
     }
 
-    for (i = 0; i < ARRAY_SIZE(kvm_hyperv_properties[feature].flags); i++) {
-        fw = kvm_hyperv_properties[feature].flags[i].fw;
-        bits = kvm_hyperv_properties[feature].flags[i].bits;
-
-        if (!fw) {
-            continue;
-        }
-
-        if (hv_cpuid_get_fw(cpuid, fw, &r) || (r & bits) != bits) {
-            if (hyperv_feat_enabled(cpu, feature)) {
-                fprintf(stderr,
-                        "Hyper-V %s is not supported by kernel\n",
-                        kvm_hyperv_properties[feature].desc);
-                return 1;
-            } else {
-                return 0;
-            }
+    if (!hyperv_feature_supported(cpuid, feature)) {
+        if (hyperv_feat_enabled(cpu, feature)) {
+            fprintf(stderr,
+                    "Hyper-V %s is not supported by kernel\n",
+                    kvm_hyperv_properties[feature].desc);
+            return 1;
+        } else {
+            return 0;
         }
     }
 
-- 
2.29.2




* [PATCH v3 08/19] i386: introduce hv_cpuid_get_host()
  2021-01-07 15:06 [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (6 preceding siblings ...)
  2021-01-07 15:06 ` [PATCH v3 07/19] i386: introduce hyperv_feature_supported() Vitaly Kuznetsov
@ 2021-01-07 15:06 ` Vitaly Kuznetsov
  2021-01-07 15:14 ` [PATCH v3 09/19] i386: drop FEAT_HYPERV feature leaves Vitaly Kuznetsov
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-07 15:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost, Igor Mammedov

As preparation for implementing hv_cpuid_cache, introduce
hv_cpuid_get_host(). No functional change intended.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
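Editor's sketch, not part of the patch: the accessor pattern introduced
here is "look up a leaf, return one register, fall back to 0 when the
leaf is absent". A minimal standalone version follows, assuming
hypothetical names (struct leaf, enum reg, leaf_get) rather than the
real kvm_cpuid2 structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register selector; not QEMU's R_* constants. */
enum reg { REG_EAX, REG_EBX, REG_ECX, REG_EDX, REG_COUNT };

/* Hypothetical stand-in for one CPUID table entry. */
struct leaf {
    uint32_t function;
    uint32_t regs[REG_COUNT];
};

/* Return the requested register of the leaf, or 0 if the leaf is absent. */
static uint32_t leaf_get(const struct leaf *leaves, size_t n,
                         uint32_t function, enum reg reg)
{
    assert(reg < REG_COUNT);
    for (size_t i = 0; i < n; i++) {
        if (leaves[i].function == function) {
            return leaves[i].regs[reg];
        }
    }
    return 0;   /* missing leaf reads as "no bits set" */
}
```

With this shape a caller cannot tell a missing leaf from a leaf whose
register is 0, which is convenient for bit tests but worth keeping in
mind where the value is copied verbatim.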
 target/i386/kvm/kvm.c | 100 +++++++++++++++++++++++-------------------
 1 file changed, 56 insertions(+), 44 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 5472d78f5d73..2a37bdc45d17 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1110,6 +1110,19 @@ static int hv_cpuid_get_fw(struct kvm_cpuid2 *cpuid, int fw, uint32_t *r)
     return 0;
 }
 
+static uint32_t hv_cpuid_get_host(struct kvm_cpuid2 *cpuid, uint32_t func,
+                                  int reg)
+{
+    struct kvm_cpuid_entry2 *entry;
+
+    entry = cpuid_find_entry(cpuid, func, 0);
+    if (!entry) {
+        return 0;
+    }
+
+    return cpuid_entry_get_reg(entry, reg);
+}
+
 static bool hyperv_feature_supported(struct kvm_cpuid2 *cpuid, int feature)
 {
     uint32_t r, fw, bits;
@@ -1206,7 +1219,7 @@ static int hyperv_handle_properties(CPUState *cs,
 {
     X86CPU *cpu = X86_CPU(cs);
     struct kvm_cpuid2 *cpuid;
-    struct kvm_cpuid_entry2 *c, *c2;
+    struct kvm_cpuid_entry2 *c;
     uint32_t cpuid_i = 0;
     int r;
 
@@ -1238,45 +1251,46 @@ static int hyperv_handle_properties(CPUState *cs,
     }
 
     if (cpu->hyperv_passthrough) {
-        c = cpuid_find_entry(cpuid, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, 0);
-        if (c) {
-            cpu->hyperv_vendor_id[0] = c->ebx;
-            cpu->hyperv_vendor_id[1] = c->ecx;
-            cpu->hyperv_vendor_id[2] = c->edx;
-            cpu->hyperv_vendor = g_realloc(cpu->hyperv_vendor,
-                                           sizeof(cpu->hyperv_vendor_id) + 1);
-            memcpy(cpu->hyperv_vendor, cpu->hyperv_vendor_id,
-                   sizeof(cpu->hyperv_vendor_id));
-        }
-
-        c = cpuid_find_entry(cpuid, HV_CPUID_INTERFACE, 0);
-        if (c) {
-            cpu->hyperv_interface_id[0] = c->eax;
-            cpu->hyperv_interface_id[1] = c->ebx;
-            cpu->hyperv_interface_id[2] = c->ecx;
-            cpu->hyperv_interface_id[3] = c->edx;
-        }
-
-        c = cpuid_find_entry(cpuid, HV_CPUID_VERSION, 0);
-        if (c) {
-            cpu->hyperv_version_id[0] = c->eax;
-            cpu->hyperv_version_id[1] = c->ebx;
-            cpu->hyperv_version_id[2] = c->ecx;
-            cpu->hyperv_version_id[3] = c->edx;
-        }
-
-        c = cpuid_find_entry(cpuid, HV_CPUID_IMPLEMENT_LIMITS, 0);
-        if (c) {
-            cpu->hv_max_vps = c->eax;
-            cpu->hyperv_limits[0] = c->ebx;
-            cpu->hyperv_limits[1] = c->ecx;
-            cpu->hyperv_limits[2] = c->edx;
-        }
-
-        c = cpuid_find_entry(cpuid, HV_CPUID_ENLIGHTMENT_INFO, 0);
-        if (c) {
-            cpu->hyperv_spinlock_attempts = c->ebx;
-        }
+        cpu->hyperv_vendor_id[0] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_EBX);
+        cpu->hyperv_vendor_id[1] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_ECX);
+        cpu->hyperv_vendor_id[2] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_EDX);
+        cpu->hyperv_vendor = g_realloc(cpu->hyperv_vendor,
+                                       sizeof(cpu->hyperv_vendor_id) + 1);
+        memcpy(cpu->hyperv_vendor, cpu->hyperv_vendor_id,
+               sizeof(cpu->hyperv_vendor_id));
+
+        cpu->hyperv_interface_id[0] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_INTERFACE, R_EAX);
+        cpu->hyperv_interface_id[1] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_INTERFACE, R_EBX);
+        cpu->hyperv_interface_id[2] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_INTERFACE, R_ECX);
+        cpu->hyperv_interface_id[3] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_INTERFACE, R_EDX);
+
+        cpu->hyperv_version_id[0] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_VERSION, R_EAX);
+        cpu->hyperv_version_id[1] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_VERSION, R_EBX);
+        cpu->hyperv_version_id[2] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_VERSION, R_ECX);
+        cpu->hyperv_version_id[3] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_VERSION, R_EDX);
+
+        cpu->hv_max_vps = hv_cpuid_get_host(cpuid, HV_CPUID_IMPLEMENT_LIMITS,
+                                            R_EAX);
+        cpu->hyperv_limits[0] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_IMPLEMENT_LIMITS, R_EBX);
+        cpu->hyperv_limits[1] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_IMPLEMENT_LIMITS, R_ECX);
+        cpu->hyperv_limits[2] =
+            hv_cpuid_get_host(cpuid, HV_CPUID_IMPLEMENT_LIMITS, R_EDX);
+
+        cpu->hyperv_spinlock_attempts =
+            hv_cpuid_get_host(cpuid, HV_CPUID_ENLIGHTMENT_INFO, R_EBX);
     }
 
     /* Features */
@@ -1350,10 +1364,8 @@ static int hyperv_handle_properties(CPUState *cs,
     if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_ON) {
         c->eax |= HV_NO_NONARCH_CORESHARING;
     } else if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_AUTO) {
-        c2 = cpuid_find_entry(cpuid, HV_CPUID_ENLIGHTMENT_INFO, 0);
-        if (c2) {
-            c->eax |= c2->eax & HV_NO_NONARCH_CORESHARING;
-        }
+        c->eax |= hv_cpuid_get_host(cpuid, HV_CPUID_ENLIGHTMENT_INFO, R_EAX) &
+            HV_NO_NONARCH_CORESHARING;
     }
 
     c = &cpuid_ent[cpuid_i++];
-- 
2.29.2




* [PATCH v3 09/19] i386: drop FEAT_HYPERV feature leaves
  2021-01-07 15:06 [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (7 preceding siblings ...)
  2021-01-07 15:06 ` [PATCH v3 08/19] i386: introduce hv_cpuid_get_host() Vitaly Kuznetsov
@ 2021-01-07 15:14 ` Vitaly Kuznetsov
  2021-01-07 15:14 ` [PATCH v3 10/19] i386: introduce hv_cpuid_cache Vitaly Kuznetsov
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-07 15:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost, Igor Mammedov

Hyper-V feature leaves are weird. We have some of them in the
feature_word_info[] array but we don't use the feature_word_info
magic to enable them. Neither do we use the feature_dependencies[]
mechanism to validate the configuration as it doesn't align
well with Hyper-V's many-to-many dependency chains. Some of
the feature leaves hold not only feature bits, but also values.
E.g. FEAT_HV_NESTED_EAX contains both features and the supported
Enlightened VMCS range.

Hyper-V features are already represented in 'struct X86CPU' with
uint64_t hyperv_features so duplicating them in env->features[] adds
little (or zero) benefit. The other half of Hyper-V emulation features
is also stored with values in hyperv_vendor_id[], hyperv_limits[],...
so env->features[] is already incomplete.

Remove Hyper-V feature leaves from env->features[] completely.
kvm_hyperv_properties[] is converted to use raw CPUID func/reg
pairs for features, which allows us to get rid of the
hv_cpuid_get_fw() conversion.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
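Editor's sketch, not part of the patch: with raw func/reg pairs, building
one output register amounts to OR-ing the bits of every enabled feature
whose slot matches both the function and the register. A minimal
standalone version follows. All names are hypothetical; the two leaf
numbers are the ones that appear in the removed feature_word_info[]
entries, while the register numbering and bit values are placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LEAF_FEATURES    0x40000003u  /* from the removed .cpuid entries */
#define LEAF_ENLIGHTMENT 0x40000004u  /* from the removed .cpuid entries */

/* Hypothetical register numbering; not QEMU's R_* constants. */
enum { REG_EAX, REG_EBX, REG_ECX, REG_EDX };

struct flag {
    uint32_t func;  /* CPUID function; 0 marks an unused slot */
    int reg;        /* which register of that function */
    uint32_t bits;  /* bits the feature contributes */
};

struct prop {
    struct flag flags[2];
};

/* OR together the bits of all enabled features that target (func, reg). */
static uint32_t build_leaf(const struct prop *props, size_t n,
                           uint64_t enabled, uint32_t func, int reg)
{
    uint32_t r = 0;

    assert(n <= 64);    /* one bit of 'enabled' per feature */
    for (size_t i = 0; i < n; i++) {
        if (!(enabled & (1ull << i))) {
            continue;   /* feature not enabled */
        }
        for (size_t j = 0; j < 2; j++) {
            if (props[i].flags[j].func != func ||
                props[i].flags[j].reg != reg) {
                continue;
            }
            r |= props[i].flags[j].bits;
        }
    }
    return r;
}
```

Keying on the pair rather than on a FeatureWord index is what lets the
table describe registers that never had a feature word of their own.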
 target/i386/cpu.c     |  90 +----------------------------------
 target/i386/cpu.h     |   5 --
 target/i386/kvm/kvm.c | 108 ++++++++++++++----------------------------
 3 files changed, 37 insertions(+), 166 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 606474e5c9ca..9f6cabfc7787 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -832,94 +832,6 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
          */
         .no_autoenable_flags = ~0U,
     },
-    /*
-     * .feat_names are commented out for Hyper-V enlightenments because we
-     * don't want to have two different ways for enabling them on QEMU command
-     * line. Some features (e.g. "hyperv_time", "hyperv_vapic", ...) require
-     * enabling several feature bits simultaneously, exposing these bits
-     * individually may just confuse guests.
-     */
-    [FEAT_HYPERV_EAX] = {
-        .type = CPUID_FEATURE_WORD,
-        .feat_names = {
-            NULL /* hv_msr_vp_runtime_access */, NULL /* hv_msr_time_refcount_access */,
-            NULL /* hv_msr_synic_access */, NULL /* hv_msr_stimer_access */,
-            NULL /* hv_msr_apic_access */, NULL /* hv_msr_hypercall_access */,
-            NULL /* hv_vpindex_access */, NULL /* hv_msr_reset_access */,
-            NULL /* hv_msr_stats_access */, NULL /* hv_reftsc_access */,
-            NULL /* hv_msr_idle_access */, NULL /* hv_msr_frequency_access */,
-            NULL /* hv_msr_debug_access */, NULL /* hv_msr_reenlightenment_access */,
-            NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-        },
-        .cpuid = { .eax = 0x40000003, .reg = R_EAX, },
-    },
-    [FEAT_HYPERV_EBX] = {
-        .type = CPUID_FEATURE_WORD,
-        .feat_names = {
-            NULL /* hv_create_partitions */, NULL /* hv_access_partition_id */,
-            NULL /* hv_access_memory_pool */, NULL /* hv_adjust_message_buffers */,
-            NULL /* hv_post_messages */, NULL /* hv_signal_events */,
-            NULL /* hv_create_port */, NULL /* hv_connect_port */,
-            NULL /* hv_access_stats */, NULL, NULL, NULL /* hv_debugging */,
-            NULL /* hv_cpu_power_management */, NULL /* hv_configure_profiler */,
-            NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-        },
-        .cpuid = { .eax = 0x40000003, .reg = R_EBX, },
-    },
-    [FEAT_HYPERV_EDX] = {
-        .type = CPUID_FEATURE_WORD,
-        .feat_names = {
-            NULL /* hv_mwait */, NULL /* hv_guest_debugging */,
-            NULL /* hv_perf_monitor */, NULL /* hv_cpu_dynamic_part */,
-            NULL /* hv_hypercall_params_xmm */, NULL /* hv_guest_idle_state */,
-            NULL, NULL,
-            NULL, NULL, NULL /* hv_guest_crash_msr */, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-        },
-        .cpuid = { .eax = 0x40000003, .reg = R_EDX, },
-    },
-    [FEAT_HV_RECOMM_EAX] = {
-        .type = CPUID_FEATURE_WORD,
-        .feat_names = {
-            NULL /* hv_recommend_pv_as_switch */,
-            NULL /* hv_recommend_pv_tlbflush_local */,
-            NULL /* hv_recommend_pv_tlbflush_remote */,
-            NULL /* hv_recommend_msr_apic_access */,
-            NULL /* hv_recommend_msr_reset */,
-            NULL /* hv_recommend_relaxed_timing */,
-            NULL /* hv_recommend_dma_remapping */,
-            NULL /* hv_recommend_int_remapping */,
-            NULL /* hv_recommend_x2apic_msrs */,
-            NULL /* hv_recommend_autoeoi_deprecation */,
-            NULL /* hv_recommend_pv_ipi */,
-            NULL /* hv_recommend_ex_hypercalls */,
-            NULL /* hv_hypervisor_is_nested */,
-            NULL /* hv_recommend_int_mbec */,
-            NULL /* hv_recommend_evmcs */,
-            NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-        },
-        .cpuid = { .eax = 0x40000004, .reg = R_EAX, },
-    },
-    [FEAT_HV_NESTED_EAX] = {
-        .type = CPUID_FEATURE_WORD,
-        .cpuid = { .eax = 0x4000000A, .reg = R_EAX, },
-    },
     [FEAT_SVM] = {
         .type = CPUID_FEATURE_WORD,
         .feat_names = {
@@ -6953,7 +6865,7 @@ static GuestPanicInformation *x86_cpu_get_crash_info(CPUState *cs)
     CPUX86State *env = &cpu->env;
     GuestPanicInformation *panic_info = NULL;
 
-    if (env->features[FEAT_HYPERV_EDX] & HV_GUEST_CRASH_MSR_AVAILABLE) {
+    if (hyperv_feat_enabled(cpu, HYPERV_FEAT_CRASH)) {
         panic_info = g_malloc0(sizeof(GuestPanicInformation));
 
         panic_info->type = GUEST_PANIC_INFORMATION_TYPE_HYPER_V;
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index a9fab5adbdfb..6220cb2cabb9 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -516,11 +516,6 @@ typedef enum FeatureWord {
     FEAT_C000_0001_EDX, /* CPUID[C000_0001].EDX */
     FEAT_KVM,           /* CPUID[4000_0001].EAX (KVM_CPUID_FEATURES) */
     FEAT_KVM_HINTS,     /* CPUID[4000_0001].EDX */
-    FEAT_HYPERV_EAX,    /* CPUID[4000_0003].EAX */
-    FEAT_HYPERV_EBX,    /* CPUID[4000_0003].EBX */
-    FEAT_HYPERV_EDX,    /* CPUID[4000_0003].EDX */
-    FEAT_HV_RECOMM_EAX, /* CPUID[4000_0004].EAX */
-    FEAT_HV_NESTED_EAX, /* CPUID[4000_000A].EAX */
     FEAT_SVM,           /* CPUID[8000_000A].EDX */
     FEAT_XSAVE,         /* CPUID[EAX=0xd,ECX=1].EAX */
     FEAT_6_EAX,         /* CPUID[6].EAX */
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 2a37bdc45d17..9172a10037fc 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -803,7 +803,8 @@ static bool tsc_is_stable_and_known(CPUX86State *env)
 static struct {
     const char *desc;
     struct {
-        uint32_t fw;
+        uint32_t func;
+        int reg;
         uint32_t bits;
     } flags[2];
     uint64_t dependencies;
@@ -811,25 +812,25 @@ static struct {
     [HYPERV_FEAT_RELAXED] = {
         .desc = "relaxed timing (hv-relaxed)",
         .flags = {
-            {.fw = FEAT_HYPERV_EAX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EAX,
              .bits = HV_HYPERCALL_AVAILABLE},
-            {.fw = FEAT_HV_RECOMM_EAX,
+            {.func = HV_CPUID_ENLIGHTMENT_INFO, .reg = R_EAX,
              .bits = HV_RELAXED_TIMING_RECOMMENDED}
         }
     },
     [HYPERV_FEAT_VAPIC] = {
         .desc = "virtual APIC (hv-vapic)",
         .flags = {
-            {.fw = FEAT_HYPERV_EAX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EAX,
              .bits = HV_HYPERCALL_AVAILABLE | HV_APIC_ACCESS_AVAILABLE},
-            {.fw = FEAT_HV_RECOMM_EAX,
+            {.func = HV_CPUID_ENLIGHTMENT_INFO, .reg = R_EAX,
              .bits = HV_APIC_ACCESS_RECOMMENDED}
         }
     },
     [HYPERV_FEAT_TIME] = {
         .desc = "clocksources (hv-time)",
         .flags = {
-            {.fw = FEAT_HYPERV_EAX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EAX,
              .bits = HV_HYPERCALL_AVAILABLE | HV_TIME_REF_COUNT_AVAILABLE |
              HV_REFERENCE_TSC_AVAILABLE}
         }
@@ -837,42 +838,42 @@ static struct {
     [HYPERV_FEAT_CRASH] = {
         .desc = "crash MSRs (hv-crash)",
         .flags = {
-            {.fw = FEAT_HYPERV_EDX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EDX,
              .bits = HV_GUEST_CRASH_MSR_AVAILABLE}
         }
     },
     [HYPERV_FEAT_RESET] = {
         .desc = "reset MSR (hv-reset)",
         .flags = {
-            {.fw = FEAT_HYPERV_EAX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EAX,
              .bits = HV_RESET_AVAILABLE}
         }
     },
     [HYPERV_FEAT_VPINDEX] = {
         .desc = "VP_INDEX MSR (hv-vpindex)",
         .flags = {
-            {.fw = FEAT_HYPERV_EAX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EAX,
              .bits = HV_VP_INDEX_AVAILABLE}
         }
     },
     [HYPERV_FEAT_RUNTIME] = {
         .desc = "VP_RUNTIME MSR (hv-runtime)",
         .flags = {
-            {.fw = FEAT_HYPERV_EAX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EAX,
              .bits = HV_VP_RUNTIME_AVAILABLE}
         }
     },
     [HYPERV_FEAT_SYNIC] = {
         .desc = "synthetic interrupt controller (hv-synic)",
         .flags = {
-            {.fw = FEAT_HYPERV_EAX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EAX,
              .bits = HV_SYNIC_AVAILABLE}
         }
     },
     [HYPERV_FEAT_STIMER] = {
         .desc = "synthetic timers (hv-stimer)",
         .flags = {
-            {.fw = FEAT_HYPERV_EAX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EAX,
              .bits = HV_SYNTIMERS_AVAILABLE}
         },
         .dependencies = BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_TIME)
@@ -880,23 +881,23 @@ static struct {
     [HYPERV_FEAT_FREQUENCIES] = {
         .desc = "frequency MSRs (hv-frequencies)",
         .flags = {
-            {.fw = FEAT_HYPERV_EAX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EAX,
              .bits = HV_ACCESS_FREQUENCY_MSRS},
-            {.fw = FEAT_HYPERV_EDX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EDX,
              .bits = HV_FREQUENCY_MSRS_AVAILABLE}
         }
     },
     [HYPERV_FEAT_REENLIGHTENMENT] = {
         .desc = "reenlightenment MSRs (hv-reenlightenment)",
         .flags = {
-            {.fw = FEAT_HYPERV_EAX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EAX,
              .bits = HV_ACCESS_REENLIGHTENMENTS_CONTROL}
         }
     },
     [HYPERV_FEAT_TLBFLUSH] = {
         .desc = "paravirtualized TLB flush (hv-tlbflush)",
         .flags = {
-            {.fw = FEAT_HV_RECOMM_EAX,
+            {.func = HV_CPUID_ENLIGHTMENT_INFO, .reg = R_EAX,
              .bits = HV_REMOTE_TLB_FLUSH_RECOMMENDED |
              HV_EX_PROCESSOR_MASKS_RECOMMENDED}
         },
@@ -905,7 +906,7 @@ static struct {
     [HYPERV_FEAT_EVMCS] = {
         .desc = "enlightened VMCS (hv-evmcs)",
         .flags = {
-            {.fw = FEAT_HV_RECOMM_EAX,
+            {.func = HV_CPUID_ENLIGHTMENT_INFO, .reg = R_EAX,
              .bits = HV_ENLIGHTENED_VMCS_RECOMMENDED}
         },
         .dependencies = BIT(HYPERV_FEAT_VAPIC)
@@ -913,7 +914,7 @@ static struct {
     [HYPERV_FEAT_IPI] = {
         .desc = "paravirtualized IPI (hv-ipi)",
         .flags = {
-            {.fw = FEAT_HV_RECOMM_EAX,
+            {.func = HV_CPUID_ENLIGHTMENT_INFO, .reg = R_EAX,
              .bits = HV_CLUSTER_IPI_RECOMMENDED |
              HV_EX_PROCESSOR_MASKS_RECOMMENDED}
         },
@@ -922,7 +923,7 @@ static struct {
     [HYPERV_FEAT_STIMER_DIRECT] = {
         .desc = "direct mode synthetic timers (hv-stimer-direct)",
         .flags = {
-            {.fw = FEAT_HYPERV_EDX,
+            {.func = HV_CPUID_FEATURES, .reg = R_EDX,
              .bits = HV_STIMER_DIRECT_MODE_AVAILABLE}
         },
         .dependencies = BIT(HYPERV_FEAT_STIMER)
@@ -1068,48 +1069,6 @@ static struct kvm_cpuid2 *get_supported_hv_cpuid_legacy(CPUState *cs)
     return cpuid;
 }
 
-static int hv_cpuid_get_fw(struct kvm_cpuid2 *cpuid, int fw, uint32_t *r)
-{
-    struct kvm_cpuid_entry2 *entry;
-    uint32_t func;
-    int reg;
-
-    switch (fw) {
-    case FEAT_HYPERV_EAX:
-        reg = R_EAX;
-        func = HV_CPUID_FEATURES;
-        break;
-    case FEAT_HYPERV_EDX:
-        reg = R_EDX;
-        func = HV_CPUID_FEATURES;
-        break;
-    case FEAT_HV_RECOMM_EAX:
-        reg = R_EAX;
-        func = HV_CPUID_ENLIGHTMENT_INFO;
-        break;
-    default:
-        return -EINVAL;
-    }
-
-    entry = cpuid_find_entry(cpuid, func, 0);
-    if (!entry) {
-        return -ENOENT;
-    }
-
-    switch (reg) {
-    case R_EAX:
-        *r = entry->eax;
-        break;
-    case R_EDX:
-        *r = entry->edx;
-        break;
-    default:
-        return -EINVAL;
-    }
-
-    return 0;
-}
-
 static uint32_t hv_cpuid_get_host(struct kvm_cpuid2 *cpuid, uint32_t func,
                                   int reg)
 {
@@ -1125,18 +1084,20 @@ static uint32_t hv_cpuid_get_host(struct kvm_cpuid2 *cpuid, uint32_t func,
 
 static bool hyperv_feature_supported(struct kvm_cpuid2 *cpuid, int feature)
 {
-    uint32_t r, fw, bits;
-    int i;
+    uint32_t func, bits;
+    int i, reg;
 
     for (i = 0; i < ARRAY_SIZE(kvm_hyperv_properties[feature].flags); i++) {
-        fw = kvm_hyperv_properties[feature].flags[i].fw;
+
+        func = kvm_hyperv_properties[feature].flags[i].func;
+        reg = kvm_hyperv_properties[feature].flags[i].reg;
         bits = kvm_hyperv_properties[feature].flags[i].bits;
 
-        if (!fw) {
+        if (!func) {
             continue;
         }
 
-        if (hv_cpuid_get_fw(cpuid, fw, &r) || (r & bits) != bits) {
+        if ((hv_cpuid_get_host(cpuid, func, reg) & bits) != bits) {
             return false;
         }
     }
@@ -1186,7 +1147,7 @@ static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
     return 0;
 }
 
-static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t fw)
+static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t func, int reg)
 {
     X86CPU *cpu = X86_CPU(cs);
     uint32_t r = 0;
@@ -1198,7 +1159,10 @@ static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t fw)
         }
 
         for (j = 0; j < ARRAY_SIZE(kvm_hyperv_properties[i].flags); j++) {
-            if (kvm_hyperv_properties[i].flags[j].fw != fw) {
+            if (kvm_hyperv_properties[i].flags[j].func != func) {
+                continue;
+            }
+            if (kvm_hyperv_properties[i].flags[j].reg != reg) {
                 continue;
             }
 
@@ -1349,16 +1313,16 @@ static int hyperv_handle_properties(CPUState *cs,
 
     c = &cpuid_ent[cpuid_i++];
     c->function = HV_CPUID_FEATURES;
-    c->eax = hv_build_cpuid_leaf(cs, FEAT_HYPERV_EAX);
-    c->ebx = hv_build_cpuid_leaf(cs, FEAT_HYPERV_EBX);
-    c->edx = hv_build_cpuid_leaf(cs, FEAT_HYPERV_EDX);
+    c->eax = hv_build_cpuid_leaf(cs, HV_CPUID_FEATURES, R_EAX);
+    c->ebx = hv_build_cpuid_leaf(cs, HV_CPUID_FEATURES, R_EBX);
+    c->edx = hv_build_cpuid_leaf(cs, HV_CPUID_FEATURES, R_EDX);
 
     /* Not exposed by KVM but needed to make CPU hotplug in Windows work */
     c->edx |= HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE;
 
     c = &cpuid_ent[cpuid_i++];
     c->function = HV_CPUID_ENLIGHTMENT_INFO;
-    c->eax = hv_build_cpuid_leaf(cs, FEAT_HV_RECOMM_EAX);
+    c->eax = hv_build_cpuid_leaf(cs, HV_CPUID_ENLIGHTMENT_INFO, R_EAX);
     c->ebx = cpu->hyperv_spinlock_attempts;
 
     if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_ON) {
-- 
2.29.2




* [PATCH v3 10/19] i386: introduce hv_cpuid_cache
  2021-01-07 15:06 [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (8 preceding siblings ...)
  2021-01-07 15:14 ` [PATCH v3 09/19] i386: drop FEAT_HYPERV feature leaves Vitaly Kuznetsov
@ 2021-01-07 15:14 ` Vitaly Kuznetsov
  2021-01-07 15:14 ` [PATCH v3 11/19] i386: split hyperv_handle_properties() into hyperv_expand_features()/hyperv_fill_cpuids() Vitaly Kuznetsov
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-07 15:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost, Igor Mammedov

Just like with cpuid_cache, it makes no sense to call
KVM_GET_SUPPORTED_HV_CPUID more than once. Instead of (ab)using
env->features[] and/or trying to keep all the code in one place, it is
better to introduce a persistent hv_cpuid_cache and a
hv_cpuid_get_host() accessor to it.

Note, hv_cpuid_get_fw() is converted to using hv_cpuid_get_host()
just to be removed later with Hyper-V specific feature words.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
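Editor's sketch, not part of the patch: the caching idea is a file-scope
pointer that is filled on first use and reused afterwards. A minimal
standalone version follows, assuming hypothetical names (struct table,
fetch_table, table_get) and a counter that exists only to make the
single fetch observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_LEN 4

struct table {
    uint32_t entries[TABLE_LEN];
};

static struct table *cache;     /* NULL until the first successful fetch */
static unsigned fetch_count;    /* illustration only: number of fetches */

/* Hypothetical stand-in for the expensive host query. */
static struct table *fetch_table(void)
{
    static struct table t = { { 10, 20, 30, 40 } };

    fetch_count++;
    return &t;
}

/* Return entry idx, fetching the table only on the first call. */
static uint32_t table_get(size_t idx)
{
    assert(idx < TABLE_LEN);
    if (!cache) {
        cache = fetch_table();
    }
    if (!cache) {
        return 0;               /* fetch failed: read as "nothing set" */
    }
    return cache->entries[idx];
}
```

In this sketch a failed fetch leaves the pointer NULL, so the next call
simply tries again; only a successful result is kept.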
 target/i386/kvm/kvm.c | 109 ++++++++++++++++++++++--------------------
 1 file changed, 56 insertions(+), 53 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 9172a10037fc..21840d34b672 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -126,6 +126,7 @@ static int has_exception_payload;
 static bool has_msr_mcg_ext_ctl;
 
 static struct kvm_cpuid2 *cpuid_cache;
+static struct kvm_cpuid2 *hv_cpuid_cache;
 static struct kvm_msr_list *kvm_feature_msrs;
 
 int kvm_has_pit_state2(void)
@@ -1069,10 +1070,25 @@ static struct kvm_cpuid2 *get_supported_hv_cpuid_legacy(CPUState *cs)
     return cpuid;
 }
 
-static uint32_t hv_cpuid_get_host(struct kvm_cpuid2 *cpuid, uint32_t func,
-                                  int reg)
+static uint32_t hv_cpuid_get_host(CPUState *cs, uint32_t func, int reg)
 {
     struct kvm_cpuid_entry2 *entry;
+    struct kvm_cpuid2 *cpuid;
+
+    if (hv_cpuid_cache) {
+        cpuid = hv_cpuid_cache;
+    } else {
+        if (kvm_check_extension(kvm_state, KVM_CAP_HYPERV_CPUID) > 0) {
+            cpuid = get_supported_hv_cpuid(cs);
+        } else {
+            cpuid = get_supported_hv_cpuid_legacy(cs);
+        }
+        hv_cpuid_cache = cpuid;
+    }
+
+    if (!cpuid) {
+        return 0;
+    }
 
     entry = cpuid_find_entry(cpuid, func, 0);
     if (!entry) {
@@ -1082,7 +1098,7 @@ static uint32_t hv_cpuid_get_host(struct kvm_cpuid2 *cpuid, uint32_t func,
     return cpuid_entry_get_reg(entry, reg);
 }
 
-static bool hyperv_feature_supported(struct kvm_cpuid2 *cpuid, int feature)
+static bool hyperv_feature_supported(CPUState *cs, int feature)
 {
     uint32_t func, bits;
     int i, reg;
@@ -1097,7 +1113,7 @@ static bool hyperv_feature_supported(struct kvm_cpuid2 *cpuid, int feature)
             continue;
         }
 
-        if ((hv_cpuid_get_host(cpuid, func, reg) & bits) != bits) {
+        if ((hv_cpuid_get_host(cs, func, reg) & bits) != bits) {
             return false;
         }
     }
@@ -1105,8 +1121,7 @@ static bool hyperv_feature_supported(struct kvm_cpuid2 *cpuid, int feature)
     return true;
 }
 
-static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
-                                  int feature)
+static int hv_cpuid_check_and_set(CPUState *cs, int feature)
 {
     X86CPU *cpu = X86_CPU(cs);
     uint64_t deps;
@@ -1129,7 +1144,7 @@ static int hv_cpuid_check_and_set(CPUState *cs, struct kvm_cpuid2 *cpuid,
         deps &= ~(1ull << dep_feat);
     }
 
-    if (!hyperv_feature_supported(cpuid, feature)) {
+    if (!hyperv_feature_supported(cs, feature)) {
         if (hyperv_feat_enabled(cpu, feature)) {
             fprintf(stderr,
                     "Hyper-V %s is not supported by kernel\n",
@@ -1182,7 +1197,6 @@ static int hyperv_handle_properties(CPUState *cs,
                                     struct kvm_cpuid_entry2 *cpuid_ent)
 {
     X86CPU *cpu = X86_CPU(cs);
-    struct kvm_cpuid2 *cpuid;
     struct kvm_cpuid_entry2 *c;
     uint32_t cpuid_i = 0;
     int r;
@@ -1208,71 +1222,65 @@ static int hyperv_handle_properties(CPUState *cs,
         }
     }
 
-    if (kvm_check_extension(cs->kvm_state, KVM_CAP_HYPERV_CPUID) > 0) {
-        cpuid = get_supported_hv_cpuid(cs);
-    } else {
-        cpuid = get_supported_hv_cpuid_legacy(cs);
-    }
-
     if (cpu->hyperv_passthrough) {
         cpu->hyperv_vendor_id[0] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_EBX);
+            hv_cpuid_get_host(cs, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_EBX);
         cpu->hyperv_vendor_id[1] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_ECX);
+            hv_cpuid_get_host(cs, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_ECX);
         cpu->hyperv_vendor_id[2] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_EDX);
+            hv_cpuid_get_host(cs, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_EDX);
         cpu->hyperv_vendor = g_realloc(cpu->hyperv_vendor,
                                        sizeof(cpu->hyperv_vendor_id) + 1);
         memcpy(cpu->hyperv_vendor, cpu->hyperv_vendor_id,
                sizeof(cpu->hyperv_vendor_id));
 
         cpu->hyperv_interface_id[0] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_INTERFACE, R_EAX);
+            hv_cpuid_get_host(cs, HV_CPUID_INTERFACE, R_EAX);
         cpu->hyperv_interface_id[1] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_INTERFACE, R_EBX);
+            hv_cpuid_get_host(cs, HV_CPUID_INTERFACE, R_EBX);
         cpu->hyperv_interface_id[2] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_INTERFACE, R_ECX);
+            hv_cpuid_get_host(cs, HV_CPUID_INTERFACE, R_ECX);
         cpu->hyperv_interface_id[3] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_INTERFACE, R_EDX);
+            hv_cpuid_get_host(cs, HV_CPUID_INTERFACE, R_EDX);
 
         cpu->hyperv_version_id[0] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_VERSION, R_EAX);
+            hv_cpuid_get_host(cs, HV_CPUID_VERSION, R_EAX);
         cpu->hyperv_version_id[1] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_VERSION, R_EBX);
+            hv_cpuid_get_host(cs, HV_CPUID_VERSION, R_EBX);
         cpu->hyperv_version_id[2] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_VERSION, R_ECX);
+            hv_cpuid_get_host(cs, HV_CPUID_VERSION, R_ECX);
         cpu->hyperv_version_id[3] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_VERSION, R_EDX);
+            hv_cpuid_get_host(cs, HV_CPUID_VERSION, R_EDX);
 
-        cpu->hv_max_vps = hv_cpuid_get_host(cpuid, HV_CPUID_IMPLEMENT_LIMITS,
+        cpu->hv_max_vps = hv_cpuid_get_host(cs, HV_CPUID_IMPLEMENT_LIMITS,
                                             R_EAX);
         cpu->hyperv_limits[0] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_IMPLEMENT_LIMITS, R_EBX);
+            hv_cpuid_get_host(cs, HV_CPUID_IMPLEMENT_LIMITS, R_EBX);
         cpu->hyperv_limits[1] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_IMPLEMENT_LIMITS, R_ECX);
+            hv_cpuid_get_host(cs, HV_CPUID_IMPLEMENT_LIMITS, R_ECX);
         cpu->hyperv_limits[2] =
-            hv_cpuid_get_host(cpuid, HV_CPUID_IMPLEMENT_LIMITS, R_EDX);
+            hv_cpuid_get_host(cs, HV_CPUID_IMPLEMENT_LIMITS, R_EDX);
 
         cpu->hyperv_spinlock_attempts =
-            hv_cpuid_get_host(cpuid, HV_CPUID_ENLIGHTMENT_INFO, R_EBX);
+            hv_cpuid_get_host(cs, HV_CPUID_ENLIGHTMENT_INFO, R_EBX);
     }
 
     /* Features */
-    r = hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_RELAXED);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_VAPIC);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_TIME);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_CRASH);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_RESET);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_VPINDEX);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_RUNTIME);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_SYNIC);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_STIMER);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_FREQUENCIES);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_REENLIGHTENMENT);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_TLBFLUSH);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_EVMCS);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_IPI);
-    r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_STIMER_DIRECT);
+    r = hv_cpuid_check_and_set(cs, HYPERV_FEAT_RELAXED);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_VAPIC);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_TIME);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_CRASH);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_RESET);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_VPINDEX);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_RUNTIME);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_SYNIC);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_STIMER);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_FREQUENCIES);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_REENLIGHTENMENT);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_TLBFLUSH);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_EVMCS);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_IPI);
+    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_STIMER_DIRECT);
 
     /* Additional dependencies not covered by kvm_hyperv_properties[] */
     if (hyperv_feat_enabled(cpu, HYPERV_FEAT_SYNIC) &&
@@ -1285,8 +1293,7 @@ static int hyperv_handle_properties(CPUState *cs,
     }
 
     if (r) {
-        r = -ENOSYS;
-        goto free;
+        return -ENOSYS;
     }
 
     c = &cpuid_ent[cpuid_i++];
@@ -1328,7 +1335,7 @@ static int hyperv_handle_properties(CPUState *cs,
     if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_ON) {
         c->eax |= HV_NO_NONARCH_CORESHARING;
     } else if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_AUTO) {
-        c->eax |= hv_cpuid_get_host(cpuid, HV_CPUID_ENLIGHTMENT_INFO, R_EAX) &
+        c->eax |= hv_cpuid_get_host(cs, HV_CPUID_ENLIGHTMENT_INFO, R_EAX) &
             HV_NO_NONARCH_CORESHARING;
     }
 
@@ -1353,12 +1360,8 @@ static int hyperv_handle_properties(CPUState *cs,
         c->function = HV_CPUID_NESTED_FEATURES;
         c->eax = cpu->hyperv_nested[0];
     }
-    r = cpuid_i;
 
-free:
-    g_free(cpuid);
-
-    return r;
+    return cpuid_i;
 }
 
 static Error *hv_passthrough_mig_blocker;
-- 
2.29.2




* [PATCH v3 11/19] i386: split hyperv_handle_properties() into hyperv_expand_features()/hyperv_fill_cpuids()
  2021-01-07 15:06 [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (9 preceding siblings ...)
  2021-01-07 15:14 ` [PATCH v3 10/19] i386: introduce hv_cpuid_cache Vitaly Kuznetsov
@ 2021-01-07 15:14 ` Vitaly Kuznetsov
  2021-01-07 15:14 ` [PATCH v3 12/19] i386: move eVMCS enablement to hyperv_init_vcpu() Vitaly Kuznetsov
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-07 15:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost, Igor Mammedov

The intention is to call hyperv_expand_features() early, before vCPUs
are created, and to use the acquired data later when we set guest-visible
CPUID data.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm/kvm.c | 34 ++++++++++++++++++++++++----------
 1 file changed, 24 insertions(+), 10 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 21840d34b672..0c7bbba6c42e 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1189,16 +1189,15 @@ static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t func, int reg)
 }
 
 /*
- * Fill in Hyper-V CPUIDs. Returns the number of entries filled in cpuid_ent in
- * case of success, errno < 0 in case of failure and 0 when no Hyper-V
- * extentions are enabled.
+ * Expand Hyper-V CPU features. In particular, check that all the requested
+ * features are supported by the host and the sanity of the configuration
+ * (that all the required dependencies are included). Also, this takes care
+ * of 'hv_passthrough' mode and fills the environment with all supported
+ * Hyper-V features.
  */
-static int hyperv_handle_properties(CPUState *cs,
-                                    struct kvm_cpuid_entry2 *cpuid_ent)
+static int hyperv_expand_features(CPUState *cs)
 {
     X86CPU *cpu = X86_CPU(cs);
-    struct kvm_cpuid_entry2 *c;
-    uint32_t cpuid_i = 0;
     int r;
 
     if (!hyperv_enabled(cpu))
@@ -1296,6 +1295,19 @@ static int hyperv_handle_properties(CPUState *cs,
         return -ENOSYS;
     }
 
+    return 0;
+}
+
+/*
+ * Fill in Hyper-V CPUIDs. Returns the number of entries filled in cpuid_ent.
+ */
+static int hyperv_fill_cpuids(CPUState *cs,
+                              struct kvm_cpuid_entry2 *cpuid_ent)
+{
+    X86CPU *cpu = X86_CPU(cs);
+    struct kvm_cpuid_entry2 *c;
+    uint32_t cpuid_i = 0;
+
     c = &cpuid_ent[cpuid_i++];
     c->function = HV_CPUID_VENDOR_AND_MAX_FUNCTIONS;
     c->eax = hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS) ?
@@ -1503,11 +1515,13 @@ int kvm_arch_init_vcpu(CPUState *cs)
     env->apic_bus_freq = KVM_APIC_BUS_FREQUENCY;
 
     /* Paravirtualization CPUIDs */
-    r = hyperv_handle_properties(cs, cpuid_data.entries);
+    r = hyperv_expand_features(cs);
     if (r < 0) {
         return r;
-    } else if (r > 0) {
-        cpuid_i = r;
+    }
+
+    if (hyperv_enabled(cpu)) {
+        cpuid_i = hyperv_fill_cpuids(cs, cpuid_data.entries);
         kvm_base = KVM_CPUID_SIGNATURE_NEXT;
         has_msr_hv_hypercall = true;
     }
-- 
2.29.2




* [PATCH v3 12/19] i386: move eVMCS enablement to hyperv_init_vcpu()
  2021-01-07 15:06 [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (10 preceding siblings ...)
  2021-01-07 15:14 ` [PATCH v3 11/19] i386: split hyperv_handle_properties() into hyperv_expand_features()/hyperv_fill_cpuids() Vitaly Kuznetsov
@ 2021-01-07 15:14 ` Vitaly Kuznetsov
  2021-01-07 15:14 ` [PATCH v3 13/19] i386: switch hyperv_expand_features() to using error_setg() Vitaly Kuznetsov
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-07 15:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost, Igor Mammedov

hyperv_expand_features() will be called before we create vCPUs, so
eVMCS enablement (which is done per vCPU) must move out of it.
hyperv_init_vcpu() looks like the right place.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm/kvm.c | 60 ++++++++++++++++++++++++++-----------------
 1 file changed, 37 insertions(+), 23 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 0c7bbba6c42e..a6320aeb2699 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -965,6 +965,7 @@ static struct kvm_cpuid2 *get_supported_hv_cpuid(CPUState *cs)
 {
     struct kvm_cpuid2 *cpuid;
     int max = 7; /* 0x40000000..0x40000005, 0x4000000A */
+    int i;
 
     /*
      * When the buffer is too small, KVM_GET_SUPPORTED_HV_CPUID fails with
@@ -974,6 +975,22 @@ static struct kvm_cpuid2 *get_supported_hv_cpuid(CPUState *cs)
     while ((cpuid = try_get_hv_cpuid(cs, max)) == NULL) {
         max++;
     }
+
+    /*
+     * KVM_GET_SUPPORTED_HV_CPUID does not set EVMCS CPUID bit before
+     * KVM_CAP_HYPERV_ENLIGHTENED_VMCS is enabled but we want to get the
+     * information early, just check for the capability and set the bit
+     * manually.
+     */
+    if (kvm_check_extension(cs->kvm_state,
+                            KVM_CAP_HYPERV_ENLIGHTENED_VMCS) > 0) {
+        for (i = 0; i < cpuid->nent; i++) {
+            if (cpuid->entries[i].function == HV_CPUID_ENLIGHTMENT_INFO) {
+                cpuid->entries[i].eax |= HV_ENLIGHTENED_VMCS_RECOMMENDED;
+            }
+        }
+    }
+
     return cpuid;
 }
 
@@ -1203,24 +1220,6 @@ static int hyperv_expand_features(CPUState *cs)
     if (!hyperv_enabled(cpu))
         return 0;
 
-    if (hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS) ||
-        cpu->hyperv_passthrough) {
-        uint16_t evmcs_version;
-
-        r = kvm_vcpu_enable_cap(cs, KVM_CAP_HYPERV_ENLIGHTENED_VMCS, 0,
-                                (uintptr_t)&evmcs_version);
-
-        if (hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS) && r) {
-            fprintf(stderr, "Hyper-V %s is not supported by kernel\n",
-                    kvm_hyperv_properties[HYPERV_FEAT_EVMCS].desc);
-            return -ENOSYS;
-        }
-
-        if (!r) {
-            cpu->hyperv_nested[0] = evmcs_version;
-        }
-    }
-
     if (cpu->hyperv_passthrough) {
         cpu->hyperv_vendor_id[0] =
             hv_cpuid_get_host(cs, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_EBX);
@@ -1457,6 +1456,21 @@ static int hyperv_init_vcpu(X86CPU *cpu)
         }
     }
 
+    if (hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS)) {
+        uint16_t evmcs_version;
+
+        ret = kvm_vcpu_enable_cap(cs, KVM_CAP_HYPERV_ENLIGHTENED_VMCS, 0,
+                                  (uintptr_t)&evmcs_version);
+
+        if (ret < 0) {
+            fprintf(stderr, "Hyper-V %s is not supported by kernel\n",
+                    kvm_hyperv_properties[HYPERV_FEAT_EVMCS].desc);
+            return ret;
+        }
+
+        cpu->hyperv_nested[0] = evmcs_version;
+    }
+
     return 0;
 }
 
@@ -1521,6 +1535,11 @@ int kvm_arch_init_vcpu(CPUState *cs)
     }
 
     if (hyperv_enabled(cpu)) {
+        r = hyperv_init_vcpu(cpu);
+        if (r) {
+            return r;
+        }
+
         cpuid_i = hyperv_fill_cpuids(cs, cpuid_data.entries);
         kvm_base = KVM_CPUID_SIGNATURE_NEXT;
         has_msr_hv_hypercall = true;
@@ -1870,11 +1889,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
 
     kvm_init_msrs(cpu);
 
-    r = hyperv_init_vcpu(cpu);
-    if (r) {
-        goto fail;
-    }
-
     return 0;
 
  fail:
-- 
2.29.2




* [PATCH v3 13/19] i386: switch hyperv_expand_features() to using error_setg()
  2021-01-07 15:06 [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (11 preceding siblings ...)
  2021-01-07 15:14 ` [PATCH v3 12/19] i386: move eVMCS enablement to hyperv_init_vcpu() Vitaly Kuznetsov
@ 2021-01-07 15:14 ` Vitaly Kuznetsov
  2021-01-07 15:14 ` [PATCH v3 14/19] i386: adjust the expected KVM_GET_SUPPORTED_HV_CPUID array size Vitaly Kuznetsov
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-07 15:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost, Igor Mammedov

Use the standard error_setg() mechanism in hyperv_expand_features().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm/kvm.c | 101 +++++++++++++++++++++++++-----------------
 1 file changed, 61 insertions(+), 40 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index a6320aeb2699..d259916ccf85 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1138,7 +1138,7 @@ static bool hyperv_feature_supported(CPUState *cs, int feature)
     return true;
 }
 
-static int hv_cpuid_check_and_set(CPUState *cs, int feature)
+static int hv_cpuid_check_and_set(CPUState *cs, int feature, Error **errp)
 {
     X86CPU *cpu = X86_CPU(cs);
     uint64_t deps;
@@ -1152,20 +1152,18 @@ static int hv_cpuid_check_and_set(CPUState *cs, int feature)
     while (deps) {
         dep_feat = ctz64(deps);
         if (!(hyperv_feat_enabled(cpu, dep_feat))) {
-                fprintf(stderr,
-                        "Hyper-V %s requires Hyper-V %s\n",
-                        kvm_hyperv_properties[feature].desc,
-                        kvm_hyperv_properties[dep_feat].desc);
-                return 1;
+            error_setg(errp, "Hyper-V %s requires Hyper-V %s",
+                       kvm_hyperv_properties[feature].desc,
+                       kvm_hyperv_properties[dep_feat].desc);
+            return 1;
         }
         deps &= ~(1ull << dep_feat);
     }
 
     if (!hyperv_feature_supported(cs, feature)) {
         if (hyperv_feat_enabled(cpu, feature)) {
-            fprintf(stderr,
-                    "Hyper-V %s is not supported by kernel\n",
-                    kvm_hyperv_properties[feature].desc);
+            error_setg(errp, "Hyper-V %s is not supported by kernel",
+                       kvm_hyperv_properties[feature].desc);
             return 1;
         } else {
             return 0;
@@ -1212,13 +1210,12 @@ static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t func, int reg)
  * of 'hv_passthrough' mode and fills the environment with all supported
  * Hyper-V features.
  */
-static int hyperv_expand_features(CPUState *cs)
+static void hyperv_expand_features(CPUState *cs, Error **errp)
 {
     X86CPU *cpu = X86_CPU(cs);
-    int r;
 
     if (!hyperv_enabled(cpu))
-        return 0;
+        return;
 
     if (cpu->hyperv_passthrough) {
         cpu->hyperv_vendor_id[0] =
@@ -1264,37 +1261,60 @@ static int hyperv_expand_features(CPUState *cs)
     }
 
     /* Features */
-    r = hv_cpuid_check_and_set(cs, HYPERV_FEAT_RELAXED);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_VAPIC);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_TIME);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_CRASH);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_RESET);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_VPINDEX);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_RUNTIME);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_SYNIC);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_STIMER);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_FREQUENCIES);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_REENLIGHTENMENT);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_TLBFLUSH);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_EVMCS);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_IPI);
-    r |= hv_cpuid_check_and_set(cs, HYPERV_FEAT_STIMER_DIRECT);
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_RELAXED, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_VAPIC, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_TIME, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_CRASH, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_RESET, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_VPINDEX, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_RUNTIME, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_SYNIC, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_STIMER, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_FREQUENCIES, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_REENLIGHTENMENT, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_TLBFLUSH, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_EVMCS, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_IPI, errp)) {
+        return;
+    }
+    if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_STIMER_DIRECT, errp)) {
+        return;
+    }
 
     /* Additional dependencies not covered by kvm_hyperv_properties[] */
     if (hyperv_feat_enabled(cpu, HYPERV_FEAT_SYNIC) &&
         !cpu->hyperv_synic_kvm_only &&
         !hyperv_feat_enabled(cpu, HYPERV_FEAT_VPINDEX)) {
-        fprintf(stderr, "Hyper-V %s requires Hyper-V %s\n",
-                kvm_hyperv_properties[HYPERV_FEAT_SYNIC].desc,
-                kvm_hyperv_properties[HYPERV_FEAT_VPINDEX].desc);
-        r |= 1;
-    }
-
-    if (r) {
-        return -ENOSYS;
+        error_setg(errp, "Hyper-V %s requires Hyper-V %s",
+                   kvm_hyperv_properties[HYPERV_FEAT_SYNIC].desc,
+                   kvm_hyperv_properties[HYPERV_FEAT_VPINDEX].desc);
     }
-
-    return 0;
 }
 
 /*
@@ -1529,9 +1549,10 @@ int kvm_arch_init_vcpu(CPUState *cs)
     env->apic_bus_freq = KVM_APIC_BUS_FREQUENCY;
 
     /* Paravirtualization CPUIDs */
-    r = hyperv_expand_features(cs);
-    if (r < 0) {
-        return r;
+    hyperv_expand_features(cs, &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+        return -ENOSYS;
     }
 
     if (hyperv_enabled(cpu)) {
-- 
2.29.2




* [PATCH v3 14/19] i386: adjust the expected KVM_GET_SUPPORTED_HV_CPUID array size
  2021-01-07 15:06 [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (12 preceding siblings ...)
  2021-01-07 15:14 ` [PATCH v3 13/19] i386: switch hyperv_expand_features() to using error_setg() Vitaly Kuznetsov
@ 2021-01-07 15:14 ` Vitaly Kuznetsov
  2021-01-07 15:14 ` [PATCH v3 15/19] i386: prefer system KVM_GET_SUPPORTED_HV_CPUID ioctl over vCPU's one Vitaly Kuznetsov
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-07 15:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost, Igor Mammedov

SYNDBG leaves were recently (Linux-5.8) added to KVM but we haven't
updated the expected size of KVM_GET_SUPPORTED_HV_CPUID output in
KVM so we now make several tries before succeeding. Update the
default.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm/kvm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index d259916ccf85..d97bab04b0fd 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -964,7 +964,8 @@ static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max)
 static struct kvm_cpuid2 *get_supported_hv_cpuid(CPUState *cs)
 {
     struct kvm_cpuid2 *cpuid;
-    int max = 7; /* 0x40000000..0x40000005, 0x4000000A */
+    /* 0x40000000..0x40000005, 0x4000000A, 0x40000080..0x40000082 leaves */
+    int max = 10;
     int i;
 
     /*
-- 
2.29.2




* [PATCH v3 15/19] i386: prefer system KVM_GET_SUPPORTED_HV_CPUID ioctl over vCPU's one
  2021-01-07 15:06 [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (13 preceding siblings ...)
  2021-01-07 15:14 ` [PATCH v3 14/19] i386: adjust the expected KVM_GET_SUPPORTED_HV_CPUID array size Vitaly Kuznetsov
@ 2021-01-07 15:14 ` Vitaly Kuznetsov
  2021-01-07 15:14 ` [PATCH v3 16/19] i386: use global kvm_state in hyperv_enabled() check Vitaly Kuznetsov
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-07 15:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost, Igor Mammedov

KVM_GET_SUPPORTED_HV_CPUID was made a system-wide ioctl which can be called
prior to creating vCPUs and we are going to use that to expand Hyper-V CPU
features early. Use it when it is supported by KVM.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm/kvm.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index d97bab04b0fd..a8858b93f3d4 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -931,7 +931,8 @@ static struct {
     },
 };
 
-static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max)
+static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max,
+                                           bool do_sys_ioctl)
 {
     struct kvm_cpuid2 *cpuid;
     int r, size;
@@ -940,7 +941,11 @@ static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max)
     cpuid = g_malloc0(size);
     cpuid->nent = max;
 
-    r = kvm_vcpu_ioctl(cs, KVM_GET_SUPPORTED_HV_CPUID, cpuid);
+    if (do_sys_ioctl) {
+        r = kvm_ioctl(kvm_state, KVM_GET_SUPPORTED_HV_CPUID, cpuid);
+    } else {
+        r = kvm_vcpu_ioctl(cs, KVM_GET_SUPPORTED_HV_CPUID, cpuid);
+    }
     if (r == 0 && cpuid->nent >= max) {
         r = -E2BIG;
     }
@@ -967,13 +972,17 @@ static struct kvm_cpuid2 *get_supported_hv_cpuid(CPUState *cs)
     /* 0x40000000..0x40000005, 0x4000000A, 0x40000080..0x40000082 leaves */
     int max = 10;
     int i;
+    bool do_sys_ioctl;
+
+    do_sys_ioctl =
+        kvm_check_extension(kvm_state, KVM_CAP_SYS_HYPERV_CPUID) > 0;
 
     /*
      * When the buffer is too small, KVM_GET_SUPPORTED_HV_CPUID fails with
      * -E2BIG, however, it doesn't report back the right size. Keep increasing
      * it and re-trying until we succeed.
      */
-    while ((cpuid = try_get_hv_cpuid(cs, max)) == NULL) {
+    while ((cpuid = try_get_hv_cpuid(cs, max, do_sys_ioctl)) == NULL) {
         max++;
     }
 
@@ -983,7 +992,7 @@ static struct kvm_cpuid2 *get_supported_hv_cpuid(CPUState *cs)
      * information early, just check for the capability and set the bit
      * manually.
      */
-    if (kvm_check_extension(cs->kvm_state,
+    if (!do_sys_ioctl && kvm_check_extension(cs->kvm_state,
                             KVM_CAP_HYPERV_ENLIGHTENED_VMCS) > 0) {
         for (i = 0; i < cpuid->nent; i++) {
             if (cpuid->entries[i].function == HV_CPUID_ENLIGHTMENT_INFO) {
-- 
2.29.2




* [PATCH v3 16/19] i386: use global kvm_state in hyperv_enabled() check
  2021-01-07 15:06 [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (14 preceding siblings ...)
  2021-01-07 15:14 ` [PATCH v3 15/19] i386: prefer system KVM_GET_SUPPORTED_HV_CPUID ioctl over vCPU's one Vitaly Kuznetsov
@ 2021-01-07 15:14 ` Vitaly Kuznetsov
  2021-01-07 15:14 ` [PATCH v3 17/19] i386: expand Hyper-V features during CPU feature expansion time Vitaly Kuznetsov
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-07 15:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost, Igor Mammedov

The hyperv_enabled() check does not need the vCPU-specific KVM state. Use
the global kvm_state instead: this is required when feature expansion
happens early, before the vCPU-specific KVM state is created.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm/kvm.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index a8858b93f3d4..36309cda3860 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -718,8 +718,7 @@ unsigned long kvm_arch_vcpu_id(CPUState *cs)
 
 static bool hyperv_enabled(X86CPU *cpu)
 {
-    CPUState *cs = CPU(cpu);
-    return kvm_check_extension(cs->kvm_state, KVM_CAP_HYPERV) > 0 &&
+    return kvm_check_extension(kvm_state, KVM_CAP_HYPERV) > 0 &&
         ((cpu->hyperv_spinlock_attempts != HYPERV_SPINLOCK_NEVER_NOTIFY) ||
          cpu->hyperv_features || cpu->hyperv_passthrough);
 }
-- 
2.29.2




* [PATCH v3 17/19] i386: expand Hyper-V features during CPU feature expansion time
  2021-01-07 15:06 [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (15 preceding siblings ...)
  2021-01-07 15:14 ` [PATCH v3 16/19] i386: use global kvm_state in hyperv_enabled() check Vitaly Kuznetsov
@ 2021-01-07 15:14 ` Vitaly Kuznetsov
  2021-01-07 15:14 ` [PATCH v3 18/19] i386: provide simple 'hv-default=on' option Vitaly Kuznetsov
  2021-01-07 15:14 ` [PATCH v3 19/19] qtest/hyperv: Introduce a simple hyper-v test Vitaly Kuznetsov
  18 siblings, 0 replies; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-07 15:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost, Igor Mammedov

To make Hyper-V features appear in e.g. QMP query-cpu-model-expansion we
need to expand and set the corresponding CPUID leaves early. Modify
x86_cpu_get_supported_feature_word() to call the newly introduced Hyper-V
specific kvm_hv_get_supported_cpuid() instead of
kvm_arch_get_supported_cpuid(). We can't use kvm_arch_get_supported_cpuid()
as Hyper-V specific CPUID leaves intersect with KVM's.

Note, early expansion will only happen when KVM supports the system-wide
KVM_GET_SUPPORTED_HV_CPUID ioctl (KVM_CAP_SYS_HYPERV_CPUID).

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/cpu.c          |  4 ++++
 target/i386/kvm/kvm-stub.c |  5 +++++
 target/i386/kvm/kvm.c      | 15 ++++++++++++---
 target/i386/kvm/kvm_i386.h |  1 +
 4 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 9f6cabfc7787..48007a876e32 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6398,6 +6398,10 @@ static void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
     if (env->cpuid_xlevel2 == UINT32_MAX) {
         env->cpuid_xlevel2 = env->cpuid_min_xlevel2;
     }
+
+    if (kvm_enabled()) {
+        kvm_hyperv_expand_features(cpu, errp);
+    }
 }
 
 /*
diff --git a/target/i386/kvm/kvm-stub.c b/target/i386/kvm/kvm-stub.c
index 0a163ae207c5..20994c3a16bf 100644
--- a/target/i386/kvm/kvm-stub.c
+++ b/target/i386/kvm/kvm-stub.c
@@ -44,3 +44,8 @@ bool kvm_hv_evmcs_available(void)
 {
     return false;
 }
+
+void kvm_hyperv_expand_features(X86CPU *cpu, Error **errp)
+{
+    return;
+}
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 36309cda3860..40c5589c6af6 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1219,13 +1219,22 @@ static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t func, int reg)
  * of 'hv_passthrough' mode and fills the environment with all supported
  * Hyper-V features.
  */
-static void hyperv_expand_features(CPUState *cs, Error **errp)
+void kvm_hyperv_expand_features(X86CPU *cpu, Error **errp)
 {
-    X86CPU *cpu = X86_CPU(cs);
+    CPUState *cs = CPU(cpu);
 
     if (!hyperv_enabled(cpu))
         return;
 
+    /*
+     * When kvm_hyperv_expand_features is called at CPU feature expansion
+     * time per-CPU kvm_state is not available yet so we can only proceed
+     * when KVM_CAP_SYS_HYPERV_CPUID is supported.
+     */
+    if (!cs->kvm_state &&
+        !kvm_check_extension(kvm_state, KVM_CAP_SYS_HYPERV_CPUID))
+        return;
+
     if (cpu->hyperv_passthrough) {
         cpu->hyperv_vendor_id[0] =
             hv_cpuid_get_host(cs, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_EBX);
@@ -1558,7 +1567,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
     env->apic_bus_freq = KVM_APIC_BUS_FREQUENCY;
 
     /* Paravirtualization CPUIDs */
-    hyperv_expand_features(cs, &local_err);
+    kvm_hyperv_expand_features(cpu, &local_err);
     if (local_err) {
         error_report_err(local_err);
         return -ENOSYS;
diff --git a/target/i386/kvm/kvm_i386.h b/target/i386/kvm/kvm_i386.h
index 08968cfb33f1..f0d8afbc53e6 100644
--- a/target/i386/kvm/kvm_i386.h
+++ b/target/i386/kvm/kvm_i386.h
@@ -48,6 +48,7 @@ bool kvm_has_waitpkg(void);
 
 bool kvm_hv_vpindex_settable(void);
 bool kvm_hv_evmcs_available(void);
+void kvm_hyperv_expand_features(X86CPU *cpu, Error **errp);
 
 uint64_t kvm_swizzle_msi_ext_dest_id(uint64_t address);
 
-- 
2.29.2




* [PATCH v3 18/19] i386: provide simple 'hv-default=on' option
  2021-01-07 15:06 [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (16 preceding siblings ...)
  2021-01-07 15:14 ` [PATCH v3 17/19] i386: expand Hyper-V features during CPU feature expansion time Vitaly Kuznetsov
@ 2021-01-07 15:14 ` Vitaly Kuznetsov
  2021-01-15  2:11   ` Igor Mammedov
  2021-01-07 15:14 ` [PATCH v3 19/19] qtest/hyperv: Introduce a simple hyper-v test Vitaly Kuznetsov
  18 siblings, 1 reply; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-07 15:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost, Igor Mammedov

Enabling Hyper-V emulation for a Windows VM is a tiring experience as it
requires listing all currently supported enlightenments ("hv-*" CPU
features) explicitly. We do have 'hv-passthrough' mode enabling
everything but it can't be used in production as it prevents migration.

Introduce a simple 'hv-default=on' CPU flag enabling all currently supported
Hyper-V enlightenments. Later, when new enlightenments get implemented, the
compat_props mechanism will be used to disable them for legacy machine types;
this will keep 'hv-default=on' configurations migratable.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 docs/hyperv.txt   | 16 +++++++++++++---
 target/i386/cpu.c | 38 ++++++++++++++++++++++++++++++++++++++
 target/i386/cpu.h |  5 +++++
 3 files changed, 56 insertions(+), 3 deletions(-)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index 5df00da54fc4..a54c066cab09 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -17,10 +17,20 @@ compatible hypervisor and use Hyper-V specific features.
 
 2. Setup
 =========
-No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
-QEMU, individual enlightenments can be enabled through CPU flags, e.g:
+All currently supported Hyper-V enlightenments can be enabled by specifying
+'hv-default=on' CPU flag:
 
-  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
+  qemu-system-x86_64 --enable-kvm --cpu host,hv-default ...
+
+Alternatively, it is possible to do fine-grained enablement through CPU flags,
+e.g:
+
+  qemu-system-x86_64 --enable-kvm --cpu host,hv-relaxed,hv-vpindex,hv-time ...
+
+It is also possible to disable individual enlightenments from the default list,
+this can be used for debugging purposes:
+
+  qemu-system-x86_64 --enable-kvm --cpu host,hv-default=on,hv-evmcs=off ...
 
 Sometimes there are dependencies between enlightenments, QEMU is supposed to
 check that the supplied configuration is sane.
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 48007a876e32..99338de00f78 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -4552,6 +4552,24 @@ static void x86_cpuid_set_tsc_freq(Object *obj, Visitor *v, const char *name,
     cpu->env.tsc_khz = cpu->env.user_tsc_khz = value / 1000;
 }
 
+static bool x86_hv_default_get(Object *obj, Error **errp)
+{
+    X86CPU *cpu = X86_CPU(obj);
+
+    return cpu->hyperv_default;
+}
+
+static void x86_hv_default_set(Object *obj, bool value, Error **errp)
+{
+    X86CPU *cpu = X86_CPU(obj);
+
+    cpu->hyperv_default = value;
+
+    if (value) {
+        cpu->hyperv_features |= cpu->hyperv_default_features;
+    }
+}
+
 /* Generic getter for "feature-words" and "filtered-features" properties */
 static void x86_cpu_get_feature_words(Object *obj, Visitor *v,
                                       const char *name, void *opaque,
@@ -6955,10 +6973,26 @@ static void x86_cpu_initfn(Object *obj)
     object_property_add_alias(obj, "pause_filter", obj, "pause-filter");
     object_property_add_alias(obj, "sse4_1", obj, "sse4.1");
     object_property_add_alias(obj, "sse4_2", obj, "sse4.2");
+    object_property_add_alias(obj, "hv_default", obj, "hv-default");
 
     if (xcc->model) {
         x86_cpu_load_model(cpu, xcc->model);
     }
+
+    /* Hyper-V features enabled with 'hv-default=on' */
+    cpu->hyperv_default_features = BIT(HYPERV_FEAT_RELAXED) |
+        BIT(HYPERV_FEAT_VAPIC) | BIT(HYPERV_FEAT_TIME) |
+        BIT(HYPERV_FEAT_CRASH) | BIT(HYPERV_FEAT_RESET) |
+        BIT(HYPERV_FEAT_VPINDEX) | BIT(HYPERV_FEAT_RUNTIME) |
+        BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_STIMER) |
+        BIT(HYPERV_FEAT_FREQUENCIES) | BIT(HYPERV_FEAT_REENLIGHTENMENT) |
+        BIT(HYPERV_FEAT_TLBFLUSH) | BIT(HYPERV_FEAT_IPI) |
+        BIT(HYPERV_FEAT_STIMER_DIRECT);
+
+    /* Enlightened VMCS is only available on Intel/VMX */
+    if (kvm_hv_evmcs_available()) {
+        cpu->hyperv_default_features |= BIT(HYPERV_FEAT_EVMCS);
+    }
 }
 
 static int64_t x86_cpu_get_arch_id(CPUState *cs)
@@ -7285,6 +7319,10 @@ static void x86_cpu_common_class_init(ObjectClass *oc, void *data)
                               x86_cpu_get_crash_info_qom, NULL, NULL, NULL);
 #endif
 
+    object_class_property_add_bool(oc, "hv-default",
+                              x86_hv_default_get,
+                              x86_hv_default_set);
+
     for (w = 0; w < FEATURE_WORDS; w++) {
         int bitnr;
         for (bitnr = 0; bitnr < 64; bitnr++) {
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 6220cb2cabb9..8a484becb6b9 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1657,6 +1657,11 @@ struct X86CPU {
     bool hyperv_synic_kvm_only;
     uint64_t hyperv_features;
     bool hyperv_passthrough;
+
+    /* 'hv-default' enablement */
+    uint64_t hyperv_default_features;
+    bool hyperv_default;
+
     OnOffAuto hyperv_no_nonarch_cs;
     uint32_t hyperv_vendor_id[3];
     uint32_t hyperv_interface_id[4];
-- 
2.29.2




* [PATCH v3 19/19] qtest/hyperv: Introduce a simple hyper-v test
  2021-01-07 15:06 [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
                   ` (17 preceding siblings ...)
  2021-01-07 15:14 ` [PATCH v3 18/19] i386: provide simple 'hv-default=on' option Vitaly Kuznetsov
@ 2021-01-07 15:14 ` Vitaly Kuznetsov
  18 siblings, 0 replies; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-07 15:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Marcelo Tosatti, Eduardo Habkost, Igor Mammedov

To begin with, just test 'hv-default', 'hv-passthrough' and a couple
of custom Hyper-V enlightenment configurations through QMP. Later, it
would be great to complement this by checking CPUID values from within the
guest.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 MAINTAINERS               |   1 +
 tests/qtest/hyperv-test.c | 238 ++++++++++++++++++++++++++++++++++++++
 tests/qtest/meson.build   |   3 +-
 3 files changed, 241 insertions(+), 1 deletion(-)
 create mode 100644 tests/qtest/hyperv-test.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 171e7047aaaa..bb44007d795d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1495,6 +1495,7 @@ F: hw/isa/apm.c
 F: include/hw/isa/apm.h
 F: tests/test-x86-cpuid.c
 F: tests/qtest/test-x86-cpuid-compat.c
+F: tests/qtest/hyperv-test.c
 
 PC Chipset
 M: Michael S. Tsirkin <mst@redhat.com>
diff --git a/tests/qtest/hyperv-test.c b/tests/qtest/hyperv-test.c
new file mode 100644
index 000000000000..029d1f8cb46e
--- /dev/null
+++ b/tests/qtest/hyperv-test.c
@@ -0,0 +1,238 @@
+/*
+ * Hyper-V emulation CPU feature test cases
+ *
+ * Copyright (c) 2021 Red Hat Inc.
+ * Authors:
+ *  Vitaly Kuznetsov <vkuznets@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#include "qemu/osdep.h"
+#include <linux/kvm.h>
+#include <sys/ioctl.h>
+
+#include "qemu/bitops.h"
+#include "libqos/libqtest.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qjson.h"
+
+#define MACHINE_KVM "-machine pc-q35-5.2 -accel kvm "
+#define QUERY_HEAD  "{ 'execute': 'query-cpu-model-expansion', " \
+                    "  'arguments': { 'type': 'full', "
+#define QUERY_TAIL  "}}"
+
+static bool kvm_enabled(QTestState *qts)
+{
+    QDict *resp, *qdict;
+    bool enabled;
+
+    resp = qtest_qmp(qts, "{ 'execute': 'query-kvm' }");
+    g_assert(qdict_haskey(resp, "return"));
+    qdict = qdict_get_qdict(resp, "return");
+    g_assert(qdict_haskey(qdict, "enabled"));
+    enabled = qdict_get_bool(qdict, "enabled");
+    qobject_unref(resp);
+
+    return enabled;
+}
+
+static bool kvm_has_sys_hyperv_cpuid(void)
+{
+    int fd = open("/dev/kvm", O_RDWR);
+    int ret;
+
+    g_assert(fd >= 0);
+
+    ret = ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_SYS_HYPERV_CPUID);
+
+    close(fd);
+
+    return ret > 0;
+}
+
+static QDict *do_query_no_props(QTestState *qts, const char *cpu_type)
+{
+    return qtest_qmp(qts, QUERY_HEAD "'model': { 'name': %s }"
+                          QUERY_TAIL, cpu_type);
+}
+
+static bool resp_has_props(QDict *resp)
+{
+    QDict *qdict;
+
+    g_assert(resp);
+
+    if (!qdict_haskey(resp, "return")) {
+        return false;
+    }
+    qdict = qdict_get_qdict(resp, "return");
+
+    if (!qdict_haskey(qdict, "model")) {
+        return false;
+    }
+    qdict = qdict_get_qdict(qdict, "model");
+
+    return qdict_haskey(qdict, "props");
+}
+
+static QDict *resp_get_props(QDict *resp)
+{
+    QDict *qdict;
+
+    g_assert(resp);
+    g_assert(resp_has_props(resp));
+
+    qdict = qdict_get_qdict(resp, "return");
+    qdict = qdict_get_qdict(qdict, "model");
+    qdict = qdict_get_qdict(qdict, "props");
+
+    return qdict;
+}
+
+static bool resp_get_feature(QDict *resp, const char *feature)
+{
+    QDict *props;
+
+    g_assert(resp);
+    g_assert(resp_has_props(resp));
+    props = resp_get_props(resp);
+    g_assert(qdict_get(props, feature));
+    return qdict_get_bool(props, feature);
+}
+
+#define assert_has_feature(qts, cpu_type, feature)                     \
+({                                                                     \
+    QDict *_resp = do_query_no_props(qts, cpu_type);                   \
+    g_assert(_resp);                                                   \
+    g_assert(resp_has_props(_resp));                                   \
+    g_assert(qdict_get(resp_get_props(_resp), feature));               \
+    qobject_unref(_resp);                                              \
+})
+
+#define resp_assert_feature(resp, feature, expected_value)             \
+({                                                                     \
+    QDict *_props;                                                     \
+                                                                       \
+    g_assert(resp);                                                    \
+    g_assert(resp_has_props(resp));                                    \
+    _props = resp_get_props(resp);                                     \
+    g_assert(qdict_get(_props, feature));                              \
+    g_assert(qdict_get_bool(_props, feature) == (expected_value));     \
+})
+
+#define assert_feature(qts, cpu_type, feature, expected_value)         \
+({                                                                     \
+    QDict *_resp;                                                      \
+                                                                       \
+    _resp = do_query_no_props(qts, cpu_type);                          \
+    g_assert(_resp);                                                   \
+    resp_assert_feature(_resp, feature, expected_value);               \
+    qobject_unref(_resp);                                              \
+})
+
+#define assert_has_feature_enabled(qts, cpu_type, feature)             \
+    assert_feature(qts, cpu_type, feature, true)
+
+#define assert_has_feature_disabled(qts, cpu_type, feature)            \
+    assert_feature(qts, cpu_type, feature, false)
+
+static void test_assert_hyperv_all(QTestState *qts)
+{
+    QDict *resp;
+
+    assert_has_feature_enabled(qts, "host", "hv-relaxed");
+    assert_has_feature_enabled(qts, "host", "hv-vapic");
+    assert_has_feature_enabled(qts, "host", "hv-vpindex");
+    assert_has_feature_enabled(qts, "host", "hv-runtime");
+    assert_has_feature_enabled(qts, "host", "hv-crash");
+    assert_has_feature_enabled(qts, "host", "hv-time");
+    assert_has_feature_enabled(qts, "host", "hv-synic");
+    assert_has_feature_enabled(qts, "host", "hv-stimer");
+    assert_has_feature_enabled(qts, "host", "hv-tlbflush");
+    assert_has_feature_enabled(qts, "host", "hv-ipi");
+    assert_has_feature_enabled(qts, "host", "hv-reset");
+    assert_has_feature_enabled(qts, "host", "hv-frequencies");
+    assert_has_feature_enabled(qts, "host", "hv-reenlightenment");
+    assert_has_feature_enabled(qts, "host", "hv-stimer-direct");
+
+    resp = do_query_no_props(qts, "host");
+    if (resp_get_feature(resp, "vmx")) {
+        assert_has_feature_enabled(qts, "host", "hv-evmcs");
+    } else {
+        assert_has_feature_disabled(qts, "host", "hv-evmcs");
+    }
+
+}
+
+static void test_query_cpu_hv_default(const void *data)
+{
+    QTestState *qts;
+
+    qts = qtest_init(MACHINE_KVM "-cpu host,hv-default");
+
+    test_assert_hyperv_all(qts);
+
+    qtest_quit(qts);
+}
+
+static void test_query_cpu_hv_default_minus(const void *data)
+{
+    QTestState *qts;
+
+    qts = qtest_init(MACHINE_KVM "-cpu host,hv-default,hv_ipi=off");
+
+    assert_has_feature_enabled(qts, "host", "hv-tlbflush");
+    assert_has_feature_disabled(qts, "host", "hv-ipi");
+
+    qtest_quit(qts);
+}
+
+static void test_query_cpu_hv_custom(const void *data)
+{
+    QTestState *qts;
+
+    qts = qtest_init(MACHINE_KVM "-cpu host,hv-vpindex");
+
+    assert_has_feature_enabled(qts, "host", "hv-vpindex");
+    assert_has_feature_disabled(qts, "host", "hv-synic");
+
+    qtest_quit(qts);
+}
+
+static void test_query_cpu_hv_passthrough(const void *data)
+{
+    QTestState *qts;
+
+    qts = qtest_init(MACHINE_KVM "-cpu host,hv-passthrough");
+    if (!kvm_enabled(qts)) {
+        qtest_quit(qts);
+        return;
+    }
+
+    test_assert_hyperv_all(qts);
+
+    qtest_quit(qts);
+}
+
+int main(int argc, char **argv)
+{
+    const char *arch = qtest_get_arch();
+
+    g_test_init(&argc, &argv, NULL);
+
+    if (!strcmp(arch, "i386") || !strcmp(arch, "x86_64")) {
+        qtest_add_data_func("/hyperv/hv-default",
+                            NULL, test_query_cpu_hv_default);
+        qtest_add_data_func("/hyperv/hv-default-minus",
+                            NULL, test_query_cpu_hv_default_minus);
+        qtest_add_data_func("/hyperv/hv-custom",
+                            NULL, test_query_cpu_hv_custom);
+        if (kvm_has_sys_hyperv_cpuid()) {
+            qtest_add_data_func("/hyperv/hv-passthrough",
+                                NULL, test_query_cpu_hv_passthrough);
+        }
+    }
+
+    return g_test_run();
+}
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 6a67c538be12..fcbe425626f4 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -64,7 +64,8 @@ qtests_i386 = \
    'vmgenid-test',
    'migration-test',
    'test-x86-cpuid-compat',
-   'numa-test']
+   'numa-test',
+   'hyperv-test']
 
 dbus_daemon = find_program('dbus-daemon', required: false)
 if dbus_daemon.found() and config_host.has_key('GDBUS_CODEGEN')
-- 
2.29.2




* Re: [PATCH v3 18/19] i386: provide simple 'hv-default=on' option
  2021-01-07 15:14 ` [PATCH v3 18/19] i386: provide simple 'hv-default=on' option Vitaly Kuznetsov
@ 2021-01-15  2:11   ` Igor Mammedov
  2021-01-15  9:20     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 34+ messages in thread
From: Igor Mammedov @ 2021-01-15  2:11 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Marcelo Tosatti, qemu-devel, Eduardo Habkost

On Thu,  7 Jan 2021 16:14:49 +0100
Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> Enabling Hyper-V emulation for a Windows VM is a tiring experience as it
> requires listing all currently supported enlightenments ("hv-*" CPU
> features) explicitly. We do have 'hv-passthrough' mode enabling
> everything but it can't be used in production as it prevents migration.
> 
> Introduce a simple 'hv-default=on' CPU flag enabling all currently supported
> Hyper-V enlightenments. Later, when new enlightenments get implemented,
> compat_props mechanism will be used to disable them for legacy machine types,
> this will keep 'hv-default=on' configurations migratable.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  docs/hyperv.txt   | 16 +++++++++++++---
>  target/i386/cpu.c | 38 ++++++++++++++++++++++++++++++++++++++
>  target/i386/cpu.h |  5 +++++
>  3 files changed, 56 insertions(+), 3 deletions(-)
> 
> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
> index 5df00da54fc4..a54c066cab09 100644
> --- a/docs/hyperv.txt
> +++ b/docs/hyperv.txt
> @@ -17,10 +17,20 @@ compatible hypervisor and use Hyper-V specific features.
>  
>  2. Setup
>  =========
> -No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
> -QEMU, individual enlightenments can be enabled through CPU flags, e.g:
> +All currently supported Hyper-V enlightenments can be enabled by specifying
> +'hv-default=on' CPU flag:
>  
> -  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default ...
> +
> +Alternatively, it is possible to do fine-grained enablement through CPU flags,
> +e.g:
> +
> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-relaxed,hv-vpindex,hv-time ...

I'd put here not '...' but rather the recommended list of flags, and update
it every time a new feature is added, if necessary.

(Not to mention that if we had had it to begin with, the new 'hv-default' wouldn't
be necessary. I still see it as functionality duplication, but I will not oppose it.)


> +It is also possible to disable individual enlightenments from the default list,
> +this can be used for debugging purposes:
> +
> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default=on,hv-evmcs=off ...
>  
>  Sometimes there are dependencies between enlightenments, QEMU is supposed to
>  check that the supplied configuration is sane.
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 48007a876e32..99338de00f78 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -4552,6 +4552,24 @@ static void x86_cpuid_set_tsc_freq(Object *obj, Visitor *v, const char *name,
>      cpu->env.tsc_khz = cpu->env.user_tsc_khz = value / 1000;
>  }
>  
> +static bool x86_hv_default_get(Object *obj, Error **errp)
> +{
> +    X86CPU *cpu = X86_CPU(obj);
> +
> +    return cpu->hyperv_default;
> +}
> +
> +static void x86_hv_default_set(Object *obj, bool value, Error **errp)
> +{
> +    X86CPU *cpu = X86_CPU(obj);
> +
> +    cpu->hyperv_default = value;
> +
> +    if (value) {
> +        cpu->hyperv_features |= cpu->hyperv_default_features;

s/|=/=/ please,
i.e. the option overrides whatever was specified before, to keep semantics consistent.

> +    }
> +}
> +
>  /* Generic getter for "feature-words" and "filtered-features" properties */
>  static void x86_cpu_get_feature_words(Object *obj, Visitor *v,
>                                        const char *name, void *opaque,
> @@ -6955,10 +6973,26 @@ static void x86_cpu_initfn(Object *obj)
>      object_property_add_alias(obj, "pause_filter", obj, "pause-filter");
>      object_property_add_alias(obj, "sse4_1", obj, "sse4.1");
>      object_property_add_alias(obj, "sse4_2", obj, "sse4.2");
> +    object_property_add_alias(obj, "hv_default", obj, "hv-default");
>  
>      if (xcc->model) {
>          x86_cpu_load_model(cpu, xcc->model);
>      }
> +
> +    /* Hyper-V features enabled with 'hv-default=on' */
> +    cpu->hyperv_default_features = BIT(HYPERV_FEAT_RELAXED) |
> +        BIT(HYPERV_FEAT_VAPIC) | BIT(HYPERV_FEAT_TIME) |
> +        BIT(HYPERV_FEAT_CRASH) | BIT(HYPERV_FEAT_RESET) |
> +        BIT(HYPERV_FEAT_VPINDEX) | BIT(HYPERV_FEAT_RUNTIME) |
> +        BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_STIMER) |
> +        BIT(HYPERV_FEAT_FREQUENCIES) | BIT(HYPERV_FEAT_REENLIGHTENMENT) |
> +        BIT(HYPERV_FEAT_TLBFLUSH) | BIT(HYPERV_FEAT_IPI) |
> +        BIT(HYPERV_FEAT_STIMER_DIRECT);
> +
> +    /* Enlightened VMCS is only available on Intel/VMX */
> +    if (kvm_hv_evmcs_available()) {
> +        cpu->hyperv_default_features |= BIT(HYPERV_FEAT_EVMCS);
> +    }
what if the VM is migrated to another host without evmcs,
will it change CPUID?

>  }
>  
>  static int64_t x86_cpu_get_arch_id(CPUState *cs)
> @@ -7285,6 +7319,10 @@ static void x86_cpu_common_class_init(ObjectClass *oc, void *data)
>                                x86_cpu_get_crash_info_qom, NULL, NULL, NULL);
>  #endif
>  
> +    object_class_property_add_bool(oc, "hv-default",
> +                              x86_hv_default_get,
> +                              x86_hv_default_set);
> +
>      for (w = 0; w < FEATURE_WORDS; w++) {
>          int bitnr;
>          for (bitnr = 0; bitnr < 64; bitnr++) {
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 6220cb2cabb9..8a484becb6b9 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1657,6 +1657,11 @@ struct X86CPU {
>      bool hyperv_synic_kvm_only;
>      uint64_t hyperv_features;
>      bool hyperv_passthrough;
> +
> +    /* 'hv-default' enablement */
> +    uint64_t hyperv_default_features;
> +    bool hyperv_default;
> +
>      OnOffAuto hyperv_no_nonarch_cs;
>      uint32_t hyperv_vendor_id[3];
>      uint32_t hyperv_interface_id[4];




* Re: [PATCH v3 18/19] i386: provide simple 'hv-default=on' option
  2021-01-15  2:11   ` Igor Mammedov
@ 2021-01-15  9:20     ` Vitaly Kuznetsov
  2021-01-20 13:13       ` Igor Mammedov
  0 siblings, 1 reply; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-15  9:20 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: Paolo Bonzini, Marcelo Tosatti, qemu-devel, Eduardo Habkost

Igor Mammedov <imammedo@redhat.com> writes:

> On Thu,  7 Jan 2021 16:14:49 +0100
> Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
>> Enabling Hyper-V emulation for a Windows VM is a tiring experience as it
>> requires listing all currently supported enlightenments ("hv-*" CPU
>> features) explicitly. We do have 'hv-passthrough' mode enabling
>> everything but it can't be used in production as it prevents migration.
>> 
>> Introduce a simple 'hv-default=on' CPU flag enabling all currently supported
>> Hyper-V enlightenments. Later, when new enlightenments get implemented,
>> compat_props mechanism will be used to disable them for legacy machine types,
>> this will keep 'hv-default=on' configurations migratable.
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  docs/hyperv.txt   | 16 +++++++++++++---
>>  target/i386/cpu.c | 38 ++++++++++++++++++++++++++++++++++++++
>>  target/i386/cpu.h |  5 +++++
>>  3 files changed, 56 insertions(+), 3 deletions(-)
>> 
>> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
>> index 5df00da54fc4..a54c066cab09 100644
>> --- a/docs/hyperv.txt
>> +++ b/docs/hyperv.txt
>> @@ -17,10 +17,20 @@ compatible hypervisor and use Hyper-V specific features.
>>  
>>  2. Setup
>>  =========
>> -No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
>> -QEMU, individual enlightenments can be enabled through CPU flags, e.g:
>> +All currently supported Hyper-V enlightenments can be enabled by specifying
>> +'hv-default=on' CPU flag:
>>  
>> -  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
>> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default ...
>> +
>> +Alternatively, it is possible to do fine-grained enablement through CPU flags,
>> +e.g:
>> +
>> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-relaxed,hv-vpindex,hv-time ...
>
> I'd put here not '...' but rather recommended list of flags, and update
> it every time when new feature added if necessary.
>

This is an example of fine-grained enablement, there is no point in putting
all the existing flags there (hv-default is the only recommended way
now, the rest is 'expert'/'debugging').

> (not to mention that if we had it to begin with, then new 'hv-default' won't
> be necessary, I still see it as functionality duplication but I will not oppose it)
>

Unfortunately, upper layer tools don't read this doc and update
themselves to enable new features when they appear. Similarly, when
these tools use '-machine q35' they get all the new features we add
automatically, right?

>
>> +It is also possible to disable individual enlightenments from the default list,
>> +this can be used for debugging purposes:
>> +
>> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default=on,hv-evmcs=off ...
>>  
>>  Sometimes there are dependencies between enlightenments, QEMU is supposed to
>>  check that the supplied configuration is sane.
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index 48007a876e32..99338de00f78 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -4552,6 +4552,24 @@ static void x86_cpuid_set_tsc_freq(Object *obj, Visitor *v, const char *name,
>>      cpu->env.tsc_khz = cpu->env.user_tsc_khz = value / 1000;
>>  }
>>  
>> +static bool x86_hv_default_get(Object *obj, Error **errp)
>> +{
>> +    X86CPU *cpu = X86_CPU(obj);
>> +
>> +    return cpu->hyperv_default;
>> +}
>> +
>> +static void x86_hv_default_set(Object *obj, bool value, Error **errp)
>> +{
>> +    X86CPU *cpu = X86_CPU(obj);
>> +
>> +    cpu->hyperv_default = value;
>> +
>> +    if (value) {
>> +        cpu->hyperv_features |= cpu->hyperv_default_features;
>
> s/|="/=/ please,
> i.e. no option overrides whatever was specified before to keep semantics consistent.
>

Hm,

this doesn't matter for the most recent machine type as
hyperv_default_features has all the features but imagine you're running
an older machine type which doesn't have 'hv_feature'. Now your
suggestion is 

if I do:

'hv_default,hv_feature=on' I will get "hyperv_default_features | hv_feature"

but if I do

'hv_feature=on,hv_default' I will just get 'hyperv_default_features'
(as hv_default enablement will overwrite everything)

How is this consistent?

>> +    }
>> +}
>> +
>>  /* Generic getter for "feature-words" and "filtered-features" properties */
>>  static void x86_cpu_get_feature_words(Object *obj, Visitor *v,
>>                                        const char *name, void *opaque,
>> @@ -6955,10 +6973,26 @@ static void x86_cpu_initfn(Object *obj)
>>      object_property_add_alias(obj, "pause_filter", obj, "pause-filter");
>>      object_property_add_alias(obj, "sse4_1", obj, "sse4.1");
>>      object_property_add_alias(obj, "sse4_2", obj, "sse4.2");
>> +    object_property_add_alias(obj, "hv_default", obj, "hv-default");
>>  
>>      if (xcc->model) {
>>          x86_cpu_load_model(cpu, xcc->model);
>>      }
>> +
>> +    /* Hyper-V features enabled with 'hv-default=on' */
>> +    cpu->hyperv_default_features = BIT(HYPERV_FEAT_RELAXED) |
>> +        BIT(HYPERV_FEAT_VAPIC) | BIT(HYPERV_FEAT_TIME) |
>> +        BIT(HYPERV_FEAT_CRASH) | BIT(HYPERV_FEAT_RESET) |
>> +        BIT(HYPERV_FEAT_VPINDEX) | BIT(HYPERV_FEAT_RUNTIME) |
>> +        BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_STIMER) |
>> +        BIT(HYPERV_FEAT_FREQUENCIES) | BIT(HYPERV_FEAT_REENLIGHTENMENT) |
>> +        BIT(HYPERV_FEAT_TLBFLUSH) | BIT(HYPERV_FEAT_IPI) |
>> +        BIT(HYPERV_FEAT_STIMER_DIRECT);
>> +
>> +    /* Enlightened VMCS is only available on Intel/VMX */
>> +    if (kvm_hv_evmcs_available()) {
>> +        cpu->hyperv_default_features |= BIT(HYPERV_FEAT_EVMCS);
>> +    }
> what if VVM is migrated to another host without evmcs,
> will it change CPUID?
>

Evmcs is tightly coupled with VMX, we can't migrate when it's not
there.

>>  }
>>  
>>  static int64_t x86_cpu_get_arch_id(CPUState *cs)
>> @@ -7285,6 +7319,10 @@ static void x86_cpu_common_class_init(ObjectClass *oc, void *data)
>>                                x86_cpu_get_crash_info_qom, NULL, NULL, NULL);
>>  #endif
>>  
>> +    object_class_property_add_bool(oc, "hv-default",
>> +                              x86_hv_default_get,
>> +                              x86_hv_default_set);
>> +
>>      for (w = 0; w < FEATURE_WORDS; w++) {
>>          int bitnr;
>>          for (bitnr = 0; bitnr < 64; bitnr++) {
>> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
>> index 6220cb2cabb9..8a484becb6b9 100644
>> --- a/target/i386/cpu.h
>> +++ b/target/i386/cpu.h
>> @@ -1657,6 +1657,11 @@ struct X86CPU {
>>      bool hyperv_synic_kvm_only;
>>      uint64_t hyperv_features;
>>      bool hyperv_passthrough;
>> +
>> +    /* 'hv-default' enablement */
>> +    uint64_t hyperv_default_features;
>> +    bool hyperv_default;
>> +
>>      OnOffAuto hyperv_no_nonarch_cs;
>>      uint32_t hyperv_vendor_id[3];
>>      uint32_t hyperv_interface_id[4];
>

-- 
Vitaly




* Re: [PATCH v3 18/19] i386: provide simple 'hv-default=on' option
  2021-01-15  9:20     ` Vitaly Kuznetsov
@ 2021-01-20 13:13       ` Igor Mammedov
  2021-01-20 14:38         ` Vitaly Kuznetsov
  2021-01-20 19:55         ` Eduardo Habkost
  0 siblings, 2 replies; 34+ messages in thread
From: Igor Mammedov @ 2021-01-20 13:13 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Marcelo Tosatti, qemu-devel, Eduardo Habkost

On Fri, 15 Jan 2021 10:20:23 +0100
Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> Igor Mammedov <imammedo@redhat.com> writes:
> 
> > On Thu,  7 Jan 2021 16:14:49 +0100
> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >  
> >> Enabling Hyper-V emulation for a Windows VM is a tiring experience as it
> >> requires listing all currently supported enlightenments ("hv-*" CPU
> >> features) explicitly. We do have 'hv-passthrough' mode enabling
> >> everything but it can't be used in production as it prevents migration.
> >> 
> >> Introduce a simple 'hv-default=on' CPU flag enabling all currently supported
> >> Hyper-V enlightenments. Later, when new enlightenments get implemented,
> >> compat_props mechanism will be used to disable them for legacy machine types,
> >> this will keep 'hv-default=on' configurations migratable.
> >> 
> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> >> ---
> >>  docs/hyperv.txt   | 16 +++++++++++++---
> >>  target/i386/cpu.c | 38 ++++++++++++++++++++++++++++++++++++++
> >>  target/i386/cpu.h |  5 +++++
> >>  3 files changed, 56 insertions(+), 3 deletions(-)
> >> 
> >> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
> >> index 5df00da54fc4..a54c066cab09 100644
> >> --- a/docs/hyperv.txt
> >> +++ b/docs/hyperv.txt
> >> @@ -17,10 +17,20 @@ compatible hypervisor and use Hyper-V specific features.
> >>  
> >>  2. Setup
> >>  =========
> >> -No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
> >> -QEMU, individual enlightenments can be enabled through CPU flags, e.g:
> >> +All currently supported Hyper-V enlightenments can be enabled by specifying
> >> +'hv-default=on' CPU flag:
> >>  
> >> -  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default ...
> >> +
> >> +Alternatively, it is possible to do fine-grained enablement through CPU flags,
> >> +e.g:
> >> +
> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-relaxed,hv-vpindex,hv-time ...  
> >
> > I'd put here not '...' but rather recommended list of flags, and update
> > it every time when new feature added if necessary.
> >  

1)
 
> This is an example of fine-grained enablement, there is no point to put
> all the existing flags there (hv-default is the only recommended way
> now, the rest is 'expert'/'debugging').
so users are kept in the dark about what hv-default enables/disables (and it might
depend on the machine version on top of that). That doesn't look like good
documentation to me (sure, everyone can go and read the source code and try to
figure out how it's supposed to work).

>
> > (not to mention that if we had it to begin with, then new 'hv-default' won't
> > be necessary, I still see it as functionality duplication but I will not oppose it)
> >  
> 
> Unfortunately, upper layer tools don't read this doc and update
> themselves to enable new features when they appear.
rant: (just merge all of libvirt into QEMU, and make VM configuration less low-level.
Why stop there, just merge with yet another upper layer; it would save us a lot
on communication protocols and simplify VM creation even more,
and no one would have to read docs and write anything new on top.)
There should be a limit somewhere, where QEMU's job ends and others pile hw abstraction
layers on top of it.

> Similarly, if when these tools use '-machine q35' they get all the new features we add
> automatically, right?
it depends; in the case of CPUs, new features are usually 'off' by default
for existing models. In the case of bugs, features could sometimes be
flipped, and versioned machines were used to keep the broken CPU models
on old machine types.

   
> >> +It is also possible to disable individual enlightenments from the default list,
> >> +this can be used for debugging purposes:
> >> +
> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default=on,hv-evmcs=off ...
> >>  
> >>  Sometimes there are dependencies between enlightenments, QEMU is supposed to
> >>  check that the supplied configuration is sane.
> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> >> index 48007a876e32..99338de00f78 100644
> >> --- a/target/i386/cpu.c
> >> +++ b/target/i386/cpu.c
> >> @@ -4552,6 +4552,24 @@ static void x86_cpuid_set_tsc_freq(Object *obj, Visitor *v, const char *name,
> >>      cpu->env.tsc_khz = cpu->env.user_tsc_khz = value / 1000;
> >>  }
> >>  
> >> +static bool x86_hv_default_get(Object *obj, Error **errp)
> >> +{
> >> +    X86CPU *cpu = X86_CPU(obj);
> >> +
> >> +    return cpu->hyperv_default;
> >> +}
> >> +
> >> +static void x86_hv_default_set(Object *obj, bool value, Error **errp)
> >> +{
> >> +    X86CPU *cpu = X86_CPU(obj);
> >> +
> >> +    cpu->hyperv_default = value;
> >> +
> >> +    if (value) {
> >> +        cpu->hyperv_features |= cpu->hyperv_default_features;  
> >
> > s/|="/=/ please,
> > i.e. no option overrides whatever was specified before to keep semantics consistent.
> >  
> 
> Hm,
> 

> this doesn't matter for the most recent machine type as
> hyperv_default_features has all the features but imagine you're running
> an older machine type which doesn't have 'hv_feature'. Now your
normally one shouldn't use a new feature with an old machine type as it makes
the VM non-migratable to an older QEMU that has this machine type but not this feature.

nitpicking:
  according to (1) user should not use 'hv_feature' on old machine since
  hv_default should cover all their needs (well they don't know what hv_default actually is).

> suggestion is 
> 
> if I do:
> 
> 'hv_default,hv_feature=on' I will get "hyperv_default_features | hv_feature"
> 
> but if I do
> 
> 'hv_feature=on,hv_default' I will just get 'hyperv_default_features'
> (as hv_default enablement will overwrite everything)
> 
> How is this consistent?
The usual semantics for properties is that they are parsed from left to right
and the latest property overwrites the previous property value
(i.e. if one asks for hv_default, one gets its related CPUID bits set/unset;
if one needs more than that, one should add more related features after it).


> >> +    }
> >> +}
> >> +
> >>  /* Generic getter for "feature-words" and "filtered-features" properties */
> >>  static void x86_cpu_get_feature_words(Object *obj, Visitor *v,
> >>                                        const char *name, void *opaque,
> >> @@ -6955,10 +6973,26 @@ static void x86_cpu_initfn(Object *obj)
> >>      object_property_add_alias(obj, "pause_filter", obj, "pause-filter");
> >>      object_property_add_alias(obj, "sse4_1", obj, "sse4.1");
> >>      object_property_add_alias(obj, "sse4_2", obj, "sse4.2");
> >> +    object_property_add_alias(obj, "hv_default", obj, "hv-default");
> >>  
> >>      if (xcc->model) {
> >>          x86_cpu_load_model(cpu, xcc->model);
> >>      }
> >> +
> >> +    /* Hyper-V features enabled with 'hv-default=on' */
> >> +    cpu->hyperv_default_features = BIT(HYPERV_FEAT_RELAXED) |
> >> +        BIT(HYPERV_FEAT_VAPIC) | BIT(HYPERV_FEAT_TIME) |
> >> +        BIT(HYPERV_FEAT_CRASH) | BIT(HYPERV_FEAT_RESET) |
> >> +        BIT(HYPERV_FEAT_VPINDEX) | BIT(HYPERV_FEAT_RUNTIME) |
> >> +        BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_STIMER) |
> >> +        BIT(HYPERV_FEAT_FREQUENCIES) | BIT(HYPERV_FEAT_REENLIGHTENMENT) |
> >> +        BIT(HYPERV_FEAT_TLBFLUSH) | BIT(HYPERV_FEAT_IPI) |
> >> +        BIT(HYPERV_FEAT_STIMER_DIRECT);
> >> +
> >> +    /* Enlightened VMCS is only available on Intel/VMX */
> >> +    if (kvm_hv_evmcs_available()) {
> >> +        cpu->hyperv_default_features |= BIT(HYPERV_FEAT_EVMCS);
> >> +    }  
> > what if VM is migrated to another host without evmcs,
> > will it change CPUID?
> >  
> 
> Evmcs is tightly coupled with VMX, we can't migrate when it's not
> there.

Are you saying mgmt will check and refuse to migrate to such host?

> 
> >>  }
> >>  
> >>  static int64_t x86_cpu_get_arch_id(CPUState *cs)
> >> @@ -7285,6 +7319,10 @@ static void x86_cpu_common_class_init(ObjectClass *oc, void *data)
> >>                                x86_cpu_get_crash_info_qom, NULL, NULL, NULL);
> >>  #endif
> >>  
> >> +    object_class_property_add_bool(oc, "hv-default",
> >> +                              x86_hv_default_get,
> >> +                              x86_hv_default_set);
> >> +
> >>      for (w = 0; w < FEATURE_WORDS; w++) {
> >>          int bitnr;
> >>          for (bitnr = 0; bitnr < 64; bitnr++) {
> >> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> >> index 6220cb2cabb9..8a484becb6b9 100644
> >> --- a/target/i386/cpu.h
> >> +++ b/target/i386/cpu.h
> >> @@ -1657,6 +1657,11 @@ struct X86CPU {
> >>      bool hyperv_synic_kvm_only;
> >>      uint64_t hyperv_features;
> >>      bool hyperv_passthrough;
> >> +
> >> +    /* 'hv-default' enablement */
> >> +    uint64_t hyperv_default_features;
> >> +    bool hyperv_default;
> >> +
> >>      OnOffAuto hyperv_no_nonarch_cs;
> >>      uint32_t hyperv_vendor_id[3];
> >>      uint32_t hyperv_interface_id[4];  
> >  
> 




* Re: [PATCH v3 18/19] i386: provide simple 'hv-default=on' option
  2021-01-20 13:13       ` Igor Mammedov
@ 2021-01-20 14:38         ` Vitaly Kuznetsov
  2021-01-20 19:08           ` Igor Mammedov
  2021-01-20 19:55         ` Eduardo Habkost
  1 sibling, 1 reply; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-20 14:38 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: Paolo Bonzini, Marcelo Tosatti, qemu-devel, Eduardo Habkost

Igor Mammedov <imammedo@redhat.com> writes:

> On Fri, 15 Jan 2021 10:20:23 +0100
> Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
>> Igor Mammedov <imammedo@redhat.com> writes:
>> 
>> > On Thu,  7 Jan 2021 16:14:49 +0100
>> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>> >  
>> >> Enabling Hyper-V emulation for a Windows VM is a tiring experience as it
>> >> requires listing all currently supported enlightenments ("hv-*" CPU
>> >> features) explicitly. We do have 'hv-passthrough' mode enabling
>> >> everything but it can't be used in production as it prevents migration.
>> >> 
>> >> Introduce a simple 'hv-default=on' CPU flag enabling all currently supported
>> >> Hyper-V enlightenments. Later, when new enlightenments get implemented,
>> >> compat_props mechanism will be used to disable them for legacy machine types,
>> >> this will keep 'hv-default=on' configurations migratable.
>> >> 
>> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> >> ---
>> >>  docs/hyperv.txt   | 16 +++++++++++++---
>> >>  target/i386/cpu.c | 38 ++++++++++++++++++++++++++++++++++++++
>> >>  target/i386/cpu.h |  5 +++++
>> >>  3 files changed, 56 insertions(+), 3 deletions(-)
>> >> 
>> >> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
>> >> index 5df00da54fc4..a54c066cab09 100644
>> >> --- a/docs/hyperv.txt
>> >> +++ b/docs/hyperv.txt
>> >> @@ -17,10 +17,20 @@ compatible hypervisor and use Hyper-V specific features.
>> >>  
>> >>  2. Setup
>> >>  =========
>> >> -No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
>> >> -QEMU, individual enlightenments can be enabled through CPU flags, e.g:
>> >> +All currently supported Hyper-V enlightenments can be enabled by specifying
>> >> +'hv-default=on' CPU flag:
>> >>  
>> >> -  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
>> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default ...
>> >> +
>> >> +Alternatively, it is possible to do fine-grained enablement through CPU flags,
>> >> +e.g:
>> >> +
>> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-relaxed,hv-vpindex,hv-time ...  
>> >
>> > I'd put here not '...' but rather recommended list of flags, and update
>> > it every time when new feature added if necessary.
>> >  
>
> 1)
>  
>> This is an example of fine-grained enablement, there is no point to put
>> all the existing flags there (hv-default is the only recommended way
>> now, the rest is 'expert'/'debugging').
> so users are kept in dark what hv-default disables/enables (and it might depend
> on machine version on top that). Doesn't look like a good documentation to me
> (sure everyone can go and read source code for it and try to figure out how
> it's supposed to work)

'hv-default' enables *all* currently supported enlightenments. When
used with an old machine type, it will enable *all* Hyper-V
enlightenments which were supported when the corresponding machine type
was released. I don't think we document all other cases when a machine
type is modified (e.g. where can I read how pc-q35-5.1 is different from
pc-q35-5.0 if I refuse to read the source code?)
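
The machine-type behaviour described above can be sketched roughly as
follows. This is only an illustration under stated assumptions: the machine
versions, the feature indices and FEAT_NEW_IN_6_0 are hypothetical names that
do not exist in QEMU, and the real series is expected to use the compat_props
mechanism rather than an explicit version check.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature indices; illustrative only, not QEMU's HYPERV_FEAT_*. */
enum { FEAT_RELAXED, FEAT_VAPIC, FEAT_TIME, FEAT_NEW_IN_6_0 };

/* Hypothetical machine-type versions, oldest first. */
enum machine_version { MACHINE_5_2, MACHINE_6_0 };

static uint64_t feat_bit(unsigned n)
{
    assert(n < 64);
    return UINT64_C(1) << n;
}

/* Everything 'hv-default' enables on the newest machine type. */
static uint64_t hv_default_all(void)
{
    return feat_bit(FEAT_RELAXED) | feat_bit(FEAT_VAPIC) |
           feat_bit(FEAT_TIME) | feat_bit(FEAT_NEW_IN_6_0);
}

/*
 * Compat step: an older machine type masks out enlightenments that were
 * added after it was released, so its 'hv-default' set never grows.
 */
static uint64_t hv_default_for_machine(enum machine_version v)
{
    uint64_t mask = hv_default_all();

    if (v < MACHINE_6_0) {
        mask &= ~feat_bit(FEAT_NEW_IN_6_0);
    }
    return mask;
}
```

With this shape, hv_default_for_machine(MACHINE_5_2) stays the same no matter
how many enlightenments are added later, which is what keeps such a
configuration migratable.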

>
>>
>> > (not to mention that if we had it to begin with, then new 'hv-default' won't
>> > be necessary, I still see it as functionality duplication but I will not oppose it)
>> >  
>> 
>> Unfortunately, upper layer tools don't read this doc and update
>> themselves to enable new features when they appear.
> rant: (just merge all libvirt into QEMU, and make VM configuration less low-level.
> why stop there, just merge with yet another upper layer, it would save us a lot
> on communication protocols and simplify VM creation even more,
> and no one will have to read docs and write anything new on top.)
> There should be limit somewhere, where QEMU job ends and others pile hw abstraction
> layers on top of it.

We have '-machine q35' and we don't require listing all the devices from
it. We have '-cpu Skylake-Server' and we don't require configuring all
the features manually. Why can't we have similar enablement for Hyper-V
emulation, where we can't even see a real need for anything but an 'enable
everything' option?

There is no 'one libvirt to rule them all' (fortunately or
unfortunately). And sometimes QEMU is the uppermost layer and there's no
'libvirt' on top of it, this is also a perfectly valid use-case.

>
>> Similarly, if when these tools use '-machine q35' they get all the new features we add
>> automatically, right?
> it depends, in case of CPUs, new features usually 'off' by default
> for existing models. In case of bugs, features sometimes could be
> flipped and versioned machines were used to keep broken CPU models
> on old machine types.
>

That's why I was saying that Hyper-V enlightenments hardly resemble
'hardware' CPU features.

>    
>> >> +It is also possible to disable individual enlightenments from the default list,
>> >> +this can be used for debugging purposes:
>> >> +
>> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default=on,hv-evmcs=off ...
>> >>  
>> >>  Sometimes there are dependencies between enlightenments, QEMU is supposed to
>> >>  check that the supplied configuration is sane.
>> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> >> index 48007a876e32..99338de00f78 100644
>> >> --- a/target/i386/cpu.c
>> >> +++ b/target/i386/cpu.c
>> >> @@ -4552,6 +4552,24 @@ static void x86_cpuid_set_tsc_freq(Object *obj, Visitor *v, const char *name,
>> >>      cpu->env.tsc_khz = cpu->env.user_tsc_khz = value / 1000;
>> >>  }
>> >>  
>> >> +static bool x86_hv_default_get(Object *obj, Error **errp)
>> >> +{
>> >> +    X86CPU *cpu = X86_CPU(obj);
>> >> +
>> >> +    return cpu->hyperv_default;
>> >> +}
>> >> +
>> >> +static void x86_hv_default_set(Object *obj, bool value, Error **errp)
>> >> +{
>> >> +    X86CPU *cpu = X86_CPU(obj);
>> >> +
>> >> +    cpu->hyperv_default = value;
>> >> +
>> >> +    if (value) {
>> >> +        cpu->hyperv_features |= cpu->hyperv_default_features;  
>> >
>> > s/|="/=/ please,
>> > i.e. no option overrides whatever was specified before to keep semantics consistent.
>> >  
>> 
>> Hm,
>> 
>
>> this doesn't matter for the most recent machine type as
>> hyperv_default_features has all the features but imagine you're running
>> an older machine type which doesn't have 'hv_feature'. Now your
> normally one shouldn't use new feature with old machine type as it makes
> VM non-migratable to older QEMU that has this machine type but not this feature.
>
> nitpicking:
>   according to (1) user should not use 'hv_feature' on old machine since
>   hv_default should cover all their needs (well they don't know what
> hv_default actually is).

Normally yes but I can imagine sticking to some old machine type for
other-than-hyperv-enlightenments purposes and still wanting to add a
newly introduced enlightenment. Migration is not always a must.

>
>> suggestion is 
>> 
>> if I do:
>> 
>> 'hv_default,hv_feature=on' I will get "hyperv_default_features | hv_feature"
>> 
>> but if I do
>> 
>> 'hv_feature=on,hv_default' I will just get 'hyperv_default_features'
>> (as hv_default enablement will overwrite everything)
>> 
>> How is this consistent?
> usual semantics for properties, is that the latest property overwrites,
> the previous property value parsed from left to right.
> (i.e. if one asked for hv_default, one gets it related CPUID bit set/unset,
> if one needs more than that one should add more related features after that.
>

This semantics probably doesn't apply to 'hv-default' case IMO as my
brain refuses to accept the fact that

'hv_default,hv_feature' != 'hv_feature,hv_default'

which should express the same desire 'the default set PLUS the feature I
want'.

I think I prefer sanity over purity in this case.

>
>> >> +    }
>> >> +}
>> >> +
>> >>  /* Generic getter for "feature-words" and "filtered-features" properties */
>> >>  static void x86_cpu_get_feature_words(Object *obj, Visitor *v,
>> >>                                        const char *name, void *opaque,
>> >> @@ -6955,10 +6973,26 @@ static void x86_cpu_initfn(Object *obj)
>> >>      object_property_add_alias(obj, "pause_filter", obj, "pause-filter");
>> >>      object_property_add_alias(obj, "sse4_1", obj, "sse4.1");
>> >>      object_property_add_alias(obj, "sse4_2", obj, "sse4.2");
>> >> +    object_property_add_alias(obj, "hv_default", obj, "hv-default");
>> >>  
>> >>      if (xcc->model) {
>> >>          x86_cpu_load_model(cpu, xcc->model);
>> >>      }
>> >> +
>> >> +    /* Hyper-V features enabled with 'hv-default=on' */
>> >> +    cpu->hyperv_default_features = BIT(HYPERV_FEAT_RELAXED) |
>> >> +        BIT(HYPERV_FEAT_VAPIC) | BIT(HYPERV_FEAT_TIME) |
>> >> +        BIT(HYPERV_FEAT_CRASH) | BIT(HYPERV_FEAT_RESET) |
>> >> +        BIT(HYPERV_FEAT_VPINDEX) | BIT(HYPERV_FEAT_RUNTIME) |
>> >> +        BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_STIMER) |
>> >> +        BIT(HYPERV_FEAT_FREQUENCIES) | BIT(HYPERV_FEAT_REENLIGHTENMENT) |
>> >> +        BIT(HYPERV_FEAT_TLBFLUSH) | BIT(HYPERV_FEAT_IPI) |
>> >> +        BIT(HYPERV_FEAT_STIMER_DIRECT);
>> >> +
>> >> +    /* Enlightened VMCS is only available on Intel/VMX */
>> >> +    if (kvm_hv_evmcs_available()) {
>> >> +        cpu->hyperv_default_features |= BIT(HYPERV_FEAT_EVMCS);
>> >> +    }  
>> > what if VM is migrated to another host without evmcs,
>> > will it change CPUID?
>> >  
>> 
>> Evmcs is tightly coupled with VMX, we can't migrate when it's not
>> there.
>
> Are you saying mgmt will check and refuse to migrate to such host?
>

Is it possible to migrate a VM from a VMX-enabled host to a VMX-disabled
one if VMX feature was exposed to the VM? Probably not, you will fail to
create a VM on the destination host. Evmcs doesn't change anything in
this regard, there are no hosts where VMX is available but EVMCS is not.

>> 
>> >>  }
>> >>  
>> >>  static int64_t x86_cpu_get_arch_id(CPUState *cs)
>> >> @@ -7285,6 +7319,10 @@ static void x86_cpu_common_class_init(ObjectClass *oc, void *data)
>> >>                                x86_cpu_get_crash_info_qom, NULL, NULL, NULL);
>> >>  #endif
>> >>  
>> >> +    object_class_property_add_bool(oc, "hv-default",
>> >> +                              x86_hv_default_get,
>> >> +                              x86_hv_default_set);
>> >> +
>> >>      for (w = 0; w < FEATURE_WORDS; w++) {
>> >>          int bitnr;
>> >>          for (bitnr = 0; bitnr < 64; bitnr++) {
>> >> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
>> >> index 6220cb2cabb9..8a484becb6b9 100644
>> >> --- a/target/i386/cpu.h
>> >> +++ b/target/i386/cpu.h
>> >> @@ -1657,6 +1657,11 @@ struct X86CPU {
>> >>      bool hyperv_synic_kvm_only;
>> >>      uint64_t hyperv_features;
>> >>      bool hyperv_passthrough;
>> >> +
>> >> +    /* 'hv-default' enablement */
>> >> +    uint64_t hyperv_default_features;
>> >> +    bool hyperv_default;
>> >> +
>> >>      OnOffAuto hyperv_no_nonarch_cs;
>> >>      uint32_t hyperv_vendor_id[3];
>> >>      uint32_t hyperv_interface_id[4];  
>> >  
>> 
>

-- 
Vitaly




* Re: [PATCH v3 18/19] i386: provide simple 'hv-default=on' option
  2021-01-20 14:38         ` Vitaly Kuznetsov
@ 2021-01-20 19:08           ` Igor Mammedov
  2021-01-20 20:49             ` Eduardo Habkost
  2021-01-21  8:45             ` Vitaly Kuznetsov
  0 siblings, 2 replies; 34+ messages in thread
From: Igor Mammedov @ 2021-01-20 19:08 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Marcelo Tosatti, qemu-devel, Eduardo Habkost

On Wed, 20 Jan 2021 15:38:33 +0100
Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> Igor Mammedov <imammedo@redhat.com> writes:
> 
> > On Fri, 15 Jan 2021 10:20:23 +0100
> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >  
> >> Igor Mammedov <imammedo@redhat.com> writes:
> >>   
> >> > On Thu,  7 Jan 2021 16:14:49 +0100
> >> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >> >    
> >> >> Enabling Hyper-V emulation for a Windows VM is a tiring experience as it
> >> >> requires listing all currently supported enlightenments ("hv-*" CPU
> >> >> features) explicitly. We do have 'hv-passthrough' mode enabling
> >> >> everything but it can't be used in production as it prevents migration.
> >> >> 
> >> >> Introduce a simple 'hv-default=on' CPU flag enabling all currently supported
> >> >> Hyper-V enlightenments. Later, when new enlightenments get implemented,
> >> >> compat_props mechanism will be used to disable them for legacy machine types,
> >> >> this will keep 'hv-default=on' configurations migratable.
> >> >> 
> >> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> >> >> ---
> >> >>  docs/hyperv.txt   | 16 +++++++++++++---
> >> >>  target/i386/cpu.c | 38 ++++++++++++++++++++++++++++++++++++++
> >> >>  target/i386/cpu.h |  5 +++++
> >> >>  3 files changed, 56 insertions(+), 3 deletions(-)
> >> >> 
> >> >> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
> >> >> index 5df00da54fc4..a54c066cab09 100644
> >> >> --- a/docs/hyperv.txt
> >> >> +++ b/docs/hyperv.txt
> >> >> @@ -17,10 +17,20 @@ compatible hypervisor and use Hyper-V specific features.
> >> >>  
> >> >>  2. Setup
> >> >>  =========
> >> >> -No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
> >> >> -QEMU, individual enlightenments can be enabled through CPU flags, e.g:
> >> >> +All currently supported Hyper-V enlightenments can be enabled by specifying
> >> >> +'hv-default=on' CPU flag:
> >> >>  
> >> >> -  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
> >> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default ...
> >> >> +
> >> >> +Alternatively, it is possible to do fine-grained enablement through CPU flags,
> >> >> +e.g:
> >> >> +
> >> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-relaxed,hv-vpindex,hv-time ...    
> >> >
> >> > I'd put here not '...' but rather recommended list of flags, and update
> >> > it every time when new feature added if necessary.
> >> >    
> >
> > 1)
> >    
> >> This is an example of fine-grained enablement, there is no point to put
> >> all the existing flags there (hv-default is the only recommended way
> >> now, the rest is 'expert'/'debugging').  
> > so users are kept in dark what hv-default disables/enables (and it might depend
> > on machine version on top that). Doesn't look like a good documentation to me
> > (sure everyone can go and read source code for it and try to figure out how
> > it's supposed to work)  
> 
> 'hv-default' enables *all* currently supported enlightenments. When
> using with an old machine type, it will enable *all* Hyper-V
> enlightenments which were supported when the corresponding machine type
> was released. I don't think we document all other cases when a machine
> type is modified (i.e. where can I read how pc-q35-5.1 is different from
> pc-q35-5.0 if I refuse to read the source code?)
> 
> >  
> >>  
> >> > (not to mention that if we had it to begin with, then new 'hv-default' won't
> >> > be necessary, I still see it as functionality duplication but I will not oppose it)
> >> >    
> >> 
> >> Unfortunately, upper layer tools don't read this doc and update
> >> themselves to enable new features when they appear.  
> > rant: (just merge all libvirt into QEMU, and make VM configuration less low-level.
> > why stop there, just merge with yet another upper layer, it would save us a lot
> > on communication protocols and simplify VM creation even more,
> > and no one will have to read docs and write anything new on top.)
> > There should be limit somewhere, where QEMU job ends and others pile hw abstraction
> > layers on top of it.  
> 
> We have '-machine q35' and we don't require to list all the devices from
> it. We have '-cpu Skylake-Server' and we don't require to configure all
> the features manually. Why can't we have similar enablement for Hyper-V
> emulation where we can't even see a real need for anything but 'enable
> everything' option?
> 
> There is no 'one libvirt to rule them all' (fortunately or
> unfortunately). And sometimes QEMU is the uppermost layer and there's no
> 'libvirt' on top of it, this is also a perfectly valid use-case.
> 
> >  
> >> Similarly, if when these tools use '-machine q35' they get all the new features we add
> >> automatically, right?  
> > it depends, in case of CPUs, new features usually 'off' by default
> > for existing models. In case of bugs, features sometimes could be
> > flipped and versioned machines were used to keep broken CPU models
> > on old machine types.
> >  
> 
> That's why I was saying that Hyper-V enlightenments hardly resemble
> 'hardware' CPU features.
Well, Microsoft chose to implement them as a hardware concept (CPUID leaf),
and I prefer to treat them the same way as any other CPUID bits.

> 
> >      
> >> >> +It is also possible to disable individual enlightenments from the default list,
> >> >> +this can be used for debugging purposes:
> >> >> +
> >> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default=on,hv-evmcs=off ...
> >> >>  
> >> >>  Sometimes there are dependencies between enlightenments, QEMU is supposed to
> >> >>  check that the supplied configuration is sane.
> >> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> >> >> index 48007a876e32..99338de00f78 100644
> >> >> --- a/target/i386/cpu.c
> >> >> +++ b/target/i386/cpu.c
> >> >> @@ -4552,6 +4552,24 @@ static void x86_cpuid_set_tsc_freq(Object *obj, Visitor *v, const char *name,
> >> >>      cpu->env.tsc_khz = cpu->env.user_tsc_khz = value / 1000;
> >> >>  }
> >> >>  
> >> >> +static bool x86_hv_default_get(Object *obj, Error **errp)
> >> >> +{
> >> >> +    X86CPU *cpu = X86_CPU(obj);
> >> >> +
> >> >> +    return cpu->hyperv_default;
> >> >> +}
> >> >> +
> >> >> +static void x86_hv_default_set(Object *obj, bool value, Error **errp)
> >> >> +{
> >> >> +    X86CPU *cpu = X86_CPU(obj);
> >> >> +
> >> >> +    cpu->hyperv_default = value;
> >> >> +
> >> >> +    if (value) {
> >> >> +        cpu->hyperv_features |= cpu->hyperv_default_features;    
> >> >
> >> > s/|="/=/ please,
> >> > i.e. no option overrides whatever was specified before to keep semantics consistent.
> >> >    
> >> 
> >> Hm,
> >>   
> >  
> >> this doesn't matter for the most recent machine type as
> >> hyperv_default_features has all the features but imagine you're running
> >> an older machine type which doesn't have 'hv_feature'. Now your  
> > normally one shouldn't use new feature with old machine type as it makes
> > VM non-migratable to older QEMU that has this machine type but not this feature.
> >
> > nitpicking:
> >   according to (1) user should not use 'hv_feature' on old machine since
> >   hv_default should cover all their needs (well they don't know what
> > hv_default actually is).  
> 
> Normally yes but I can imagine sticking to some old machine type for
> other-than-hyperv-enlightenments purposes and still wanting to add a
> newly introduced enlightenment. Migration is not always a must.
> 
> >  
> >> suggestion is 
> >> 
> >> if I do:
> >> 
> >> 'hv_default,hv_feature=on' I will get "hyperv_default_features | hv_feature"
> >> 
> >> but if I do
> >> 
> >> 'hv_feature=on,hv_default' I will just get 'hyperv_default_features'
> >> (as hv_default enablement will overwrite everything)
> >> 
> >> How is this consistent?  
> > usual semantics for properties, is that the latest property overwrites,
> > the previous property value parsed from left to right.
> > (i.e. if one asked for hv_default, one gets it related CPUID bit set/unset,
> > if one needs more than that one should add more related features after that.
> >  
> 
> This semantics probably doesn't apply to 'hv-default' case IMO as my
> brain refuses to accept the fact that
It's probably difficult because 'hv-default' is an 'alias' property
that covers all individual hv-foo features in one go, and those individual
features are also exposed to the user. Otherwise it is just a property that
sets CPUID features like any other property, and it should be treated as such.

> 'hv_default,hv_feature' != 'hv_feature,hv_default'
>
> which should express the same desire 'the default set PLUS the feature I
> want'.
if hv_default were touching different data, I'd agree.
But in the end hv_default boils down to the same CPUID bits as individual
features:

  hv_default,hv_f2 => (hv_f1=on,hv_f2=off),hv_f2=on
         !=
  hv_f2,hv_default => hv_f2=on,(hv_f1=on,hv_f2=off)

 
> I think I prefer sanity over purity in this case.
What is sanity to one could be insanity to another,
so I pointed out the way properties are expected to work today.

But you are adding a new semantic ('combine') to property/feature parsing
(instead of the current 'set' policy), and users will have to be aware of
this new behavior and add/maintain code for this special case.
(Maybe I worry in vain, and no one will read the docs and know about this
new property anyway.)

That will also push x86 CPU consolidation further away from other targets,
where there isn't any special casing for feature parsing, just simple
left-to-right parsing with the latest property overwriting the previously
set value.
We are trying hard to reduce special cases and unify interfaces for the same
components to simplify QEMU and make it predictable/easier for users.


> >> >> +    }
> >> >> +}
> >> >> +
> >> >>  /* Generic getter for "feature-words" and "filtered-features" properties */
> >> >>  static void x86_cpu_get_feature_words(Object *obj, Visitor *v,
> >> >>                                        const char *name, void *opaque,
> >> >> @@ -6955,10 +6973,26 @@ static void x86_cpu_initfn(Object *obj)
> >> >>      object_property_add_alias(obj, "pause_filter", obj, "pause-filter");
> >> >>      object_property_add_alias(obj, "sse4_1", obj, "sse4.1");
> >> >>      object_property_add_alias(obj, "sse4_2", obj, "sse4.2");
> >> >> +    object_property_add_alias(obj, "hv_default", obj, "hv-default");
> >> >>  
> >> >>      if (xcc->model) {
> >> >>          x86_cpu_load_model(cpu, xcc->model);
> >> >>      }
> >> >> +
> >> >> +    /* Hyper-V features enabled with 'hv-default=on' */
> >> >> +    cpu->hyperv_default_features = BIT(HYPERV_FEAT_RELAXED) |
> >> >> +        BIT(HYPERV_FEAT_VAPIC) | BIT(HYPERV_FEAT_TIME) |
> >> >> +        BIT(HYPERV_FEAT_CRASH) | BIT(HYPERV_FEAT_RESET) |
> >> >> +        BIT(HYPERV_FEAT_VPINDEX) | BIT(HYPERV_FEAT_RUNTIME) |
> >> >> +        BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_STIMER) |
> >> >> +        BIT(HYPERV_FEAT_FREQUENCIES) | BIT(HYPERV_FEAT_REENLIGHTENMENT) |
> >> >> +        BIT(HYPERV_FEAT_TLBFLUSH) | BIT(HYPERV_FEAT_IPI) |
> >> >> +        BIT(HYPERV_FEAT_STIMER_DIRECT);
> >> >> +
> >> >> +    /* Enlightened VMCS is only available on Intel/VMX */
> >> >> +    if (kvm_hv_evmcs_available()) {
> >> >> +        cpu->hyperv_default_features |= BIT(HYPERV_FEAT_EVMCS);
> >> >> +    }    
> >> > what if VM is migrated to another host without evmcs,
> >> > will it change CPUID?
> >> >    
> >> 
> >> Evmcs is tightly coupled with VMX, we can't migrate when it's not
> >> there.  
> >
> > Are you saying mgmt will check and refuse to migrate to such host?
> >  
> 
> Is it possible to migrate a VM from a VMX-enabled host to a VMX-disabled
> one if VMX feature was exposed to the VM? Probably not, you will fail to
> create a VM on the destination host. Evmcs doesn't change anything in
> this regard, there are no hosts where VMX is available but EVMCS is not.

I'm not sure how evmcs should be handled.
Can you point out what in this series makes sure that migration fails, or
makes QEMU unable to start, in case kvm_hv_evmcs_available() returns false?

So far I read the snippet above as a problem:
1. the source host supports evmcs and exposes HYPERV_FEAT_EVMCS in CPUID
2. we migrate to a host without evmcs
2.1 start target QEMU, it happily creates vCPUs without HYPERV_FEAT_EVMCS in CPUID
2.2 if I'm not mistaken, CPUID is not part of the migration stream,
    so nothing could check and fail migration
2.3 guest runs fine till it tries to use the non-existent feature, ...
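
The concern in the steps above can be sketched as follows. This is a
simplified model, not code from the series: struct host stands in for
whatever kvm_hv_evmcs_available() reports on a given machine, the feature
indices are hypothetical, and dst_can_provide() is an imagined check that
would need the source mask to be available on the destination, which is
exactly what step 2.2 doubts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature indices; not QEMU's HYPERV_FEAT_* values. */
enum { FEAT_RELAXED, FEAT_EVMCS };

/* Stand-in for the host capability kvm_hv_evmcs_available() would report. */
struct host {
    bool evmcs_available;
};

static uint64_t feat_bit(unsigned n)
{
    assert(n < 64);
    return UINT64_C(1) << n;
}

/* Default set computed per host, mirroring the conditional in the patch. */
static uint64_t hv_default_features(const struct host *h)
{
    uint64_t mask = feat_bit(FEAT_RELAXED);

    if (h->evmcs_available) {
        mask |= feat_bit(FEAT_EVMCS);
    }
    return mask;
}

/*
 * Imagined destination-side check: every feature the guest saw on the
 * source must also be in the destination's default set.
 */
static bool dst_can_provide(uint64_t src_mask, const struct host *dst)
{
    return (src_mask & ~hv_default_features(dst)) == 0;
}
```

Because the default mask is derived from the local host, the same
'hv-default=on' command line yields different masks on the two hosts in this
model; whether anything in practice rejects such a migration is the open
question raised above.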


> >> >>  }
> >> >>  
> >> >>  static int64_t x86_cpu_get_arch_id(CPUState *cs)
> >> >> @@ -7285,6 +7319,10 @@ static void x86_cpu_common_class_init(ObjectClass *oc, void *data)
> >> >>                                x86_cpu_get_crash_info_qom, NULL, NULL, NULL);
> >> >>  #endif
> >> >>  
> >> >> +    object_class_property_add_bool(oc, "hv-default",
> >> >> +                              x86_hv_default_get,
> >> >> +                              x86_hv_default_set);
> >> >> +
> >> >>      for (w = 0; w < FEATURE_WORDS; w++) {
> >> >>          int bitnr;
> >> >>          for (bitnr = 0; bitnr < 64; bitnr++) {
> >> >> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> >> >> index 6220cb2cabb9..8a484becb6b9 100644
> >> >> --- a/target/i386/cpu.h
> >> >> +++ b/target/i386/cpu.h
> >> >> @@ -1657,6 +1657,11 @@ struct X86CPU {
> >> >>      bool hyperv_synic_kvm_only;
> >> >>      uint64_t hyperv_features;
> >> >>      bool hyperv_passthrough;
> >> >> +
> >> >> +    /* 'hv-default' enablement */
> >> >> +    uint64_t hyperv_default_features;
> >> >> +    bool hyperv_default;
> >> >> +
> >> >>      OnOffAuto hyperv_no_nonarch_cs;
> >> >>      uint32_t hyperv_vendor_id[3];
> >> >>      uint32_t hyperv_interface_id[4];    
> >> >    
> >>   
> >  
> 




* Re: [PATCH v3 18/19] i386: provide simple 'hv-default=on' option
  2021-01-20 13:13       ` Igor Mammedov
  2021-01-20 14:38         ` Vitaly Kuznetsov
@ 2021-01-20 19:55         ` Eduardo Habkost
  1 sibling, 0 replies; 34+ messages in thread
From: Eduardo Habkost @ 2021-01-20 19:55 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Marcelo Tosatti, qemu-devel

On Wed, Jan 20, 2021 at 02:13:12PM +0100, Igor Mammedov wrote:
> On Fri, 15 Jan 2021 10:20:23 +0100
> Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> 
> > Igor Mammedov <imammedo@redhat.com> writes:
> > 
> > > On Thu,  7 Jan 2021 16:14:49 +0100
> > > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
[...]
> > >> -No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
> > >> -QEMU, individual enlightenments can be enabled through CPU flags, e.g:
> > >> +All currently supported Hyper-V enlightenments can be enabled by specifying
> > >> +'hv-default=on' CPU flag:
> > >>  
> > >> -  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
> > >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default ...
> > >> +
> > >> +Alternatively, it is possible to do fine-grained enablement through CPU flags,
> > >> +e.g:
> > >> +
> > >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-relaxed,hv-vpindex,hv-time ...  
> > >
> > > I'd put here not '...' but rather recommended list of flags, and update
> > > it every time when new feature added if necessary.
> > >  
> 
> 1)
>  
> > This is an example of fine-grained enablement, there is no point to put
> > all the existing flags there (hv-default is the only recommended way
> > now, the rest is 'expert'/'debugging').
> so users are kept in dark what hv-default disables/enables (and it might depend
> on machine version on top that). Doesn't look like a good documentation to me
> (sure everyone can go and read source code for it and try to figure out how
> it's supposed to work)

Why is this a problem?  This is not different from CPU feature
flags hidden by CPU model names.  Or virtio feature bits hidden
by virtio devices and machine type compat code.


> > > (not to mention that if we had it to begin with, then new 'hv-default' won't
> > > be necessary, I still see it as functionality duplication but I will not oppose it)
> > >  
> > 
> > Unfortunately, upper layer tools don't read this doc and update
> > themselves to enable new features when they appear.
> rant: (just merge all libvirt into QEMU, and make VM configuration less low-level.
> why stop there, just merge with yet another upper layer, it would save us a lot
> on communication protocols and simplify VM creation even more,
> and no one will have to read docs and write anything new on top.)
> There should be limit somewhere, where QEMU job ends and others pile hw abstraction
> layers on top of it.

If there should be a limit somewhere, I'd say low level hyperv
flags are a very long way from crossing that limit.  It is
completely reasonable to provide a higher level knob for them in
QEMU.  Vitaly is trying to solve a real problem here, because the
existing solution is not working.

We only need to offer more complex and lower level interfaces if
we have to.  I don't think we have real world examples where
libvirt or management software developers/users are asking for
low level hyperv feature flag knobs, do we?

> 
> > Similarly, if when these tools use '-machine q35' they get all the new features we add
> > automatically, right?
> it depends, in case of CPUs, new features usually 'off' by default
> for existing models. In case of bugs, features sometimes could be
> flipped and versioned machines were used to keep broken CPU models
> on old machine types.
> 
>    
[...]

-- 
Eduardo




* Re: [PATCH v3 18/19] i386: provide simple 'hv-default=on' option
  2021-01-20 19:08           ` Igor Mammedov
@ 2021-01-20 20:49             ` Eduardo Habkost
  2021-01-21 13:27               ` Igor Mammedov
  2021-01-21  8:45             ` Vitaly Kuznetsov
  1 sibling, 1 reply; 34+ messages in thread
From: Eduardo Habkost @ 2021-01-20 20:49 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Marcelo Tosatti, qemu-devel

On Wed, Jan 20, 2021 at 08:08:32PM +0100, Igor Mammedov wrote:
> On Wed, 20 Jan 2021 15:38:33 +0100
> Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> 
> > Igor Mammedov <imammedo@redhat.com> writes:
> > 
> > > On Fri, 15 Jan 2021 10:20:23 +0100
> > > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> > >  
> > >> Igor Mammedov <imammedo@redhat.com> writes:
> > >>   
> > >> > On Thu,  7 Jan 2021 16:14:49 +0100
> > >> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> > >> >    
> > >> >> Enabling Hyper-V emulation for a Windows VM is a tiring experience as it
> > >> >> requires listing all currently supported enlightenments ("hv-*" CPU
> > >> >> features) explicitly. We do have 'hv-passthrough' mode enabling
> > >> >> everything but it can't be used in production as it prevents migration.
> > >> >> 
> > >> >> Introduce a simple 'hv-default=on' CPU flag enabling all currently supported
> > >> >> Hyper-V enlightenments. Later, when new enlightenments get implemented,
> > >> >> compat_props mechanism will be used to disable them for legacy machine types,
> > >> >> this will keep 'hv-default=on' configurations migratable.
> > >> >> 
> > >> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > >> >> ---
> > >> >>  docs/hyperv.txt   | 16 +++++++++++++---
> > >> >>  target/i386/cpu.c | 38 ++++++++++++++++++++++++++++++++++++++
> > >> >>  target/i386/cpu.h |  5 +++++
> > >> >>  3 files changed, 56 insertions(+), 3 deletions(-)
> > >> >> 
> > >> >> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
> > >> >> index 5df00da54fc4..a54c066cab09 100644
> > >> >> --- a/docs/hyperv.txt
> > >> >> +++ b/docs/hyperv.txt
> > >> >> @@ -17,10 +17,20 @@ compatible hypervisor and use Hyper-V specific features.
> > >> >>  
> > >> >>  2. Setup
> > >> >>  =========
> > >> >> -No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
> > >> >> -QEMU, individual enlightenments can be enabled through CPU flags, e.g:
> > >> >> +All currently supported Hyper-V enlightenments can be enabled by specifying
> > >> >> +'hv-default=on' CPU flag:
> > >> >>  
> > >> >> -  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
> > >> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default ...
> > >> >> +
> > >> >> +Alternatively, it is possible to do fine-grained enablement through CPU flags,
> > >> >> +e.g:
> > >> >> +
> > >> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-relaxed,hv-vpindex,hv-time ...    
> > >> >
> > >> > I'd put here not '...' but rather recommended list of flags, and update
> > >> > it every time when new feature added if necessary.
> > >> >    
> > >
> > > 1)
> > >    
> > >> This is an example of fine-grained enablement, there is no point to put
> > >> all the existing flags there (hv-default is the only recommended way
> > >> now, the rest is 'expert'/'debugging').  
> > > so users are kept in dark what hv-default disables/enables (and it might depend
> > > on machine version on top that). Doesn't look like a good documentation to me
> > > (sure everyone can go and read source code for it and try to figure out how
> > > it's supposed to work)  
> > 
> > 'hv-default' enables *all* currently supported enlightenments. When
> > using with an old machine type, it will enable *all* Hyper-V
> > enlightenments which were supported when the corresponding machine type
> > was released. I don't think we document all other cases when a machine
> > type is modified (i.e. where can I read how pc-q35-5.1 is different from
> > pc-q35-5.0 if I refuse to read the source code?)
> > 
> > >  
> > >>  
> > >> > (not to mention that if we had it to begin with, then new 'hv-default' won't
> > >> > be necessary, I still see it as functionality duplication but I will not oppose it)
> > >> >    
> > >> 
> > >> Unfortunately, upper layer tools don't read this doc and update
> > >> themselves to enable new features when they appear.  
> > > rant: (just merge all libvirt into QEMU, and make VM configuration less low-level.
> > > why stop there, just merge with yet another upper layer, it would save us a lot
> > > on communication protocols and simplify VM creation even more,
> > > and no one will have to read docs and write anything new on top.)
> > > There should be limit somewhere, where QEMU job ends and others pile hw abstraction
> > > layers on top of it.  
> > 
> > We have '-machine q35' and we don't require to list all the devices from
> > it. We have '-cpu Skylake-Server' and we don't require to configure all
> > the features manually. Why can't we have similar enablement for Hyper-V
> > emulation where we can't even see a real need for anything but 'enable
> > everything' option?
> > 
> > There is no 'one libvirt to rule them all' (fortunately or
> > unfortunately). And sometimes QEMU is the uppermost layer and there's no
> > 'libvirt' on top of it, this is also a perfectly valid use-case.
> > 
> > >  
> > >> Similarly, if when these tools use '-machine q35' they get all the new features we add
> > >> automatically, right?  
> > > it depends, in case of CPUs, new features usually 'off' by default
> > > for existing models. In case of bugs, features sometimes could be
> > > flipped and versioned machines were used to keep broken CPU models
> > > on old machine types.
> > >  
> > 
> > That's why I was saying that Hyper-V enlightenments hardly resemble
> > 'hardware' CPU features.
> Well, Microsoft chose to implement them as hardware concept (CPUID leaf),
> and I prefer to treat them the same way as any other CPUID bits.
> 
> > 
> > >      
> > >> >> +It is also possible to disable individual enlightenments from the default list,
> > >> >> +this can be used for debugging purposes:
> > >> >> +
> > >> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default=on,hv-evmcs=off ...
> > >> >>  
> > >> >>  Sometimes there are dependencies between enlightenments, QEMU is supposed to
> > >> >>  check that the supplied configuration is sane.
> > >> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > >> >> index 48007a876e32..99338de00f78 100644
> > >> >> --- a/target/i386/cpu.c
> > >> >> +++ b/target/i386/cpu.c
> > >> >> @@ -4552,6 +4552,24 @@ static void x86_cpuid_set_tsc_freq(Object *obj, Visitor *v, const char *name,
> > >> >>      cpu->env.tsc_khz = cpu->env.user_tsc_khz = value / 1000;
> > >> >>  }
> > >> >>  
> > >> >> +static bool x86_hv_default_get(Object *obj, Error **errp)
> > >> >> +{
> > >> >> +    X86CPU *cpu = X86_CPU(obj);
> > >> >> +
> > >> >> +    return cpu->hyperv_default;
> > >> >> +}
> > >> >> +
> > >> >> +static void x86_hv_default_set(Object *obj, bool value, Error **errp)
> > >> >> +{
> > >> >> +    X86CPU *cpu = X86_CPU(obj);
> > >> >> +
> > >> >> +    cpu->hyperv_default = value;
> > >> >> +
> > >> >> +    if (value) {
> > >> >> +        cpu->hyperv_features |= cpu->hyperv_default_features;    
> > >> >
> > >> > s/|="/=/ please,
> > >> > i.e. no option overrides whatever was specified before to keep semantics consistent.
> > >> >    
> > >> 
> > >> Hm,
> > >>   
> > >  
> > >> this doesn't matter for the most recent machine type as
> > >> hyperv_default_features has all the features but imagine you're running
> > >> an older machine type which doesn't have 'hv_feature'. Now your  
> > > normally one shouldn't use new feature with old machine type as it makes
> > > VM non-migratable to older QEMU that has this machine type but not this feature.
> > >
> > > nitpicking:
> > >   according to (1) user should not use 'hv_feature' on old machine since
> > >   hv_default should cover all their needs (well they don't know what
> > > hv_default actually is).  
> > 
> > Normally yes but I can imagine sticking to some old machine type for
> > other-than-hyperv-enlightenments purposes and still wanting to add a
> > newly introduced enlightenment. Migration is not always a must.
> > 
> > >  
> > >> suggestion is 
> > >> 
> > >> if I do:
> > >> 
> > >> 'hv_default,hv_feature=on' I will get "hyperv_default_features | hv_feature"
> > >> 
> > >> but if I do
> > >> 
> > >> 'hv_feature=on,hv_default' I will just get 'hyperv_default_features'
> > >> (as hv_default enablement will overwrite everything)
> > >> 
> > >> How is this consistent?  
> > > usual semantics for properties, is that the latest property overwrites,
> > > the previous property value parsed from left to right.
> > > (i.e. if one asked for hv_default, one gets it related CPUID bit set/unset,
> > > if one needs more than that one should add more related features after that.
> > >  
> > 
> > This semantics probably doesn't apply to 'hv-default' case IMO as my
> > brain refuses to accept the fact that
> it's difficult probably because 'hv-default' is 'alias' property 
> that covers all individual hv-foo features in one go and that individual
> features are exposed to user, but otherwise it is just a property that
> sets CPUID features or like any other property, and should be treated like such.
> 
> > 'hv_default,hv_feature' != 'hv_feature,hv_default'
> >
> > which should express the same desire 'the default set PLUS the feature I
> > want'.
> if hv_default were touching different data, I'd agree.
> But in the end hv_default boils down to the same CPUID bits as individual
> features:
> 
>   hv_default,hv_f2 => (hv_f1=on,hv_f2=off),hv_f2=on
>          !=
>   hv_f2,hv_default => hv_f2=on,(hv_f1=on,hv_f2=off)

I don't know why you chose to define "hv_default" as
hv_f1=on,hv_f2=off.  If hv_f2 is not enabled by hv_default, it
doesn't need to be touched by hv_default at all.


> 
>  
> > I think I prefer sanity over purity in this case.
> what is sanity to one could be insanity for another,
> so I pointed out the way properties expected to work today.
> 
> But you are adding new semantic ('combine') to property/features parsing
> (instead of current 'set' policy), and users will have to be aware of
> this new behavior and add/maintain code for this special case.
> (maybe I worry in vain, and no one will read docs and know about this
> new property anyways)
> 
> That will also push x86 CPUs consolidation farther away from other targets,
> where there aren't any special casing for features parsing, just simple
> left to right parsing with the latest property having overwriting previously
> set value.
> We are trying hard to reduce special cases and unify interfaces for same
> components to simplify qemu and make it predictable/easier for users.
> 

What you are proposing diverges from other targets, actually.
See target/s390x/cpu_models.c:set_feature_group() for example.
Enabling a feature group in s390x only enables a set of feature
bits, and doesn't touch the rest.

In other words, if hv_default includes hv_f1+hv_f2 (and not hv_f3
or hv_f4), this means:

   hv_default,hv_f3=on,hv_f4=off => (hv_f1=on,hv_f2=on),hv_f3=on,hv_f4=off
          ==
   hv_f3=on,hv_f4=off,hv_default => hv_f3=on,hv_f4=off,(hv_f1=on,hv_f2=on)

That would also mean:

   hv_default,hv_f1=on,hv_f2=off => (hv_f1=on,hv_f2=on),hv_f1=on,hv_f2=off
          !=
   hv_f1=on,hv_f2=off,hv_default => hv_f1=on,hv_f2=off,(hv_f1=on,hv_f2=on)

That's the behavior implemented by Vitaly.
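
To make the two cases concrete, here is a minimal C sketch of the
'combine' semantics. It is an illustration only, not the code from this
series: HV_F1..HV_F4, HV_DEFAULT_GROUP, set_feature() and enable_group()
are hypothetical names chosen to mirror the hv_f1..hv_f4 placeholders
above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits mirroring the hv_f1..hv_f4 placeholders. */
enum { HV_F1 = 1u << 0, HV_F2 = 1u << 1, HV_F3 = 1u << 2, HV_F4 = 1u << 3 };

/* Assumed contents of the group: hv_default = hv_f1 + hv_f2. */
#define HV_DEFAULT_GROUP (HV_F1 | HV_F2)

/* Individual property (hv_fN=on/off): sets or clears exactly one bit. */
static uint64_t set_feature(uint64_t features, uint64_t bit, bool on)
{
    return on ? (features | bit) : (features & ~bit);
}

/* Group property with 'combine' semantics: ORs in the group's bits only. */
static uint64_t enable_group(uint64_t features)
{
    return features | HV_DEFAULT_GROUP;
}
```

With these helpers, applying enable_group() before or after
hv_f3=on,hv_f4=off gives the same mask, while hv_f2=off survives only if
it is applied after enable_group().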

> [...]

-- 
Eduardo




* Re: [PATCH v3 18/19] i386: provide simple 'hv-default=on' option
  2021-01-20 19:08           ` Igor Mammedov
  2021-01-20 20:49             ` Eduardo Habkost
@ 2021-01-21  8:45             ` Vitaly Kuznetsov
  2021-01-21 13:49               ` Igor Mammedov
  1 sibling, 1 reply; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-21  8:45 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: Paolo Bonzini, Marcelo Tosatti, qemu-devel, Eduardo Habkost

Igor Mammedov <imammedo@redhat.com> writes:

> On Wed, 20 Jan 2021 15:38:33 +0100
> Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
>> Igor Mammedov <imammedo@redhat.com> writes:
>> 
>> > On Fri, 15 Jan 2021 10:20:23 +0100
>> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>> >  
>> >> suggestion is 
>> >> 
>> >> if I do:
>> >> 
>> >> 'hv_default,hv_feature=on' I will get "hyperv_default_features | hv_feature"
>> >> 
>> >> but if I do
>> >> 
>> >> 'hv_feature=on,hv_default' I will just get 'hyperv_default_features'
>> >> (as hv_default enablement will overwrite everything)
>> >> 
>> >> How is this consistent?  
>> > usual semantics for properties, is that the latest property overwrites,
>> > the previous property value parsed from left to right.
>> > (i.e. if one asked for hv_default, one gets it related CPUID bit set/unset,
>> > if one needs more than that one should add more related features after that.
>> >  
>> 
>> This semantics probably doesn't apply to 'hv-default' case IMO as my
>> brain refuses to accept the fact that
> it's difficult probably because 'hv-default' is 'alias' property 
> that covers all individual hv-foo features in one go and that individual
> features are exposed to user, but otherwise it is just a property that
> sets CPUID features or like any other property, and should be treated
> like such.
>
>> 'hv_default,hv_feature' != 'hv_feature,hv_default'
>>
>> which should express the same desire 'the default set PLUS the feature I
>> want'.
> if hv_default were touching different data, I'd agree.
> But in the end hv_default boils down to the same CPUID bits as individual
> features:
>
>   hv_default,hv_f2 => (hv_f1=on,hv_f2=off),hv_f2=on
>          !=
>   hv_f2,hv_default => hv_f2=on,(hv_f1=on,hv_f2=off)
>

In your example I treat 'hv_default' as just 'hv_f1=on'; it says nothing
about 'hv_f2', which is neither enabled nor disabled, because it simply
wasn't there when the corresponding machine type was released.

>  
>> I think I prefer sanity over purity in this case.
> what is sanity to one could be insanity for another,
> so I pointed out the way properties expected to work today.
>
> But you are adding new semantic ('combine') to property/features parsing
> (instead of current 'set' policy), and users will have to be aware of
> this new behavior and add/maintain code for this special case.
> (maybe I worry in vain, and no one will read docs and know about this
> new property anyways)
>
> That will also push x86 CPUs consolidation farther away from other targets,
> where there aren't any special casing for features parsing, just simple
> left to right parsing with the latest property having overwriting previously
> set value.

If this is important, I suggest we go back to adding a 'hyperv=on'
machine type option and drop the 'aliasing' with 'hv_default'. I think
it would be possible to support

'-M q35,hyperv=on -cpu host,hv-stimer-direct=off'

even if we need to add a custom handler for Hyper-V feature setting
instead of just using bits in u64 as we need to remember both what was
enabled and what was disabled to combine this with machine type property
correctly.
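
A minimal sketch of what such a handler could track, assuming two
separate masks instead of a single u64. Nothing here exists in QEMU:
struct hv_request, hv_request_set() and hv_resolve() are hypothetical
names used only to illustrate the idea.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state: bits the user explicitly turned on or off. */
struct hv_request {
    uint64_t forced_on;
    uint64_t forced_off;
};

/* Record an explicit hv-foo=on/off; the last setting for a bit wins. */
static void hv_request_set(struct hv_request *req, uint64_t bit, bool on)
{
    if (on) {
        req->forced_on |= bit;
        req->forced_off &= ~bit;
    } else {
        req->forced_off |= bit;
        req->forced_on &= ~bit;
    }
}

/* Combine with the machine type's default set after parsing is done. */
static uint64_t hv_resolve(const struct hv_request *req,
                           uint64_t machine_defaults)
{
    return (machine_defaults | req->forced_on) & ~req->forced_off;
}
```

Because the explicit 'off' requests are remembered, the result does not
depend on whether the machine type defaults are applied before or after
the CPU options are parsed.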

> We are trying hard to reduce special cases and unify interfaces for same
> components to simplify qemu and make it predictable/easier for users.
>

That's exactly the reason why we need simpler Hyper-V feature
enablement! :-)

>
>> >> >> +    }
>> >> >> +}
>> >> >> +
>> >> >>  /* Generic getter for "feature-words" and "filtered-features" properties */
>> >> >>  static void x86_cpu_get_feature_words(Object *obj, Visitor *v,
>> >> >>                                        const char *name, void *opaque,
>> >> >> @@ -6955,10 +6973,26 @@ static void x86_cpu_initfn(Object *obj)
>> >> >>      object_property_add_alias(obj, "pause_filter", obj, "pause-filter");
>> >> >>      object_property_add_alias(obj, "sse4_1", obj, "sse4.1");
>> >> >>      object_property_add_alias(obj, "sse4_2", obj, "sse4.2");
>> >> >> +    object_property_add_alias(obj, "hv_default", obj, "hv-default");
>> >> >>  
>> >> >>      if (xcc->model) {
>> >> >>          x86_cpu_load_model(cpu, xcc->model);
>> >> >>      }
>> >> >> +
>> >> >> +    /* Hyper-V features enabled with 'hv-default=on' */
>> >> >> +    cpu->hyperv_default_features = BIT(HYPERV_FEAT_RELAXED) |
>> >> >> +        BIT(HYPERV_FEAT_VAPIC) | BIT(HYPERV_FEAT_TIME) |
>> >> >> +        BIT(HYPERV_FEAT_CRASH) | BIT(HYPERV_FEAT_RESET) |
>> >> >> +        BIT(HYPERV_FEAT_VPINDEX) | BIT(HYPERV_FEAT_RUNTIME) |
>> >> >> +        BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_STIMER) |
>> >> >> +        BIT(HYPERV_FEAT_FREQUENCIES) | BIT(HYPERV_FEAT_REENLIGHTENMENT) |
>> >> >> +        BIT(HYPERV_FEAT_TLBFLUSH) | BIT(HYPERV_FEAT_IPI) |
>> >> >> +        BIT(HYPERV_FEAT_STIMER_DIRECT);
>> >> >> +
>> >> >> +    /* Enlightened VMCS is only available on Intel/VMX */
>> >> >> +    if (kvm_hv_evmcs_available()) {
>> >> >> +        cpu->hyperv_default_features |= BIT(HYPERV_FEAT_EVMCS);
>> >> >> +    }    
>> >> > what if VVM is migrated to another host without evmcs,
>> >> > will it change CPUID?
>> >> >    
>> >> 
>> >> Evmcs is tightly coupled with VMX, we can't migrate when it's not
>> >> there.  
>> >
>> > Are you saying mgmt will check and refuse to migrate to such host?
>> >  
>> 
>> Is it possible to migrate a VM from a VMX-enabled host to a VMX-disabled
>> one if VMX feature was exposed to the VM? Probably not, you will fail to
>> create a VM on the destination host. Evmcs doesn't change anything in
>> this regard, there are no hosts where VMX is available but EVMCS is not.
>
> I'm not sure how evmcs should be handled,
> can you point out what in this series makes sure that migration fails or
> makes qemu not able to start in case kvm_hv_evmcs_available() returns false.
>
> So far I read snippet above as a problem:
> 1:
>   host supports evmcs:
>   and exposes HYPERV_FEAT_EVMCS in CPUID

Host with EVMCS is Intel

> 2: we migrate to host without evmcs

Host without EVMCS is AMD; there are no other options. EVMCS is a pure
software feature available with KVM-intel. And if your KVM is so old
that it doesn't know anything about EVMCS, a bunch of other options from
'hv-default' will fail to start as well.

> 2.1 start target QEMU, it happily creates vCPUs without
> HYPERV_FEAT_EVMCS in CPUID

No, it doesn't: on host1 we had at least the VMX CPU feature enabled (or
a CPU model implying it) to make this all work.

> 2.2 if I'm not mistaken CPUID is not part of migration stream,
>     nothing could check and fail migration
> 2.3 guest runs fine till it tries to use non existing feature, ..

I'm also very sceptical about the possibility of migrating
Windows/Hyper-V VMs from Intel to AMD. Hyper-V doesn't even boot if you
don't have a fresh-enough CPU, so the common denominator for Intel/AMD
would definitely not work.

-- 
Vitaly




* Re: [PATCH v3 18/19] i386: provide simple 'hv-default=on' option
  2021-01-20 20:49             ` Eduardo Habkost
@ 2021-01-21 13:27               ` Igor Mammedov
  2021-01-21 16:23                 ` Igor Mammedov
  2021-01-21 17:08                 ` Eduardo Habkost
  0 siblings, 2 replies; 34+ messages in thread
From: Igor Mammedov @ 2021-01-21 13:27 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Marcelo Tosatti, qemu-devel

On Wed, 20 Jan 2021 15:49:09 -0500
Eduardo Habkost <ehabkost@redhat.com> wrote:

> On Wed, Jan 20, 2021 at 08:08:32PM +0100, Igor Mammedov wrote:
> > On Wed, 20 Jan 2021 15:38:33 +0100
> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >   
> > > Igor Mammedov <imammedo@redhat.com> writes:
> > >   
> > > > On Fri, 15 Jan 2021 10:20:23 +0100
> > > > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> > > >    
> > > >> Igor Mammedov <imammedo@redhat.com> writes:
> > > >>     
> > > >> > On Thu,  7 Jan 2021 16:14:49 +0100
> > > >> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> > > >> >      
> > > >> >> Enabling Hyper-V emulation for a Windows VM is a tiring experience as it
> > > >> >> requires listing all currently supported enlightenments ("hv-*" CPU
> > > >> >> features) explicitly. We do have 'hv-passthrough' mode enabling
> > > >> >> everything but it can't be used in production as it prevents migration.
> > > >> >> 
> > > >> >> Introduce a simple 'hv-default=on' CPU flag enabling all currently supported
> > > >> >> Hyper-V enlightenments. Later, when new enlightenments get implemented,
> > > >> >> compat_props mechanism will be used to disable them for legacy machine types,
> > > >> >> this will keep 'hv-default=on' configurations migratable.
> > > >> >> 
> > > >> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > > >> >> ---
> > > >> >>  docs/hyperv.txt   | 16 +++++++++++++---
> > > >> >>  target/i386/cpu.c | 38 ++++++++++++++++++++++++++++++++++++++
> > > >> >>  target/i386/cpu.h |  5 +++++
> > > >> >>  3 files changed, 56 insertions(+), 3 deletions(-)
> > > >> >> 
> > > >> >> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
> > > >> >> index 5df00da54fc4..a54c066cab09 100644
> > > >> >> --- a/docs/hyperv.txt
> > > >> >> +++ b/docs/hyperv.txt
> > > >> >> @@ -17,10 +17,20 @@ compatible hypervisor and use Hyper-V specific features.
> > > >> >>  
> > > >> >>  2. Setup
> > > >> >>  =========
> > > >> >> -No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
> > > >> >> -QEMU, individual enlightenments can be enabled through CPU flags, e.g:
> > > >> >> +All currently supported Hyper-V enlightenments can be enabled by specifying
> > > >> >> +'hv-default=on' CPU flag:
> > > >> >>  
> > > >> >> -  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
> > > >> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default ...
> > > >> >> +
> > > >> >> +Alternatively, it is possible to do fine-grained enablement through CPU flags,
> > > >> >> +e.g:
> > > >> >> +
> > > >> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-relaxed,hv-vpindex,hv-time ...      
> > > >> >
> > > >> > I'd put here not '...' but rather recommended list of flags, and update
> > > >> > it every time when new feature added if necessary.
> > > >> >      
> > > >
> > > > 1)
> > > >      
> > > >> This is an example of fine-grained enablement, there is no point to put
> > > >> all the existing flags there (hv-default is the only recommended way
> > > >> now, the rest is 'expert'/'debugging').    
> > > > so users are kept in dark what hv-default disables/enables (and it might depend
> > > > on machine version on top that). Doesn't look like a good documentation to me
> > > > (sure everyone can go and read source code for it and try to figure out how
> > > > it's supposed to work)    
> > > 
> > > 'hv-default' enables *all* currently supported enlightenments. When
> > > using with an old machine type, it will enable *all* Hyper-V
> > > enlightenments which were supported when the corresponding machine type
> > > was released. I don't think we document all other cases when a machine
> > > type is modified (i.e. where can I read how pc-q35-5.1 is different from
> > > pc-q35-5.0 if I refuse to read the source code?)
> > >   
> > > >    
> > > >>    
> > > >> > (not to mention that if we had it to begin with, then new 'hv-default' won't
> > > >> > be necessary, I still see it as functionality duplication but I will not oppose it)
> > > >> >      
> > > >> 
> > > >> Unfortunately, upper layer tools don't read this doc and update
> > > >> themselves to enable new features when they appear.    
> > > > rant: (just merge all libvirt into QEMU, and make VM configuration less low-level.
> > > > why stop there, just merge with yet another upper layer, it would save us a lot
> > > > on communication protocols and simplify VM creation even more,
> > > > and no one will have to read docs and write anything new on top.)
> > > > There should be limit somewhere, where QEMU job ends and others pile hw abstraction
> > > > layers on top of it.    
> > > 
> > > We have '-machine q35' and we don't require to list all the devices from
> > > it. We have '-cpu Skylake-Server' and we don't require to configure all
> > > the features manually. Why can't we have similar enablement for Hyper-V
> > > emulation where we can't even see a real need for anything but 'enable
> > > everything' option?
> > > 
> > > There is no 'one libvirt to rule them all' (fortunately or
> > > unfortunately). And sometimes QEMU is the uppermost layer and there's no
> > > 'libvirt' on top of it, this is also a perfectly valid use-case.
> > >   
> > > >    
> > > >> Similarly, if when these tools use '-machine q35' they get all the new features we add
> > > >> automatically, right?    
> > > > it depends, in case of CPUs, new features usually 'off' by default
> > > > for existing models. In case of bugs, features sometimes could be
> > > > flipped and versioned machines were used to keep broken CPU models
> > > > on old machine types.
> > > >    
> > > 
> > > That's why I was saying that Hyper-V enlightenments hardly resemble
> > > 'hardware' CPU features.  
> > Well, Microsoft chose to implement them as hardware concept (CPUID leaf),
> > and I prefer to treat them the same way as any other CPUID bits.
> >   
> > >   
> > > >        
> > > >> >> +It is also possible to disable individual enlightenments from the default list,
> > > >> >> +this can be used for debugging purposes:
> > > >> >> +
> > > >> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default=on,hv-evmcs=off ...
> > > >> >>  
> > > >> >>  Sometimes there are dependencies between enlightenments, QEMU is supposed to
> > > >> >>  check that the supplied configuration is sane.
> > > >> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > > >> >> index 48007a876e32..99338de00f78 100644
> > > >> >> --- a/target/i386/cpu.c
> > > >> >> +++ b/target/i386/cpu.c
> > > >> >> @@ -4552,6 +4552,24 @@ static void x86_cpuid_set_tsc_freq(Object *obj, Visitor *v, const char *name,
> > > >> >>      cpu->env.tsc_khz = cpu->env.user_tsc_khz = value / 1000;
> > > >> >>  }
> > > >> >>  
> > > >> >> +static bool x86_hv_default_get(Object *obj, Error **errp)
> > > >> >> +{
> > > >> >> +    X86CPU *cpu = X86_CPU(obj);
> > > >> >> +
> > > >> >> +    return cpu->hyperv_default;
> > > >> >> +}
> > > >> >> +
> > > >> >> +static void x86_hv_default_set(Object *obj, bool value, Error **errp)
> > > >> >> +{
> > > >> >> +    X86CPU *cpu = X86_CPU(obj);
> > > >> >> +
> > > >> >> +    cpu->hyperv_default = value;
> > > >> >> +
> > > >> >> +    if (value) {
> > > >> >> +        cpu->hyperv_features |= cpu->hyperv_default_features;      
> > > >> >
> > > >> > s/|="/=/ please,
> > > >> > i.e. no option overrides whatever was specified before to keep semantics consistent.
> > > >> >      
> > > >> 
> > > >> Hm,
> > > >>     
> > > >    
> > > >> this doesn't matter for the most recent machine type as
> > > >> hyperv_default_features has all the features but imagine you're running
> > > >> an older machine type which doesn't have 'hv_feature'. Now your    
> > > > normally one shouldn't use new feature with old machine type as it makes
> > > > VM non-migratable to older QEMU that has this machine type but not this feature.
> > > >
> > > > nitpicking:
> > > >   according to (1) user should not use 'hv_feature' on old machine since
> > > >   hv_default should cover all their needs (well they don't know what
> > > > hv_default actually is).    
> > > 
> > > Normally yes but I can imagine sticking to some old machine type for
> > > other-than-hyperv-enlightenments purposes and still wanting to add a
> > > newly introduced enlightenment. Migration is not always a must.
> > >   
> > > >    
> > > >> suggestion is 
> > > >> 
> > > >> if I do:
> > > >> 
> > > >> 'hv_default,hv_feature=on' I will get "hyperv_default_features | hv_feature"
> > > >> 
> > > >> but if I do
> > > >> 
> > > >> 'hv_feature=on,hv_default' I will just get 'hyperv_default_features'
> > > >> (as hv_default enablement will overwrite everything)
> > > >> 
> > > >> How is this consistent?    
> > > > usual semantics for properties, is that the latest property overwrites,
> > > > the previous property value parsed from left to right.
> > > > (i.e. if one asked for hv_default, one gets it related CPUID bit set/unset,
> > > > if one needs more than that one should add more related features after that.
> > > >    
> > > 
> > > This semantics probably doesn't apply to 'hv-default' case IMO as my
> > > brain refuses to accept the fact that  
> > it's difficult probably because 'hv-default' is 'alias' property 
> > that covers all individual hv-foo features in one go and that individual
> > features are exposed to user, but otherwise it is just a property that
> > sets CPUID features or like any other property, and should be treated like such.
> >   
> > > 'hv_default,hv_feature' != 'hv_feature,hv_default'
> > >
> > > which should express the same desire 'the default set PLUS the feature I
> > > want'.  
> > if hv_default were touching different data, I'd agree.
> > But in the end hv_default boils down to the same CPUID bits as individual
> > features:
> > 
> >   hv_default,hv_f2 => (hv_f1=on,hv_f2=off),hv_f2=on
> >          !=
> >   hv_f2,hv_default => hv_f2=on,(hv_f1=on,hv_f2=off)  
> 
> I don't know why you chose to define "hv_default" as
> hv_f1=on,hv_f2=off.  If hv_f2 is not enabled by hv_default, it
> doesn't need to be touched by hv_default at all.

Essentially I was thinking about hv_default=on as setting the default
value of the hv CPUID leaf, i.e. as the doc claims, 'all' hv_* features
(including turned-off and unused bits), which always sets the leaf to
its default state.

Now let's consider the following possible situation
using the 'combine' approach (leaf |= some_bits):

QEMU-6.0: initially we have all possible features enabled
                hv_default = (hv_f1=on,hv_f2=on)

hv_f2=on,hv_default=on == hv_f1=on,hv_f2=on

QEMU-6.1: hv_f2, which was causing problems, is now disabled by default

hv_default = (hv_f1=on,hv_f2=off)

however, due to ORing, hv_default doesn't fix the issue for the same CLI
(i.e. it doesn't have the expected effect)

hv_f2=on,hv_default=on => hv_f1=on,hv_f2=on

if one used the usual 'set' semantics (leaf = all_bits),
then the new hv_default value would have the desired effect despite the
botched CLI, just by virtue of the property following the typical
'last set' semantics:

 => hv_f1=on,hv_f2=off

If we assume that we will 'never ever' need to disable feature bits,
then it doesn't matter which approach we use; however, a look at the
pc_compat arrays shows that features are being enabled/disabled
all the time.
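
The scenario above can be sketched in C as follows. This is only an
illustration under the assumptions of the example: the HV_F1/HV_F2 bits,
the DEFAULTS_6_0/DEFAULTS_6_1 sets and both helper functions are
hypothetical and do not correspond to code in this series.

```c
#include <stdint.h>

/* Hypothetical feature bits for hv_f1 and hv_f2. */
enum { HV_F1 = 1u << 0, HV_F2 = 1u << 1 };

/* Assumed default sets for the two releases in the scenario. */
static const uint64_t DEFAULTS_6_0 = HV_F1 | HV_F2;
static const uint64_t DEFAULTS_6_1 = HV_F1;        /* hv_f2 dropped */

/* 'combine': hv_default=on ORs the defaults into what is already set. */
static uint64_t apply_default_combine(uint64_t features, uint64_t defaults)
{
    return features | defaults;
}

/* 'set': hv_default=on replaces the whole leaf with the defaults. */
static uint64_t apply_default_set(uint64_t features, uint64_t defaults)
{
    (void)features;     /* earlier settings are discarded */
    return defaults;
}
```

For the CLI 'hv_f2=on,hv_default=on', the 'combine' variant keeps hv_f2
enabled with both default sets, whereas the 'set' variant with
DEFAULTS_6_1 leaves only hv_f1 enabled.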

PS:
I'd rename hv_default => hv_set_default,
since we would need hv_default[_value] property later on to set compat value
based on machine type version.
    
> > > I think I prefer sanity over purity in this case.  
> > what is sanity to one could be insanity for another,
> > so I pointed out the way properties expected to work today.
> > 
> > But you are adding new semantic ('combine') to property/features parsing
> > (instead of current 'set' policy), and users will have to be aware of
> > this new behavior and add/maintain code for this special case.
> > (maybe I worry in vain, and no one will read docs and know about this
> > new property anyways)
> > 
> > That will also push x86 CPUs consolidation farther away from other targets,
> > where there aren't any special casing for features parsing, just simple
> > left to right parsing with the latest property having overwriting previously
> > set value.
> > We are trying hard to reduce special cases and unify interfaces for same
> > components to simplify qemu and make it predictable/easier for users.
> >   
> 
> What you are proposing diverges from other targets, actually.
> See target/s390x/cpu_models.c:set_feature_group() for example.
> Enabling a feature group in s390x only enables a set of feature
> bits, and doesn't touch the rest.
Looking at the code, it has the same issue as I described above.


> In other words, if hv_default includes hv_f1+hv_f2 (and not hv_f3
> or hv_f4), this means:
> 
>    hv_default,hv_f3=on,hv_f4=off => (hv_f1=on,hv_f2=on),hv_f3=on,hv_f4=off
>           ==
>    hv_f3=on,hv_f4=off,hv_default => hv_f3=on,hv_f4=off,(hv_f1=on,hv_f2=on)
> 
> That would also mean:
> 
>    hv_default,hv_f1=on,hv_f2=off => (hv_f1=on,hv_f2=on),hv_f1=on,hv_f2=off
>           !=
>    hv_f1=on,hv_f2=off,hv_default => hv_f1=on,hv_f2=off,(hv_f1=on,hv_f2=on)
> 
> That's the behavior implemented by Vitaly.
> 
> > [...]  
> 



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 18/19] i386: provide simple 'hv-default=on' option
  2021-01-21  8:45             ` Vitaly Kuznetsov
@ 2021-01-21 13:49               ` Igor Mammedov
  2021-01-21 16:51                 ` Vitaly Kuznetsov
  0 siblings, 1 reply; 34+ messages in thread
From: Igor Mammedov @ 2021-01-21 13:49 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Marcelo Tosatti, qemu-devel,
	Dr. David Alan Gilbert, Eduardo Habkost

On Thu, 21 Jan 2021 09:45:33 +0100
Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> Igor Mammedov <imammedo@redhat.com> writes:
> 
> > On Wed, 20 Jan 2021 15:38:33 +0100
> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >  
> >> Igor Mammedov <imammedo@redhat.com> writes:
> >>   
> >> > On Fri, 15 Jan 2021 10:20:23 +0100
> >> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >> >    
> >> >> suggestion is 
> >> >> 
> >> >> if I do:
> >> >> 
> >> >> 'hv_default,hv_feature=on' I will get "hyperv_default_features | hv_feature"
> >> >> 
> >> >> but if I do
> >> >> 
> >> >> 'hv_feature=on,hv_default' I will just get 'hyperv_default_features'
> >> >> (as hv_default enablement will overwrite everything)
> >> >> 
> >> >> How is this consistent?    
> >> > usual semantics for properties, is that the latest property overwrites,
> >> > the previous property value parsed from left to right.
> >> > (i.e. if one asked for hv_default, one gets it related CPUID bit set/unset,
> >> > if one needs more than that one should add more related features after that.
> >> >    
> >> 
> >> This semantics probably doesn't apply to 'hv-default' case IMO as my
> >> brain refuses to accept the fact that  
> > it's difficult probably because 'hv-default' is 'alias' property 
> > that covers all individual hv-foo features in one go and that individual
> > features are exposed to user, but otherwise it is just a property that
> > sets CPUID features or like any other property, and should be treated
> > like such.
> >  
> >> 'hv_default,hv_feature' != 'hv_feature,hv_default'
> >>
> >> which should express the same desire 'the default set PLUS the feature I
> >> want'.  
> > if hv_default were touching different data, I'd agree.
> > But in the end hv_default boils down to the same CPUID bits as individual
> > features:
> >
> >   hv_default,hv_f2 => (hv_f1=on,hv_f2=off),hv_f2=on
> >          !=
> >   hv_f2,hv_default => hv_f2=on,(hv_f1=on,hv_f2=off)
> >  
> 
> In your case I treat 'hv_default' as 'hv_f1=on' and it says nothing
> about 'hv_f2' - neither it is enabled, nor it is disabled because when
> the corresponding machine type was released it just wasn't there.
> 
> >    
> >> I think I prefer sanity over purity in this case.  
> > what is sanity to one could be insanity for another,
> > so I pointed out the way properties expected to work today.
> >
> > But you are adding new semantic ('combine') to property/features parsing
> > (instead of current 'set' policy), and users will have to be aware of
> > this new behavior and add/maintain code for this special case.
> > (maybe I worry in vain, and no one will read docs and know about this
> > new property anyways)
> >
> > That will also push x86 CPUs consolidation farther away from other targets,
> > where there aren't any special casing for features parsing, just simple
> > left to right parsing with the latest property having overwriting previously
> > set value.  
> 
> In case this is somewhat important I suggest we get back to adding
> 'hyperv=on' machine type option and not do the 'aliasing' with
> 'hv_default'. I think it would be possible to support
> 
> '-M q35,hyperv=on -cpu host,hv-stimer-direct=off'
> 
> even if we need to add a custom handler for Hyper-V feature setting
> instead of just using bits in u64 as we need to remember both what was
> enabled and what was disabled to combine this with machine type property
> correctly.
> 
> > We are trying hard to reduce special cases and unify interfaces for same
> > components to simplify qemu and make it predictable/easier for users.
> >  
> 
> That's exactly the reason why we need simpler Hyper-V feature
> enablement! :-)
> 
> >  
> >> >> >> +    }
> >> >> >> +}
> >> >> >> +
> >> >> >>  /* Generic getter for "feature-words" and "filtered-features" properties */
> >> >> >>  static void x86_cpu_get_feature_words(Object *obj, Visitor *v,
> >> >> >>                                        const char *name, void *opaque,
> >> >> >> @@ -6955,10 +6973,26 @@ static void x86_cpu_initfn(Object *obj)
> >> >> >>      object_property_add_alias(obj, "pause_filter", obj, "pause-filter");
> >> >> >>      object_property_add_alias(obj, "sse4_1", obj, "sse4.1");
> >> >> >>      object_property_add_alias(obj, "sse4_2", obj, "sse4.2");
> >> >> >> +    object_property_add_alias(obj, "hv_default", obj, "hv-default");
> >> >> >>  
> >> >> >>      if (xcc->model) {
> >> >> >>          x86_cpu_load_model(cpu, xcc->model);
> >> >> >>      }
> >> >> >> +
> >> >> >> +    /* Hyper-V features enabled with 'hv-default=on' */
> >> >> >> +    cpu->hyperv_default_features = BIT(HYPERV_FEAT_RELAXED) |
> >> >> >> +        BIT(HYPERV_FEAT_VAPIC) | BIT(HYPERV_FEAT_TIME) |
> >> >> >> +        BIT(HYPERV_FEAT_CRASH) | BIT(HYPERV_FEAT_RESET) |
> >> >> >> +        BIT(HYPERV_FEAT_VPINDEX) | BIT(HYPERV_FEAT_RUNTIME) |
> >> >> >> +        BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_STIMER) |
> >> >> >> +        BIT(HYPERV_FEAT_FREQUENCIES) | BIT(HYPERV_FEAT_REENLIGHTENMENT) |
> >> >> >> +        BIT(HYPERV_FEAT_TLBFLUSH) | BIT(HYPERV_FEAT_IPI) |
> >> >> >> +        BIT(HYPERV_FEAT_STIMER_DIRECT);
> >> >> >> +
> >> >> >> +    /* Enlightened VMCS is only available on Intel/VMX */
> >> >> >> +    if (kvm_hv_evmcs_available()) {
> >> >> >> +        cpu->hyperv_default_features |= BIT(HYPERV_FEAT_EVMCS);
> >> >> >> +    }      
> >> >> > what if the VM is migrated to another host without evmcs,
> >> >> > will it change CPUID?
> >> >> >      
> >> >> 
> >> >> Evmcs is tightly coupled with VMX, we can't migrate when it's not
> >> >> there.    
> >> >
> >> > Are you saying mgmt will check and refuse to migrate to such host?
> >> >    
> >> 
> >> Is it possible to migrate a VM from a VMX-enabled host to a VMX-disabled
> >> one if VMX feature was exposed to the VM? Probably not, you will fail to
> >> create a VM on the destination host. Evmcs doesn't change anything in
> >> this regard, there are no hosts where VMX is available but EVMCS is not.  
> >
> > I'm not sure how evmcs should be handled,
> > can you point out what in this series makes sure that migration fails or
> > makes qemu not able to start in case kvm_hv_evmcs_available() returns false.
> >
> > So far I read snippet above as a problem:
> > 1:
> >   host supports evmcs:
> >   and exposes HYPERV_FEAT_EVMCS in CPUID  
> 
> Host with EVMCS is Intel
> 
> > 2: we migrate to host without evmcs  
> 
> Host without EVMCS is AMD, there are no other options. It is a pure
> software feature available for KVM-intel. And if your KVM is so old that
> it doesn't know anything about EVMCS, a bunch of other options from
> 'hv-default' will not start as well.
> > 2.1 start target QEMU, it happily creates vCPUs without
> > HYPERV_FEAT_EVMCS in CPUID  
> 
> No, it doesn't as on host1 we had at least VMX CPU feature enabled (or a
> CPU model implying it) to make this all work.
> 
> > 2.2 if I'm not mistaken CPUID is not part of migration stream,
> >     nothing could check and fail migration
> > 2.3 guest runs fine till it tries to use non existing feature, ..  
> 
> I'm also very sceptical about possibilities for migration
> Windows/Hyper-V VMs from Intel to AMD. Hyper-V doesn't even boot if you
> don't have fresh-enough CPU so the common denominator for Intel/AMD
> would definitely not work. 

Like you said, the host doesn't have to be AMD; an old enough kernel will
do the job. What exactly will prevent the migration from 'successfully'
completing?

The way it's currently written, the migration stream won't prevent it.

One way that might solve the issue is to add a subsection that's enabled when
kvm_hv_evmcs_available() == true, and to check on the target that the feature
is available, failing the migration otherwise.
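
A rough standalone model of that idea, just to show the intended check. This
is a sketch under assumptions, not QEMU code: struct hv_mig_stream and the
hv_mig_* helpers are hypothetical names, and the real thing would presumably
be a VMStateDescription subsection with a .needed callback on the source and
a check at load time on the target.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the migration stream: only the single optional
 * subsection discussed here is modelled.
 */
struct hv_mig_stream {
    bool has_evmcs_subsection;
};

/* Source side: emit the subsection only if eVMCS was offered to the guest. */
static struct hv_mig_stream hv_mig_save(bool src_evmcs_enabled)
{
    struct hv_mig_stream s = { .has_evmcs_subsection = src_evmcs_enabled };
    return s;
}

/*
 * Target side: returns 0 on success, -1 when the stream carries the eVMCS
 * subsection but the target cannot provide the feature.
 */
static int hv_mig_load(const struct hv_mig_stream *s, bool dst_evmcs_available)
{
    if (s->has_evmcs_subsection && !dst_evmcs_available) {
        return -1; /* refuse instead of silently changing guest CPUID */
    }
    return 0;
}

/* Walks through the three interesting source/target combinations. */
static void hv_mig_demo(void)
{
    struct hv_mig_stream with = hv_mig_save(true);
    struct hv_mig_stream without = hv_mig_save(false);

    assert(hv_mig_load(&with, true) == 0);     /* both sides have eVMCS */
    assert(hv_mig_load(&with, false) == -1);   /* target lacks it: fail */
    assert(hv_mig_load(&without, false) == 0); /* feature never exposed */
}
```

The point is only that the failure becomes explicit at load time instead of
the guest discovering the missing feature later.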

Maybe Eduardo or David can say more about how to deal with it if needed.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 18/19] i386: provide simple 'hv-default=on' option
  2021-01-21 13:27               ` Igor Mammedov
@ 2021-01-21 16:23                 ` Igor Mammedov
  2021-01-21 17:08                 ` Eduardo Habkost
  1 sibling, 0 replies; 34+ messages in thread
From: Igor Mammedov @ 2021-01-21 16:23 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Marcelo Tosatti, qemu-devel

On Thu, 21 Jan 2021 14:27:04 +0100
Igor Mammedov <imammedo@redhat.com> wrote:

> On Wed, 20 Jan 2021 15:49:09 -0500
> Eduardo Habkost <ehabkost@redhat.com> wrote:
> 
> > On Wed, Jan 20, 2021 at 08:08:32PM +0100, Igor Mammedov wrote:  
> > > On Wed, 20 Jan 2021 15:38:33 +0100
> > > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> > >     
> > > > Igor Mammedov <imammedo@redhat.com> writes:
> > > >     
> > > > > On Fri, 15 Jan 2021 10:20:23 +0100
> > > > > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> > > > >      
> > > > >> Igor Mammedov <imammedo@redhat.com> writes:
> > > > >>       
> > > > >> > On Thu,  7 Jan 2021 16:14:49 +0100
> > > > >> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> > > > >> >        
> > > > >> >> Enabling Hyper-V emulation for a Windows VM is a tiring experience as it
> > > > >> >> requires listing all currently supported enlightenments ("hv-*" CPU
> > > > >> >> features) explicitly. We do have 'hv-passthrough' mode enabling
> > > > >> >> everything but it can't be used in production as it prevents migration.
> > > > >> >> 
> > > > >> >> Introduce a simple 'hv-default=on' CPU flag enabling all currently supported
> > > > >> >> Hyper-V enlightenments. Later, when new enlightenments get implemented,
> > > > >> >> compat_props mechanism will be used to disable them for legacy machine types,
> > > > >> >> this will keep 'hv-default=on' configurations migratable.
> > > > >> >> 
> > > > >> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > > > >> >> ---
> > > > >> >>  docs/hyperv.txt   | 16 +++++++++++++---
> > > > >> >>  target/i386/cpu.c | 38 ++++++++++++++++++++++++++++++++++++++
> > > > >> >>  target/i386/cpu.h |  5 +++++
> > > > >> >>  3 files changed, 56 insertions(+), 3 deletions(-)
> > > > >> >> 
> > > > >> >> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
> > > > >> >> index 5df00da54fc4..a54c066cab09 100644
> > > > >> >> --- a/docs/hyperv.txt
> > > > >> >> +++ b/docs/hyperv.txt
> > > > >> >> @@ -17,10 +17,20 @@ compatible hypervisor and use Hyper-V specific features.
> > > > >> >>  
> > > > >> >>  2. Setup
> > > > >> >>  =========
> > > > >> >> -No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
> > > > >> >> -QEMU, individual enlightenments can be enabled through CPU flags, e.g:
> > > > >> >> +All currently supported Hyper-V enlightenments can be enabled by specifying
> > > > >> >> +'hv-default=on' CPU flag:
> > > > >> >>  
> > > > >> >> -  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
> > > > >> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default ...
> > > > >> >> +
> > > > >> >> +Alternatively, it is possible to do fine-grained enablement through CPU flags,
> > > > >> >> +e.g:
> > > > >> >> +
> > > > >> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-relaxed,hv-vpindex,hv-time ...        
> > > > >> >
> > > > >> > I'd put here not '...' but rather recommended list of flags, and update
> > > > >> > it every time when new feature added if necessary.
> > > > >> >        
> > > > >
> > > > > 1)
> > > > >        
> > > > >> This is an example of fine-grained enablement, there is no point to put
> > > > >> all the existing flags there (hv-default is the only recommended way
> > > > >> now, the rest is 'expert'/'debugging').      
> > > > > so users are kept in dark what hv-default disables/enables (and it might depend
> > > > > on machine version on top that). Doesn't look like a good documentation to me
> > > > > (sure everyone can go and read source code for it and try to figure out how
> > > > > it's supposed to work)      
> > > > 
> > > > 'hv-default' enables *all* currently supported enlightenments. When
> > > > using with an old machine type, it will enable *all* Hyper-V
> > > > enlightenments which were supported when the corresponding machine type
> > > > was released. I don't think we document all other cases when a machine
> > > > type is modified (i.e. where can I read how pc-q35-5.1 is different from
> > > > pc-q35-5.0 if I refuse to read the source code?)
> > > >     
> > > > >      
> > > > >>      
> > > > >> > (not to mention that if we had it to begin with, then new 'hv-default' won't
> > > > >> > be necessary, I still see it as functionality duplication but I will not oppose it)
> > > > >> >        
> > > > >> 
> > > > >> Unfortunately, upper layer tools don't read this doc and update
> > > > >> themselves to enable new features when they appear.      
> > > > > rant: (just merge all libvirt into QEMU, and make VM configuration less low-level.
> > > > > why stop there, just merge with yet another upper layer, it would save us a lot
> > > > > on communication protocols and simplify VM creation even more,
> > > > > and no one will have to read docs and write anything new on top.)
> > > > > There should be limit somewhere, where QEMU job ends and others pile hw abstraction
> > > > > layers on top of it.      
> > > > 
> > > > We have '-machine q35' and we don't require to list all the devices from
> > > > it. We have '-cpu Skylake-Server' and we don't require to configure all
> > > > the features manually. Why can't we have similar enablement for Hyper-V
> > > > emulation where we can't even see a real need for anything but 'enable
> > > > everything' option?
> > > > 
> > > > There is no 'one libvirt to rule them all' (fortunately or
> > > > unfortunately). And sometimes QEMU is the uppermost layer and there's no
> > > > 'libvirt' on top of it, this is also a perfectly valid use-case.
> > > >     
> > > > >      
> > > > >> Similarly, if when these tools use '-machine q35' they get all the new features we add
> > > > >> automatically, right?      
> > > > > it depends, in case of CPUs, new features usually 'off' by default
> > > > > for existing models. In case of bugs, features sometimes could be
> > > > > flipped and versioned machines were used to keep broken CPU models
> > > > > on old machine types.
> > > > >      
> > > > 
> > > > That's why I was saying that Hyper-V enlightenments hardly resemble
> > > > 'hardware' CPU features.    
> > > Well, Microsoft chose to implement them as hardware concept (CPUID leaf),
> > > and I prefer to treat them the same way as any other CPUID bits.
> > >     
> > > >     
> > > > >          
> > > > >> >> +It is also possible to disable individual enlightenments from the default list,
> > > > >> >> +this can be used for debugging purposes:
> > > > >> >> +
> > > > >> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default=on,hv-evmcs=off ...
> > > > >> >>  
> > > > >> >>  Sometimes there are dependencies between enlightenments, QEMU is supposed to
> > > > >> >>  check that the supplied configuration is sane.
> > > > >> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > > > >> >> index 48007a876e32..99338de00f78 100644
> > > > >> >> --- a/target/i386/cpu.c
> > > > >> >> +++ b/target/i386/cpu.c
> > > > >> >> @@ -4552,6 +4552,24 @@ static void x86_cpuid_set_tsc_freq(Object *obj, Visitor *v, const char *name,
> > > > >> >>      cpu->env.tsc_khz = cpu->env.user_tsc_khz = value / 1000;
> > > > >> >>  }
> > > > >> >>  
> > > > >> >> +static bool x86_hv_default_get(Object *obj, Error **errp)
> > > > >> >> +{
> > > > >> >> +    X86CPU *cpu = X86_CPU(obj);
> > > > >> >> +
> > > > >> >> +    return cpu->hyperv_default;
> > > > >> >> +}
> > > > >> >> +
> > > > >> >> +static void x86_hv_default_set(Object *obj, bool value, Error **errp)
> > > > >> >> +{
> > > > >> >> +    X86CPU *cpu = X86_CPU(obj);
> > > > >> >> +
> > > > >> >> +    cpu->hyperv_default = value;
> > > > >> >> +
> > > > >> >> +    if (value) {
> > > > >> >> +        cpu->hyperv_features |= cpu->hyperv_default_features;        
> > > > >> >
> > > > > >> > s/|=/=/ please,
> > > > >> > i.e. no option overrides whatever was specified before to keep semantics consistent.
> > > > >> >        
> > > > >> 
> > > > >> Hm,
> > > > >>       
> > > > >      
> > > > >> this doesn't matter for the most recent machine type as
> > > > >> hyperv_default_features has all the features but imagine you're running
> > > > >> an older machine type which doesn't have 'hv_feature'. Now your      
> > > > > normally one shouldn't use new feature with old machine type as it makes
> > > > > VM non-migratable to older QEMU that has this machine type but not this feature.
> > > > >
> > > > > nitpicking:
> > > > >   according to (1) user should not use 'hv_feature' on old machine since
> > > > >   hv_default should cover all their needs (well they don't know what
> > > > > hv_default actually is).      
> > > > 
> > > > Normally yes but I can imagine sticking to some old machine type for
> > > > other-than-hyperv-enlightenments purposes and still wanting to add a
> > > > newly introduced enlightenment. Migration is not always a must.
> > > >     
> > > > >      
> > > > >> suggestion is 
> > > > >> 
> > > > >> if I do:
> > > > >> 
> > > > >> 'hv_default,hv_feature=on' I will get "hyperv_default_features | hv_feature"
> > > > >> 
> > > > >> but if I do
> > > > >> 
> > > > >> 'hv_feature=on,hv_default' I will just get 'hyperv_default_features'
> > > > >> (as hv_default enablement will overwrite everything)
> > > > >> 
> > > > >> How is this consistent?      
> > > > > usual semantics for properties, is that the latest property overwrites,
> > > > > the previous property value parsed from left to right.
> > > > > (i.e. if one asked for hv_default, one gets it related CPUID bit set/unset,
> > > > > if one needs more than that one should add more related features after that.
> > > > >      
> > > > 
> > > > This semantics probably doesn't apply to 'hv-default' case IMO as my
> > > > brain refuses to accept the fact that    
> > > it's difficult probably because 'hv-default' is 'alias' property 
> > > that covers all individual hv-foo features in one go and that individual
> > > features are exposed to user, but otherwise it is just a property that
> > > sets CPUID features or like any other property, and should be treated like such.
> > >     
> > > > 'hv_default,hv_feature' != 'hv_feature,hv_default'
> > > >
> > > > which should express the same desire 'the default set PLUS the feature I
> > > > want'.    
> > > if hv_default were touching different data, I'd agree.
> > > But in the end hv_default boils down to the same CPUID bits as individual
> > > features:
> > > 
> > >   hv_default,hv_f2 => (hv_f1=on,hv_f2=off),hv_f2=on
> > >          !=
> > >   hv_f2,hv_default => hv_f2=on,(hv_f1=on,hv_f2=off)    
> > 
> > I don't know why you chose to define "hv_default" as
> > hv_f1=on,hv_f2=off.  If hv_f2 is not enabled by hv_default, it
> > doesn't need to be touched by hv_default at all.  
> 
> Essentially, I was thinking of hv_default=on as setting the default value
> of the hv CPUID leaf, i.e. like the doc claims, 'all' hv_* features (including
> turned-off and unused bits), so it always sets the leaf to its default state.
> 
> Now let's consider the following possible situation
> using the 'combine' approach (leaf |= some_bits):
> 
> QEMU-6.0: initially we have all possible features enabled
>                 hv_default = (hv_f1=on,hv_f2=on)
> 
> hv_f2=on,hv_default=on == hv_f1=on,hv_f2=on
> 
> QEMU-6.1: hv_f2, which was causing problems, gets disabled
> 
> hv_default = (hv_f1=on,hv_f2=off)
> 
> however, due to ORing, hv_default doesn't fix the issue for the same CLI
> (i.e. it doesn't have the expected effect)
> 
> hv_f2=on,hv_default=on => hv_f1=on,hv_f2=on
> 
> If one used the usual 'set' semantics (leaf = all_bits) instead,
> the new hv_default value would have the desired effect despite the botched CLI,
> just by virtue of the property following typical 'last set' semantics:
> 
> hv_f2=on,hv_default=on => hv_f1=on,hv_f2=off
> 
> If we assume that we will 'never ever' need to disable feature bits,
> then it doesn't matter which approach we use; however, a look at the
> pc_compat arrays shows that features are being enabled/disabled
> all the time.

Also there should be a good reason for adding new semantics and
deviating from typical property behavior.




^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 18/19] i386: provide simple 'hv-default=on' option
  2021-01-21 13:49               ` Igor Mammedov
@ 2021-01-21 16:51                 ` Vitaly Kuznetsov
  0 siblings, 0 replies; 34+ messages in thread
From: Vitaly Kuznetsov @ 2021-01-21 16:51 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Paolo Bonzini, Marcelo Tosatti, qemu-devel,
	Dr. David Alan Gilbert, Eduardo Habkost

Igor Mammedov <imammedo@redhat.com> writes:

> On Thu, 21 Jan 2021 09:45:33 +0100
> Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
>> >
>> > So far I read snippet above as a problem:
>> > 1:
>> >   host supports evmcs:
>> >   and exposes HYPERV_FEAT_EVMCS in CPUID  
>> 
>> Host with EVMCS is Intel
>> 
>> > 2: we migrate to host without evmcs  
>> 
>> Host without EVMCS is AMD, there are no other options. It is a pure
>> software feature available for KVM-intel. And if your KVM is so old that
>> it doesn't know anything about EVMCS, a bunch of other options from
>> 'hv-default' will not start as well.
>> > 2.1 start target QEMU, it happily creates vCPUs without
>> > HYPERV_FEAT_EVMCS in CPUID  
>> 
>> No, it doesn't as on host1 we had at least VMX CPU feature enabled (or a
>> CPU model implying it) to make this all work.
>> 
>> > 2.2 if I'm not mistaken CPUID is not part of migration stream,
>> >     nothing could check and fail migration
>> > 2.3 guest runs fine till it tries to use non existing feature, ..  
>> 
>> I'm also very sceptical about possibilities for migration
>> Windows/Hyper-V VMs from Intel to AMD. Hyper-V doesn't even boot if you
>> don't have fresh-enough CPU so the common denominator for Intel/AMD
>> would definitely not work. 
>
> Like you said host doesn't have to be AMD, just old enough kernel will
> do the job. What exactly will prevent migration 'successfully' completing?
>

First, you can't start a VM with 'hv-default' with an old-enough kernel
because it won't have many other 'hv-' enlightenments
implemented. 'hv-default' will only work for a 'recent enough' kernel
(>= 5.0 when hv-stimer-direct was implemented). 

You can probably try doing '-cpu xxx,hv_default,hv-stimer-direct=off' to
trigger the problem, but then KVM must also support nested state
migration to actually migrate a VM using VMX, and EVMCS support for it
also emerged in 5.0. I believe that trying to call KVM_SET_NESTED_STATE
(which only appeared in 4.19, btw) on something in between will fail.

> The way it's currently written migration stream won't prevent it.
>
> One way that might solve issue is to add subsection that's enabled when
> kvm_hv_evmcs_available() == true, and check on target that the feature
> is available or fail migration.

Yes, we can, but I don't think there's a real issue worth fighting
for. Nested migration was so broken in upstream KVM until recently that
I don't see why an 'old kernel' can be a problem at all. And, again, Intel
to AMD migration is likely out of the question.

>
> Maybe Eduardo or David can add more how to deal with it if needed.
>

-- 
Vitaly



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 18/19] i386: provide simple 'hv-default=on' option
  2021-01-21 13:27               ` Igor Mammedov
  2021-01-21 16:23                 ` Igor Mammedov
@ 2021-01-21 17:08                 ` Eduardo Habkost
  2021-01-25 13:42                   ` David Edmondson
  1 sibling, 1 reply; 34+ messages in thread
From: Eduardo Habkost @ 2021-01-21 17:08 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Marcelo Tosatti, qemu-devel

On Thu, Jan 21, 2021 at 02:27:04PM +0100, Igor Mammedov wrote:
> On Wed, 20 Jan 2021 15:49:09 -0500
> Eduardo Habkost <ehabkost@redhat.com> wrote:
> 
> > On Wed, Jan 20, 2021 at 08:08:32PM +0100, Igor Mammedov wrote:
> > > On Wed, 20 Jan 2021 15:38:33 +0100
> > > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> > >   
> > > > Igor Mammedov <imammedo@redhat.com> writes:
> > > >   
> > > > > On Fri, 15 Jan 2021 10:20:23 +0100
> > > > > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> > > > >    
> > > > >> Igor Mammedov <imammedo@redhat.com> writes:
> > > > >>     
> > > > >> > On Thu,  7 Jan 2021 16:14:49 +0100
> > > > >> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> > > > >> >      
> > > > >> >> Enabling Hyper-V emulation for a Windows VM is a tiring experience as it
> > > > >> >> requires listing all currently supported enlightenments ("hv-*" CPU
> > > > >> >> features) explicitly. We do have 'hv-passthrough' mode enabling
> > > > >> >> everything but it can't be used in production as it prevents migration.
> > > > >> >> 
> > > > >> >> Introduce a simple 'hv-default=on' CPU flag enabling all currently supported
> > > > >> >> Hyper-V enlightenments. Later, when new enlightenments get implemented,
> > > > >> >> compat_props mechanism will be used to disable them for legacy machine types,
> > > > >> >> this will keep 'hv-default=on' configurations migratable.
> > > > >> >> 
> > > > >> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > > > >> >> ---
> > > > >> >>  docs/hyperv.txt   | 16 +++++++++++++---
> > > > >> >>  target/i386/cpu.c | 38 ++++++++++++++++++++++++++++++++++++++
> > > > >> >>  target/i386/cpu.h |  5 +++++
> > > > >> >>  3 files changed, 56 insertions(+), 3 deletions(-)
> > > > >> >> 
> > > > >> >> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
> > > > >> >> index 5df00da54fc4..a54c066cab09 100644
> > > > >> >> --- a/docs/hyperv.txt
> > > > >> >> +++ b/docs/hyperv.txt
> > > > >> >> @@ -17,10 +17,20 @@ compatible hypervisor and use Hyper-V specific features.
> > > > >> >>  
> > > > >> >>  2. Setup
> > > > >> >>  =========
> > > > >> >> -No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
> > > > >> >> -QEMU, individual enlightenments can be enabled through CPU flags, e.g:
> > > > >> >> +All currently supported Hyper-V enlightenments can be enabled by specifying
> > > > >> >> +'hv-default=on' CPU flag:
> > > > >> >>  
> > > > >> >> -  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
> > > > >> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default ...
> > > > >> >> +
> > > > >> >> +Alternatively, it is possible to do fine-grained enablement through CPU flags,
> > > > >> >> +e.g:
> > > > >> >> +
> > > > >> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-relaxed,hv-vpindex,hv-time ...      
> > > > >> >
> > > > >> > I'd put here not '...' but rather recommended list of flags, and update
> > > > >> > it every time when new feature added if necessary.
> > > > >> >      
> > > > >
> > > > > 1)
> > > > >      
> > > > >> This is an example of fine-grained enablement, there is no point to put
> > > > >> all the existing flags there (hv-default is the only recommended way
> > > > >> now, the rest is 'expert'/'debugging').    
> > > > > so users are kept in dark what hv-default disables/enables (and it might depend
> > > > > on machine version on top that). Doesn't look like a good documentation to me
> > > > > (sure everyone can go and read source code for it and try to figure out how
> > > > > it's supposed to work)    
> > > > 
> > > > 'hv-default' enables *all* currently supported enlightenments. When
> > > > using with an old machine type, it will enable *all* Hyper-V
> > > > > enlightenments which were supported when the corresponding machine type
> > > > was released. I don't think we document all other cases when a machine
> > > > type is modified (i.e. where can I read how pc-q35-5.1 is different from
> > > > pc-q35-5.0 if I refuse to read the source code?)
> > > >   
> > > > >    
> > > > >>    
> > > > >> > (not to mention that if we had it to begin with, then new 'hv-default' won't
> > > > >> > be necessary, I still see it as functionality duplication but I will not oppose it)
> > > > >> >      
> > > > >> 
> > > > >> Unfortunately, upper layer tools don't read this doc and update
> > > > >> themselves to enable new features when they appear.    
> > > > > rant: (just merge all libvirt into QEMU, and make VM configuration less low-level.
> > > > > why stop there, just merge with yet another upper layer, it would save us a lot
> > > > > on communication protocols and simplify VM creation even more,
> > > > > and no one will have to read docs and write anything new on top.)
> > > > > There should be limit somewhere, where QEMU job ends and others pile hw abstraction
> > > > > layers on top of it.    
> > > > 
> > > > We have '-machine q35' and we don't require to list all the devices from
> > > > it. We have '-cpu Skylake-Server' and we don't require to configure all
> > > > the features manually. Why can't we have similar enablement for Hyper-V
> > > > emulation where we can't even see a real need for anything but 'enable
> > > > everything' option?
> > > > 
> > > > There is no 'one libvirt to rule them all' (fortunately or
> > > > unfortunately). And sometimes QEMU is the uppermost layer and there's no
> > > > 'libvirt' on top of it, this is also a perfectly valid use-case.
> > > >   
> > > > >    
> > > > >> Similarly, if when these tools use '-machine q35' they get all the new features we add
> > > > >> automatically, right?    
> > > > > it depends, in case of CPUs, new features usually 'off' by default
> > > > > for existing models. In case of bugs, features sometimes could be
> > > > > flipped and versioned machines were used to keep broken CPU models
> > > > > on old machine types.
> > > > >    
> > > > 
> > > > That's why I was saying that Hyper-V enlightenments hardly resemble
> > > > 'hardware' CPU features.  
> > > Well, Microsoft chose to implement them as hardware concept (CPUID leaf),
> > > and I prefer to treat them the same way as any other CPUID bits.
> > >   
> > > >   
> > > > >        
> > > > >> >> +It is also possible to disable individual enlightenments from the default list,
> > > > >> >> +this can be used for debugging purposes:
> > > > >> >> +
> > > > >> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default=on,hv-evmcs=off ...
> > > > >> >>  
> > > > >> >>  Sometimes there are dependencies between enlightenments, QEMU is supposed to
> > > > >> >>  check that the supplied configuration is sane.
> > > > >> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > > > >> >> index 48007a876e32..99338de00f78 100644
> > > > >> >> --- a/target/i386/cpu.c
> > > > >> >> +++ b/target/i386/cpu.c
> > > > >> >> @@ -4552,6 +4552,24 @@ static void x86_cpuid_set_tsc_freq(Object *obj, Visitor *v, const char *name,
> > > > >> >>      cpu->env.tsc_khz = cpu->env.user_tsc_khz = value / 1000;
> > > > >> >>  }
> > > > >> >>  
> > > > >> >> +static bool x86_hv_default_get(Object *obj, Error **errp)
> > > > >> >> +{
> > > > >> >> +    X86CPU *cpu = X86_CPU(obj);
> > > > >> >> +
> > > > >> >> +    return cpu->hyperv_default;
> > > > >> >> +}
> > > > >> >> +
> > > > >> >> +static void x86_hv_default_set(Object *obj, bool value, Error **errp)
> > > > >> >> +{
> > > > >> >> +    X86CPU *cpu = X86_CPU(obj);
> > > > >> >> +
> > > > >> >> +    cpu->hyperv_default = value;
> > > > >> >> +
> > > > >> >> +    if (value) {
> > > > >> >> +        cpu->hyperv_features |= cpu->hyperv_default_features;      
> > > > >> >
> > > > >> > s/|=/=/ please,
> > > > >> > i.e. the option overrides whatever was specified before, to keep semantics consistent.
> > > > >> >      
> > > > >> 
> > > > >> Hm,
> > > > >>     
> > > > >    
> > > > >> this doesn't matter for the most recent machine type as
> > > > >> hyperv_default_features has all the features but imagine you're running
> > > > >> an older machine type which doesn't have 'hv_feature'. Now your    
> > > > > normally one shouldn't use new feature with old machine type as it makes
> > > > > VM non-migratable to older QEMU that has this machine type but not this feature.
> > > > >
> > > > > nitpicking:
> > > > >   according to (1) user should not use 'hv_feature' on old machine since
> > > > >   hv_default should cover all their needs (well they don't know what
> > > > > hv_default actually is).    
> > > > 
> > > > Normally yes but I can imagine sticking to some old machine type for
> > > > other-than-hyperv-enlightenments purposes and still wanting to add a
> > > > newly introduced enlightenment. Migration is not always a must.
> > > >   
> > > > >    
> > > > >> suggestion is 
> > > > >> 
> > > > >> if I do:
> > > > >> 
> > > > >> 'hv_default,hv_feature=on' I will get "hyperv_default_features | hv_feature"
> > > > >> 
> > > > >> but if I do
> > > > >> 
> > > > >> 'hv_feature=on,hv_default' I will just get 'hyperv_default_features'
> > > > >> (as hv_default enablement will overwrite everything)
> > > > >> 
> > > > >> How is this consistent?    
> > > > > the usual semantics for properties is that they are parsed from left to right
> > > > > and the latest property overwrites the previous value
> > > > > (i.e. if one asks for hv_default, one gets its related CPUID bits set/unset;
> > > > > if one needs more than that, one should add more related features after it).
> > > > >    
> > > > 
> > > > This semantics probably doesn't apply to 'hv-default' case IMO as my
> > > > brain refuses to accept the fact that  
> > > it's difficult probably because 'hv-default' is an 'alias' property
> > > that covers all individual hv-foo features in one go and the individual
> > > features are also exposed to the user, but otherwise it is just a property that
> > > sets CPUID features like any other property, and should be treated as such.
> > >   
> > > > 'hv_default,hv_feature' != 'hv_feature,hv_default'
> > > >
> > > > which should express the same desire 'the default set PLUS the feature I
> > > > want'.  
> > > if hv_default were touching different data, I'd agree.
> > > But in the end hv_default boils down to the same CPUID bits as individual
> > > features:
> > > 
> > >   hv_default,hv_f2 => (hv_f1=on,hv_f2=off),hv_f2=on
> > >          !=
> > >   hv_f2,hv_default => hv_f2=on,(hv_f1=on,hv_f2=off)  
> > 
> > I don't know why you chose to define "hv_default" as
> > hv_f1=on,hv_f2=off.  If hv_f2 is not enabled by hv_default, it
> > doesn't need to be touched by hv_default at all.
> 
> Essentially I was thinking about hv_default=on as setting default value
> of hv CPUID leaf i.e. like doc claims, 'all' hv_* features (including
> turned off and unused bits) which always sets leaf to its default state.
> 
> Now let's consider the following possible situation
> using the 'combine' approach (leaf |= some_bits):
> 
> QEMU-6.0: initially we have all possible features enabled
>                 hv_default = (hv_f1=on,hv_f2=on)
> 
> hv_f2=on,hv_default=on == hv_f1=on,hv_f2=on
> 
> QEMU-6.1: hv_f2, which was causing problems, is disabled
> 
> hv_default = (hv_f1=on,hv_f2=off)

Why would we choose to do that?

If we decide f2 shouldn't be part of the default, we'll redefine
hv_default as:

  hv_default = (hv_f1=on)

> 
> however, due to ORing, hv_default doesn't fix the issue for the same CLI
> (i.e. it doesn't have the expected effect)
> 
> hv_f2=on,hv_default=on => hv_f1=on,hv_f2=on
> 
> if one would use the usual 'set' semantics (leaf = all_bits),
> then the new hv_default value will have the desired effect despite the botched CLI,
> just by virtue of the property following the typical 'last set' semantics:
> 
>  => hv_f1=on,hv_f2=off
> 
> If we assume that we will 'never ever' need to disable feature bits
> then it doesn't matter which approach to use, however a look at
> pc_compat arrays shows that features are being enabled/disabled
> all the time.

I'm pretty sure that "hv_default=on will also disable features
that appear in the command line" will not be a requirement.


> 
> PS:
> I'd rename hv_default => hv_set_default,
> since we would need hv_default[_value] property later on to set compat value
> based on machine type version.
>     
> > > > I think I prefer sanity over purity in this case.  
> > > what is sanity to one could be insanity for another,
> > > so I pointed out the way properties are expected to work today.
> > > 
> > > But you are adding new semantic ('combine') to property/features parsing
> > > (instead of current 'set' policy), and users will have to be aware of
> > > this new behavior and add/maintain code for this special case.
> > > (maybe I worry in vain, and no one will read docs and know about this
> > > new property anyways)
> > > 
> > > That will also push x86 CPU consolidation further away from other targets,
> > > where there isn't any special casing for feature parsing, just simple
> > > left-to-right parsing with the latest property overwriting the previously
> > > set value.
> > > We are trying hard to reduce special cases and unify interfaces for same
> > > components to simplify qemu and make it predictable/easier for users.
> > >   
> > 
> > What you are proposing diverges from other targets, actually.
> > See target/s390x/cpu_models.c:set_feature_group() for example.
> > Enabling a feature group in s390x only enables a set of feature
> > bits, and doesn't touch the rest.
> Looking at code, it has the same issue as I described above

I don't see why that's an issue.  This is how feature groups were
designed, and it works.


> 
> 
> > In other words, if hv_default includes hv_f1+hv_f2 (and not hv_f3
> > or hv_f4), this means:
> > 
> >    hv_default,hv_f3=on,hv_f4=off => (hv_f1=on,hv_f2=on),hv_f3=on,hv_f4=off
> >           ==
> >    hv_f3=on,hv_f4=off,hv_default => hv_f3=on,hv_f4=off,(hv_f1=on,hv_f2=on)
> > 
> > That would also mean:
> > 
> >    hv_default,hv_f1=on,hv_f2=off => (hv_f1=on,hv_f2=on),hv_f1=on,hv_f2=off
> >           !=
> >    hv_f1=on,hv_f2=off,hv_default => hv_f1=on,hv_f2=off,(hv_f1=on,hv_f2=on)
> > 
> > That's the behavior implemented by Vitaly.
> > 
> > > [...]  
> > 
> 

-- 
Eduardo




* Re: [PATCH v3 18/19] i386: provide simple 'hv-default=on' option
  2021-01-21 17:08                 ` Eduardo Habkost
@ 2021-01-25 13:42                   ` David Edmondson
  0 siblings, 0 replies; 34+ messages in thread
From: David Edmondson @ 2021-01-25 13:42 UTC (permalink / raw)
  To: Eduardo Habkost, Igor Mammedov
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Marcelo Tosatti, qemu-devel

On Thursday, 2021-01-21 at 12:08:02 -05, Eduardo Habkost wrote:

> On Thu, Jan 21, 2021 at 02:27:04PM +0100, Igor Mammedov wrote:
>> On Wed, 20 Jan 2021 15:49:09 -0500
>> Eduardo Habkost <ehabkost@redhat.com> wrote:
>> 
>> > On Wed, Jan 20, 2021 at 08:08:32PM +0100, Igor Mammedov wrote:
>> > > On Wed, 20 Jan 2021 15:38:33 +0100
>> > > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>> > >   
>> > > > Igor Mammedov <imammedo@redhat.com> writes:
>> > > >   
>> > > > > On Fri, 15 Jan 2021 10:20:23 +0100
>> > > > > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>> > > > >    
>> > > > >> Igor Mammedov <imammedo@redhat.com> writes:
>> > > > >>     
>> > > > >> > On Thu,  7 Jan 2021 16:14:49 +0100
>> > > > >> > Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>> > > > >> >      
>> > > > >> >> Enabling Hyper-V emulation for a Windows VM is a tiring experience as it
>> > > > >> >> requires listing all currently supported enlightenments ("hv-*" CPU
>> > > > >> >> features) explicitly. We do have 'hv-passthrough' mode enabling
>> > > > >> >> everything but it can't be used in production as it prevents migration.
>> > > > >> >> 
>> > > > >> >> Introduce a simple 'hv-default=on' CPU flag enabling all currently supported
>> > > > >> >> Hyper-V enlightenments. Later, when new enlightenments get implemented,
>> > > > >> >> compat_props mechanism will be used to disable them for legacy machine types,
>> > > > >> >> this will keep 'hv-default=on' configurations migratable.
>> > > > >> >> 
>> > > > >> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> > > > >> >> ---
>> > > > >> >>  docs/hyperv.txt   | 16 +++++++++++++---
>> > > > >> >>  target/i386/cpu.c | 38 ++++++++++++++++++++++++++++++++++++++
>> > > > >> >>  target/i386/cpu.h |  5 +++++
>> > > > >> >>  3 files changed, 56 insertions(+), 3 deletions(-)
>> > > > >> >> 
>> > > > >> >> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
>> > > > >> >> index 5df00da54fc4..a54c066cab09 100644
>> > > > >> >> --- a/docs/hyperv.txt
>> > > > >> >> +++ b/docs/hyperv.txt
>> > > > >> >> @@ -17,10 +17,20 @@ compatible hypervisor and use Hyper-V specific features.
>> > > > >> >>  
>> > > > >> >>  2. Setup
>> > > > >> >>  =========
>> > > > >> >> -No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
>> > > > >> >> -QEMU, individual enlightenments can be enabled through CPU flags, e.g:
>> > > > >> >> +All currently supported Hyper-V enlightenments can be enabled by specifying
>> > > > >> >> +'hv-default=on' CPU flag:
>> > > > >> >>  
>> > > > >> >> -  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
>> > > > >> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default ...
>> > > > >> >> +
>> > > > >> >> +Alternatively, it is possible to do fine-grained enablement through CPU flags,
>> > > > >> >> +e.g:
>> > > > >> >> +
>> > > > >> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-relaxed,hv-vpindex,hv-time ...      
>> > > > >> >
>> > > > >> > I'd put here not '...' but rather the recommended list of flags, and update
>> > > > >> > it every time a new feature is added, if necessary.
>> > > > >> >      
>> > > > >
>> > > > > 1)
>> > > > >      
>> > > > >> This is an example of fine-grained enablement, there is no point to put
>> > > > >> all the existing flags there (hv-default is the only recommended way
>> > > > >> now, the rest is 'expert'/'debugging').    
>> > > > > so users are kept in the dark about what hv-default disables/enables (and it might depend
>> > > > > on the machine version on top of that). Doesn't look like good documentation to me
>> > > > > (sure, everyone can go and read the source code for it and try to figure out how
>> > > > > it's supposed to work)
>> > > > 
>> > > > 'hv-default' enables *all* currently supported enlightenments. When
>> > > > used with an old machine type, it will enable *all* Hyper-V
>> > > > enlightenments which were supported when the corresponding machine type
>> > > > was released. I don't think we document all other cases when a machine
>> > > > type is modified (i.e. where can I read how pc-q35-5.1 is different from
>> > > > pc-q35-5.0 if I refuse to read the source code?)
>> > > >   
>> > > > >    
>> > > > >>    
>> > > > >> > (not to mention that if we had it to begin with, then new 'hv-default' won't
>> > > > >> > be necessary, I still see it as functionality duplication but I will not oppose it)
>> > > > >> >      
>> > > > >> 
>> > > > >> Unfortunately, upper layer tools don't read this doc and update
>> > > > >> themselves to enable new features when they appear.    
>> > > > > rant: (just merge all libvirt into QEMU, and make VM configuration less low-level.
>> > > > > why stop there, just merge with yet another upper layer, it would save us a lot
>> > > > > on communication protocols and simplify VM creation even more,
>> > > > > and no one will have to read docs and write anything new on top.)
>> > > > > There should be limit somewhere, where QEMU job ends and others pile hw abstraction
>> > > > > layers on top of it.    
>> > > > 
>> > > > We have '-machine q35' and we don't require to list all the devices from
>> > > > it. We have '-cpu Skylake-Server' and we don't require to configure all
>> > > > the features manually. Why can't we have similar enablement for Hyper-V
>> > > > emulation where we can't even see a real need for anything but 'enable
>> > > > everything' option?
>> > > > 
>> > > > There is no 'one libvirt to rule them all' (fortunately or
>> > > > unfortunately). And sometimes QEMU is the uppermost layer and there's no
>> > > > 'libvirt' on top of it, this is also a perfectly valid use-case.
>> > > >   
>> > > > >    
>> > > > >> Similarly, when these tools use '-machine q35' they get all the new features we add
>> > > > >> automatically, right?
>> > > > > it depends, in case of CPUs, new features are usually 'off' by default
>> > > > > for existing models. In case of bugs, features sometimes could be
>> > > > > flipped and versioned machines were used to keep broken CPU models
>> > > > > on old machine types.
>> > > > >    
>> > > > 
>> > > > That's why I was saying that Hyper-V enlightenments hardly resemble
>> > > > 'hardware' CPU features.  
>> > > Well, Microsoft chose to implement them as hardware concept (CPUID leaf),
>> > > and I prefer to treat them the same way as any other CPUID bits.
>> > >   
>> > > >   
>> > > > >        
>> > > > >> >> +It is also possible to disable individual enlightenments from the default list,
>> > > > >> >> +this can be used for debugging purposes:
>> > > > >> >> +
>> > > > >> >> +  qemu-system-x86_64 --enable-kvm --cpu host,hv-default=on,hv-evmcs=off ...
>> > > > >> >>  
>> > > > >> >>  Sometimes there are dependencies between enlightenments, QEMU is supposed to
>> > > > >> >>  check that the supplied configuration is sane.
>> > > > >> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> > > > >> >> index 48007a876e32..99338de00f78 100644
>> > > > >> >> --- a/target/i386/cpu.c
>> > > > >> >> +++ b/target/i386/cpu.c
>> > > > >> >> @@ -4552,6 +4552,24 @@ static void x86_cpuid_set_tsc_freq(Object *obj, Visitor *v, const char *name,
>> > > > >> >>      cpu->env.tsc_khz = cpu->env.user_tsc_khz = value / 1000;
>> > > > >> >>  }
>> > > > >> >>  
>> > > > >> >> +static bool x86_hv_default_get(Object *obj, Error **errp)
>> > > > >> >> +{
>> > > > >> >> +    X86CPU *cpu = X86_CPU(obj);
>> > > > >> >> +
>> > > > >> >> +    return cpu->hyperv_default;
>> > > > >> >> +}
>> > > > >> >> +
>> > > > >> >> +static void x86_hv_default_set(Object *obj, bool value, Error **errp)
>> > > > >> >> +{
>> > > > >> >> +    X86CPU *cpu = X86_CPU(obj);
>> > > > >> >> +
>> > > > >> >> +    cpu->hyperv_default = value;
>> > > > >> >> +
>> > > > >> >> +    if (value) {
>> > > > >> >> +        cpu->hyperv_features |= cpu->hyperv_default_features;      
>> > > > >> >
>> > > > >> > s/|=/=/ please,
>> > > > >> > i.e. the option overrides whatever was specified before, to keep semantics consistent.
>> > > > >> >      
>> > > > >> 
>> > > > >> Hm,
>> > > > >>     
>> > > > >    
>> > > > >> this doesn't matter for the most recent machine type as
>> > > > >> hyperv_default_features has all the features but imagine you're running
>> > > > >> an older machine type which doesn't have 'hv_feature'. Now your    
>> > > > > normally one shouldn't use new feature with old machine type as it makes
>> > > > > VM non-migratable to older QEMU that has this machine type but not this feature.
>> > > > >
>> > > > > nitpicking:
>> > > > >   according to (1) user should not use 'hv_feature' on old machine since
>> > > > >   hv_default should cover all their needs (well they don't know what
>> > > > > hv_default actually is).    
>> > > > 
>> > > > Normally yes but I can imagine sticking to some old machine type for
>> > > > other-than-hyperv-enlightenments purposes and still wanting to add a
>> > > > newly introduced enlightenment. Migration is not always a must.
>> > > >   
>> > > > >    
>> > > > >> suggestion is 
>> > > > >> 
>> > > > >> if I do:
>> > > > >> 
>> > > > >> 'hv_default,hv_feature=on' I will get "hyperv_default_features | hv_feature"
>> > > > >> 
>> > > > >> but if I do
>> > > > >> 
>> > > > >> 'hv_feature=on,hv_default' I will just get 'hyperv_default_features'
>> > > > >> (as hv_default enablement will overwrite everything)
>> > > > >> 
>> > > > >> How is this consistent?    
>> > > > > the usual semantics for properties is that they are parsed from left to right
>> > > > > and the latest property overwrites the previous value
>> > > > > (i.e. if one asks for hv_default, one gets its related CPUID bits set/unset;
>> > > > > if one needs more than that, one should add more related features after it).
>> > > > >    
>> > > > 
>> > > > This semantics probably doesn't apply to 'hv-default' case IMO as my
>> > > > brain refuses to accept the fact that  
>> > > it's difficult probably because 'hv-default' is an 'alias' property
>> > > that covers all individual hv-foo features in one go and the individual
>> > > features are also exposed to the user, but otherwise it is just a property that
>> > > sets CPUID features like any other property, and should be treated as such.
>> > >   
>> > > > 'hv_default,hv_feature' != 'hv_feature,hv_default'
>> > > >
>> > > > which should express the same desire 'the default set PLUS the feature I
>> > > > want'.  
>> > > if hv_default were touching different data, I'd agree.
>> > > But in the end hv_default boils down to the same CPUID bits as individual
>> > > features:
>> > > 
>> > >   hv_default,hv_f2 => (hv_f1=on,hv_f2=off),hv_f2=on
>> > >          !=
>> > >   hv_f2,hv_default => hv_f2=on,(hv_f1=on,hv_f2=off)  
>> > 
>> > I don't know why you chose to define "hv_default" as
>> > hv_f1=on,hv_f2=off.  If hv_f2 is not enabled by hv_default, it
>> > doesn't need to be touched by hv_default at all.
>> 
>> Essentially I was thinking about hv_default=on as setting default value
>> of hv CPUID leaf i.e. like doc claims, 'all' hv_* features (including
>> turned off and unused bits) which always sets leaf to its default state.
>> 
>> Now let's consider the following possible situation
>> using the 'combine' approach (leaf |= some_bits):
>> 
>> QEMU-6.0: initially we have all possible features enabled
>>                 hv_default = (hv_f1=on,hv_f2=on)
>> 
>> hv_f2=on,hv_default=on == hv_f1=on,hv_f2=on
>> 
>> QEMU-6.1: hv_f2, which was causing problems, is disabled
>> 
>> hv_default = (hv_f1=on,hv_f2=off)
>
> Why would we choose to do that?
>
> If we decide f2 shouldn't be part of the default, we'll redefine
> hv_default as:
>
>   hv_default = (hv_f1=on)
>
>> 
>> however, due to ORing, hv_default doesn't fix the issue for the same CLI
>> (i.e. it doesn't have the expected effect)
>> 
>> hv_f2=on,hv_default=on => hv_f1=on,hv_f2=on
>> 
>> if one would use the usual 'set' semantics (leaf = all_bits),
>> then the new hv_default value will have the desired effect despite the botched CLI,
>> just by virtue of the property following the typical 'last set' semantics:
>> 
>>  => hv_f1=on,hv_f2=off
>> 
>> If we assume that we will 'never ever' need to disable feature bits
>> then it doesn't matter which approach to use, however a look at
>> pc_compat arrays shows that features are being enabled/disabled
>> all the time.
>
> I'm pretty sure that "hv_default=on will also disable features
> that appear in the command line" will not be a requirement.

Is there a definitive conclusion to this?

In reading the thread I realised that the patch I sent adding
"kvm-no-defaults" may fall foul of the "process properties in order"
rule.

That is, with that patch:

kvm_clock=on,kvm_no_defaults=on
  ==
kvm_no_defaults=on,kvm_clock=on

...which would not be the case if the properties are processed strictly
in order (because in the first case kvm_no_defaults would disable
kvm_clock).

>> 
>> PS:
>> I'd rename hv_default => hv_set_default,
>> since we would need hv_default[_value] property later on to set compat value
>> based on machine type version.
>>     
>> > > > I think I prefer sanity over purity in this case.  
>> > > what is sanity to one could be insanity for another,
>> > > so I pointed out the way properties are expected to work today.
>> > > 
>> > > But you are adding new semantic ('combine') to property/features parsing
>> > > (instead of current 'set' policy), and users will have to be aware of
>> > > this new behavior and add/maintain code for this special case.
>> > > (maybe I worry in vain, and no one will read docs and know about this
>> > > new property anyways)
>> > > 
>> > > That will also push x86 CPU consolidation further away from other targets,
>> > > where there isn't any special casing for feature parsing, just simple
>> > > left-to-right parsing with the latest property overwriting the previously
>> > > set value.
>> > > We are trying hard to reduce special cases and unify interfaces for same
>> > > components to simplify qemu and make it predictable/easier for users.
>> > >   
>> > 
>> > What you are proposing diverges from other targets, actually.
>> > See target/s390x/cpu_models.c:set_feature_group() for example.
>> > Enabling a feature group in s390x only enables a set of feature
>> > bits, and doesn't touch the rest.
>> Looking at code, it has the same issue as I described above
>
> I don't see why that's an issue.  This is how feature groups were
> designed, and it works.
>
>
>> 
>> 
>> > In other words, if hv_default includes hv_f1+hv_f2 (and not hv_f3
>> > or hv_f4), this means:
>> > 
>> >    hv_default,hv_f3=on,hv_f4=off => (hv_f1=on,hv_f2=on),hv_f3=on,hv_f4=off
>> >           ==
>> >    hv_f3=on,hv_f4=off,hv_default => hv_f3=on,hv_f4=off,(hv_f1=on,hv_f2=on)
>> > 
>> > That would also mean:
>> > 
>> >    hv_default,hv_f1=on,hv_f2=off => (hv_f1=on,hv_f2=on),hv_f1=on,hv_f2=off
>> >           !=
>> >    hv_f1=on,hv_f2=off,hv_default => hv_f1=on,hv_f2=off,(hv_f1=on,hv_f2=on)
>> > 
>> > That's the behavior implemented by Vitaly.
>> > 
>> > > [...]  
>> > 
>> 
>
> -- 
> Eduardo

dme.
-- 
There's too many people on the bus from the airport.



end of thread, other threads:[~2021-01-25 13:43 UTC | newest]

Thread overview: 34+ messages
2021-01-07 15:06 [PATCH v3 00/19] i386: KVM: expand Hyper-V features early and provide simple 'hv-default=on' option Vitaly Kuznetsov
2021-01-07 15:06 ` [PATCH v3 01/19] linux-headers: update against 5.11-rc2 Vitaly Kuznetsov
2021-01-07 15:06 ` [PATCH v3 02/19] i386: introduce kvm_hv_evmcs_available() Vitaly Kuznetsov
2021-01-07 15:06 ` [PATCH v3 03/19] i386: keep hyperv_vendor string up-to-date Vitaly Kuznetsov
2021-01-07 15:06 ` [PATCH v3 04/19] i386: invert hyperv_spinlock_attempts setting logic with hv_passthrough Vitaly Kuznetsov
2021-01-07 15:06 ` [PATCH v3 05/19] i386: always fill Hyper-V CPUID feature leaves from X86CPU data Vitaly Kuznetsov
2021-01-07 15:06 ` [PATCH v3 06/19] i386: stop using env->features[] for filling Hyper-V CPUIDs Vitaly Kuznetsov
2021-01-07 15:06 ` [PATCH v3 07/19] i386: introduce hyperv_feature_supported() Vitaly Kuznetsov
2021-01-07 15:06 ` [PATCH v3 08/19] i386: introduce hv_cpuid_get_host() Vitaly Kuznetsov
2021-01-07 15:14 ` [PATCH v3 09/19] i386: drop FEAT_HYPERV feature leaves Vitaly Kuznetsov
2021-01-07 15:14 ` [PATCH v3 10/19] i386: introduce hv_cpuid_cache Vitaly Kuznetsov
2021-01-07 15:14 ` [PATCH v3 11/19] i386: split hyperv_handle_properties() into hyperv_expand_features()/hyperv_fill_cpuids() Vitaly Kuznetsov
2021-01-07 15:14 ` [PATCH v3 12/19] i386: move eVMCS enablement to hyperv_init_vcpu() Vitaly Kuznetsov
2021-01-07 15:14 ` [PATCH v3 13/19] i386: switch hyperv_expand_features() to using error_setg() Vitaly Kuznetsov
2021-01-07 15:14 ` [PATCH v3 14/19] i386: adjust the expected KVM_GET_SUPPORTED_HV_CPUID array size Vitaly Kuznetsov
2021-01-07 15:14 ` [PATCH v3 15/19] i386: prefer system KVM_GET_SUPPORTED_HV_CPUID ioctl over vCPU's one Vitaly Kuznetsov
2021-01-07 15:14 ` [PATCH v3 16/19] i386: use global kvm_state in hyperv_enabled() check Vitaly Kuznetsov
2021-01-07 15:14 ` [PATCH v3 17/19] i386: expand Hyper-V features during CPU feature expansion time Vitaly Kuznetsov
2021-01-07 15:14 ` [PATCH v3 18/19] i386: provide simple 'hv-default=on' option Vitaly Kuznetsov
2021-01-15  2:11   ` Igor Mammedov
2021-01-15  9:20     ` Vitaly Kuznetsov
2021-01-20 13:13       ` Igor Mammedov
2021-01-20 14:38         ` Vitaly Kuznetsov
2021-01-20 19:08           ` Igor Mammedov
2021-01-20 20:49             ` Eduardo Habkost
2021-01-21 13:27               ` Igor Mammedov
2021-01-21 16:23                 ` Igor Mammedov
2021-01-21 17:08                 ` Eduardo Habkost
2021-01-25 13:42                   ` David Edmondson
2021-01-21  8:45             ` Vitaly Kuznetsov
2021-01-21 13:49               ` Igor Mammedov
2021-01-21 16:51                 ` Vitaly Kuznetsov
2021-01-20 19:55         ` Eduardo Habkost
2021-01-07 15:14 ` [PATCH v3 19/19] qtest/hyperv: Introduce a simple hyper-v test Vitaly Kuznetsov
