* [PATCH v1 0/5] KVM/x86/nVMX: Add field existence support in VMCS12
@ 2021-08-17  9:31 Robert Hoo
  2021-08-17  9:31 ` [PATCH v1 1/5] KVM: x86: nVMX: Add vmcs12 field existence bitmap in nested_vmx Robert Hoo
                   ` (4 more replies)
  0 siblings, 5 replies; 37+ messages in thread
From: Robert Hoo @ 2021-08-17  9:31 UTC (permalink / raw)
  To: seanjc, pbonzini, vkuznets, wanpengli, jmattson, joro
  Cc: kvm, yu.c.zhang, Robert Hoo

The SDM[1] states that the existence of many VMCS fields depends on the
status of other VMX features. In the nested case, VMCS12 must respect
this: L0's VMCS configuration for L1 limits L1's "physical" VMX
capability, so some vmcs12 fields must appear to not exist when L1 does
VMREAD/VMWRITE. For example, the VMX-preemption timer value field
exists only when the "activate VMX-preemption timer" pin-based control
is supported.

This patch set:
1) Adds a bitmap in nested_vmx to reflect vmcs12 field existence
2) Implements update functions for the dependencies stated in the SDM,
   and updates the bitmap dynamically
3) Makes VMCS12 reads/writes respect the bitmap
4) Makes the nested MSR_IA32_VMX_VMCS_ENUM read-only and makes it
   respect the bitmap

[1] Notes in SDM Vol.3, Appendix B, FIELD ENCODING IN VMCS
---
This patch set is a follow-up to Sean's suggestion in commit ba1f82456b
("Dynamically compute max VMCS index for vmcs12").

Robert Hoo (5):
  KVM: x86: nVMX: Add vmcs12 field existence bitmap in nested_vmx
  KVM: x86: nVMX: Update VMCS12 fields existence when nVMX MSRs are set
  KVM: x86: nVMX: VMCS12 field's read/write respects field existence
    bitmap
  KVM: x86: nVMX: Respect vmcs12 field existence when calc
    vmx_vmcs_enum_msr
  KVM: x86: nVMX: Ignore user space set value to MSR_IA32_VMX_VMCS_ENUM

 arch/x86/kvm/vmx/nested.c |  68 +++++--
 arch/x86/kvm/vmx/nested.h |   1 +
 arch/x86/kvm/vmx/vmcs12.c | 363 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/vmcs12.h |  69 ++++++--
 arch/x86/kvm/vmx/vmx.c    |  17 +-
 arch/x86/kvm/vmx/vmx.h    |   3 +
 6 files changed, 499 insertions(+), 22 deletions(-)


base-commit: 32bdc01988413031c6e743714c2b40bdd773e5db
-- 
2.27.0



* [PATCH v1 1/5] KVM: x86: nVMX: Add vmcs12 field existence bitmap in nested_vmx
  2021-08-17  9:31 [PATCH v1 0/5] KVM/x86/nVMX: Add field existence support in VMCS12 Robert Hoo
@ 2021-08-17  9:31 ` Robert Hoo
  2021-10-20 15:10   ` Paolo Bonzini
  2021-08-17  9:31 ` [PATCH v1 2/5] KVM: x86: nVMX: Update VMCS12 fields existence when nVMX MSRs are set Robert Hoo
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 37+ messages in thread
From: Robert Hoo @ 2021-08-17  9:31 UTC (permalink / raw)
  To: seanjc, pbonzini, vkuznets, wanpengli, jmattson, joro
  Cc: kvm, yu.c.zhang, Robert Hoo

Bitmap layout:
The bitmap's bits correspond to the VMCS12 fields. To simplify the
logic, the VMCS header is not skipped. Because the VMCS field widths
differ, the common divisor u16 is used as the unit, which leaves the
bitmap a little sparse.
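
As a worked illustration of the indexing (the byte offsets below are
made up; the real ones come from offsetof(struct vmcs12, field)):

  f_pos(field) = byte_offset / sizeof(u16)

  - a u16 field at byte offset 2 maps to bit 1
  - a u64 field at byte offset 40 maps to bit 20, with bit 22 also set
    for the "_HIGH" access to its upper 32 bits; bits 21 and 23 are
    never used, hence the sparseness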

Life cycle:
The vmcs12_field_existence_bitmap shares the vCPU's life cycle, i.e. it
is allocated and initialized when the vCPU is created, and destroyed
when the vCPU is about to be freed. It cannot be allocated/freed the
way cached_vmcs12 is, because it is needed before the guest executes
VMXON.

Initialize/destroy:
By nature, each field falls into one of two categories: fixed or
dynamic. Fixed fields have no dependency on any VMX feature, while
dynamic ones do. The initialization is therefore split in two:
vmcs12_field_fixed_init() and vmcs12_field_dynamic_init().
vmcs12_field_dynamic_init() is a wrapper around all the
vmcs12_field_update_by_xxx() functions; these update functions are
reused in a later patch when the VMX MSRs are set.

Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
---
 arch/x86/kvm/vmx/nested.c |   2 +
 arch/x86/kvm/vmx/vmcs12.c | 363 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/vmcs12.h |  26 +++
 arch/x86/kvm/vmx/vmx.c    |  12 +-
 arch/x86/kvm/vmx/vmx.h    |   3 +
 5 files changed, 405 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 0d0dd6580cfd..125b94dc3cf1 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -327,6 +327,8 @@ void nested_vmx_free_vcpu(struct kvm_vcpu *vcpu)
 {
 	vcpu_load(vcpu);
 	vmx_leave_nested(vcpu);
+	kfree(to_vmx(vcpu)->nested.vmcs12_field_existence_bitmap);
+	to_vmx(vcpu)->nested.vmcs12_field_existence_bitmap = NULL;
 	vcpu_put(vcpu);
 }
 
diff --git a/arch/x86/kvm/vmx/vmcs12.c b/arch/x86/kvm/vmx/vmcs12.c
index d9f5d7c56ae3..22fd5b21e136 100644
--- a/arch/x86/kvm/vmx/vmcs12.c
+++ b/arch/x86/kvm/vmx/vmcs12.c
@@ -153,3 +153,366 @@ const unsigned short vmcs_field_to_offset_table[] = {
 	FIELD(HOST_RIP, host_rip),
 };
 const unsigned int nr_vmcs12_fields = ARRAY_SIZE(vmcs_field_to_offset_table);
+
+#define FIELD_BIT_SET(name, bitmap) set_bit(f_pos(name), bitmap)
+#define FIELD64_BIT_SET(name, bitmap)	\
+	do {set_bit(f_pos(name), bitmap);	\
+	    set_bit(f_pos(name) + (sizeof(u32) / sizeof(u16)), bitmap);\
+	} while (0)
+
+#define FIELD_BIT_CLEAR(name, bitmap) clear_bit(f_pos(name), bitmap)
+#define FIELD64_BIT_CLEAR(name, bitmap)	\
+	do {clear_bit(f_pos(name), bitmap);	\
+	    clear_bit(f_pos(name) + (sizeof(u32) / sizeof(u16)), bitmap);\
+	} while (0)
+
+#define FIELD_BIT_CHANGE(name, bitmap) change_bit(f_pos(name), bitmap)
+#define FIELD64_BIT_CHANGE(name, bitmap)	\
+	do {change_bit(f_pos(name), bitmap);	\
+	    change_bit(f_pos(name) + (sizeof(u32) / sizeof(u16)), bitmap);\
+	} while (0)
+
+/*
+ * Mark fields with no VMX-feature dependency as existent
+ */
+void vmcs12_field_fixed_init(unsigned long *bitmap)
+{
+	if (unlikely(bitmap == NULL)) {
+		pr_err_once("%s: NULL bitmap", __func__);
+		return;
+	}
+	FIELD_BIT_SET(guest_es_selector, bitmap);
+	FIELD_BIT_SET(guest_cs_selector, bitmap);
+	FIELD_BIT_SET(guest_ss_selector, bitmap);
+	FIELD_BIT_SET(guest_ds_selector, bitmap);
+	FIELD_BIT_SET(guest_fs_selector, bitmap);
+	FIELD_BIT_SET(guest_gs_selector, bitmap);
+	FIELD_BIT_SET(guest_ldtr_selector, bitmap);
+	FIELD_BIT_SET(guest_tr_selector, bitmap);
+	FIELD_BIT_SET(host_es_selector, bitmap);
+	FIELD_BIT_SET(host_cs_selector, bitmap);
+	FIELD_BIT_SET(host_ss_selector, bitmap);
+	FIELD_BIT_SET(host_ds_selector, bitmap);
+	FIELD_BIT_SET(host_fs_selector, bitmap);
+	FIELD_BIT_SET(host_gs_selector, bitmap);
+	FIELD_BIT_SET(host_tr_selector, bitmap);
+	FIELD64_BIT_SET(io_bitmap_a, bitmap);
+	FIELD64_BIT_SET(io_bitmap_b, bitmap);
+	FIELD64_BIT_SET(vm_exit_msr_store_addr, bitmap);
+	FIELD64_BIT_SET(vm_exit_msr_load_addr, bitmap);
+	FIELD64_BIT_SET(vm_entry_msr_load_addr, bitmap);
+	FIELD64_BIT_SET(tsc_offset, bitmap);
+	FIELD64_BIT_SET(vmcs_link_pointer, bitmap);
+	FIELD64_BIT_SET(guest_ia32_debugctl, bitmap);
+	FIELD_BIT_SET(pin_based_vm_exec_control, bitmap);
+	FIELD_BIT_SET(cpu_based_vm_exec_control, bitmap);
+	FIELD_BIT_SET(exception_bitmap, bitmap);
+	FIELD_BIT_SET(page_fault_error_code_mask, bitmap);
+	FIELD_BIT_SET(page_fault_error_code_match, bitmap);
+	FIELD_BIT_SET(cr3_target_count, bitmap);
+	FIELD_BIT_SET(vm_exit_controls, bitmap);
+	FIELD_BIT_SET(vm_exit_msr_store_count, bitmap);
+	FIELD_BIT_SET(vm_exit_msr_load_count, bitmap);
+	FIELD_BIT_SET(vm_entry_controls, bitmap);
+	FIELD_BIT_SET(vm_entry_msr_load_count, bitmap);
+	FIELD_BIT_SET(vm_entry_intr_info_field, bitmap);
+	FIELD_BIT_SET(vm_entry_exception_error_code, bitmap);
+	FIELD_BIT_SET(vm_entry_instruction_len, bitmap);
+	FIELD_BIT_SET(vm_instruction_error, bitmap);
+	FIELD_BIT_SET(vm_exit_reason, bitmap);
+	FIELD_BIT_SET(vm_exit_intr_info, bitmap);
+	FIELD_BIT_SET(vm_exit_intr_error_code, bitmap);
+	FIELD_BIT_SET(idt_vectoring_info_field, bitmap);
+	FIELD_BIT_SET(idt_vectoring_error_code, bitmap);
+	FIELD_BIT_SET(vm_exit_instruction_len, bitmap);
+	FIELD_BIT_SET(vmx_instruction_info, bitmap);
+	FIELD_BIT_SET(guest_es_limit, bitmap);
+	FIELD_BIT_SET(guest_cs_limit, bitmap);
+	FIELD_BIT_SET(guest_ss_limit, bitmap);
+	FIELD_BIT_SET(guest_ds_limit, bitmap);
+	FIELD_BIT_SET(guest_fs_limit, bitmap);
+	FIELD_BIT_SET(guest_gs_limit, bitmap);
+	FIELD_BIT_SET(guest_ldtr_limit, bitmap);
+	FIELD_BIT_SET(guest_tr_limit, bitmap);
+	FIELD_BIT_SET(guest_gdtr_limit, bitmap);
+	FIELD_BIT_SET(guest_idtr_limit, bitmap);
+	FIELD_BIT_SET(guest_es_ar_bytes, bitmap);
+	FIELD_BIT_SET(guest_cs_ar_bytes, bitmap);
+	FIELD_BIT_SET(guest_ss_ar_bytes, bitmap);
+	FIELD_BIT_SET(guest_ds_ar_bytes, bitmap);
+	FIELD_BIT_SET(guest_fs_ar_bytes, bitmap);
+	FIELD_BIT_SET(guest_gs_ar_bytes, bitmap);
+	FIELD_BIT_SET(guest_ldtr_ar_bytes, bitmap);
+	FIELD_BIT_SET(guest_tr_ar_bytes, bitmap);
+	FIELD_BIT_SET(guest_interruptibility_info, bitmap);
+	FIELD_BIT_SET(guest_activity_state, bitmap);
+	FIELD_BIT_SET(guest_sysenter_cs, bitmap);
+	FIELD_BIT_SET(host_ia32_sysenter_cs, bitmap);
+	FIELD_BIT_SET(cr0_guest_host_mask, bitmap);
+	FIELD_BIT_SET(cr4_guest_host_mask, bitmap);
+	FIELD_BIT_SET(cr0_read_shadow, bitmap);
+	FIELD_BIT_SET(cr4_read_shadow, bitmap);
+	FIELD_BIT_SET(exit_qualification, bitmap);
+	FIELD_BIT_SET(guest_linear_address, bitmap);
+	FIELD_BIT_SET(guest_cr0, bitmap);
+	FIELD_BIT_SET(guest_cr3, bitmap);
+	FIELD_BIT_SET(guest_cr4, bitmap);
+	FIELD_BIT_SET(guest_es_base, bitmap);
+	FIELD_BIT_SET(guest_cs_base, bitmap);
+	FIELD_BIT_SET(guest_ss_base, bitmap);
+	FIELD_BIT_SET(guest_ds_base, bitmap);
+	FIELD_BIT_SET(guest_fs_base, bitmap);
+	FIELD_BIT_SET(guest_gs_base, bitmap);
+	FIELD_BIT_SET(guest_ldtr_base, bitmap);
+	FIELD_BIT_SET(guest_tr_base, bitmap);
+	FIELD_BIT_SET(guest_gdtr_base, bitmap);
+	FIELD_BIT_SET(guest_idtr_base, bitmap);
+	FIELD_BIT_SET(guest_dr7, bitmap);
+	FIELD_BIT_SET(guest_rsp, bitmap);
+	FIELD_BIT_SET(guest_rip, bitmap);
+	FIELD_BIT_SET(guest_rflags, bitmap);
+	FIELD_BIT_SET(guest_pending_dbg_exceptions, bitmap);
+	FIELD_BIT_SET(guest_sysenter_esp, bitmap);
+	FIELD_BIT_SET(guest_sysenter_eip, bitmap);
+	FIELD_BIT_SET(host_cr0, bitmap);
+	FIELD_BIT_SET(host_cr3, bitmap);
+	FIELD_BIT_SET(host_cr4, bitmap);
+	FIELD_BIT_SET(host_fs_base, bitmap);
+	FIELD_BIT_SET(host_gs_base, bitmap);
+	FIELD_BIT_SET(host_tr_base, bitmap);
+	FIELD_BIT_SET(host_gdtr_base, bitmap);
+	FIELD_BIT_SET(host_idtr_base, bitmap);
+	FIELD_BIT_SET(host_ia32_sysenter_esp, bitmap);
+	FIELD_BIT_SET(host_ia32_sysenter_eip, bitmap);
+	FIELD_BIT_SET(host_rsp, bitmap);
+	FIELD_BIT_SET(host_rip, bitmap);
+}
+
+void vmcs12_field_dynamic_init(struct nested_vmx_msrs *vmx_msrs,
+			       unsigned long *bitmap)
+{
+	if (unlikely(bitmap == NULL)) {
+		pr_err_once("%s: NULL bitmap", __func__);
+		return;
+	}
+	vmcs12_field_update_by_pinbased_ctrl(0, vmx_msrs->pinbased_ctls_high,
+					     bitmap);
+
+	vmcs12_field_update_by_procbased_ctrl(0, vmx_msrs->procbased_ctls_high,
+					      bitmap);
+
+	vmcs12_field_update_by_procbased_ctrl2(0, vmx_msrs->secondary_ctls_high,
+					       bitmap);
+
+	vmcs12_field_update_by_vmentry_ctrl(vmx_msrs->exit_ctls_high, 0,
+					    vmx_msrs->entry_ctls_high,
+					    bitmap);
+
+	vmcs12_field_update_by_vmexit_ctrl(vmx_msrs->entry_ctls_high, 0,
+					   vmx_msrs->exit_ctls_high,
+					   bitmap);
+
+	vmcs12_field_update_by_vm_func(0, vmx_msrs->vmfunc_controls, bitmap);
+}
+
+void vmcs12_field_update_by_pinbased_ctrl(u32 old_val, u32 new_val,
+					  unsigned long *bitmap)
+{
+	if (unlikely(bitmap == NULL)) {
+		pr_err_once("%s: NULL bitmap", __func__);
+		return;
+	}
+
+	if (!(old_val ^ new_val))
+		return;
+	if ((old_val ^ new_val) & PIN_BASED_POSTED_INTR) {
+		FIELD_BIT_CHANGE(posted_intr_nv, bitmap);
+		FIELD64_BIT_CHANGE(posted_intr_desc_addr, bitmap);
+	}
+
+	if ((old_val ^ new_val) & PIN_BASED_VMX_PREEMPTION_TIMER)
+		FIELD_BIT_CHANGE(vmx_preemption_timer_value, bitmap);
+}
+
+void vmcs12_field_update_by_procbased_ctrl(u32 old_val, u32 new_val,
+					   unsigned long *bitmap)
+{
+	if (unlikely(bitmap == NULL)) {
+		pr_err_once("%s: NULL bitmap", __func__);
+		return;
+	}
+	if (!(old_val ^ new_val))
+		return;
+
+	if ((old_val ^ new_val) & CPU_BASED_USE_MSR_BITMAPS)
+		FIELD64_BIT_CHANGE(msr_bitmap, bitmap);
+
+	if ((old_val ^ new_val) & CPU_BASED_TPR_SHADOW) {
+		FIELD64_BIT_CHANGE(virtual_apic_page_addr, bitmap);
+		FIELD_BIT_CHANGE(tpr_threshold, bitmap);
+	}
+
+	if ((old_val ^ new_val) &
+	    CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) {
+		FIELD_BIT_CHANGE(secondary_vm_exec_control, bitmap);
+	}
+}
+
+void vmcs12_field_update_by_procbased_ctrl2(u32 old_val, u32 new_val,
+					    unsigned long *bitmap)
+{
+	if (unlikely(bitmap == NULL)) {
+		pr_err_once("%s: NULL bitmap", __func__);
+		return;
+	}
+	if (!(old_val ^ new_val))
+		return;
+
+	if ((old_val ^ new_val) & SECONDARY_EXEC_ENABLE_VPID)
+		FIELD_BIT_CHANGE(virtual_processor_id, bitmap);
+
+	if ((old_val ^ new_val) &
+	    SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY) {
+		FIELD_BIT_CHANGE(guest_intr_status, bitmap);
+		FIELD64_BIT_CHANGE(eoi_exit_bitmap0, bitmap);
+		FIELD64_BIT_CHANGE(eoi_exit_bitmap1, bitmap);
+		FIELD64_BIT_CHANGE(eoi_exit_bitmap2, bitmap);
+		FIELD64_BIT_CHANGE(eoi_exit_bitmap3, bitmap);
+	}
+
+	if ((old_val ^ new_val) & SECONDARY_EXEC_ENABLE_PML) {
+		FIELD_BIT_CHANGE(guest_pml_index, bitmap);
+		FIELD64_BIT_CHANGE(pml_address, bitmap);
+	}
+
+	if ((old_val ^ new_val) & SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES)
+		FIELD64_BIT_CHANGE(apic_access_addr, bitmap);
+
+	if ((old_val ^ new_val) & SECONDARY_EXEC_ENABLE_VMFUNC)
+		FIELD64_BIT_CHANGE(vm_function_control, bitmap);
+
+	if ((old_val ^ new_val) & SECONDARY_EXEC_ENABLE_EPT) {
+		FIELD64_BIT_CHANGE(ept_pointer, bitmap);
+		FIELD64_BIT_CHANGE(guest_physical_address, bitmap);
+		FIELD64_BIT_CHANGE(guest_pdptr0, bitmap);
+		FIELD64_BIT_CHANGE(guest_pdptr1, bitmap);
+		FIELD64_BIT_CHANGE(guest_pdptr2, bitmap);
+		FIELD64_BIT_CHANGE(guest_pdptr3, bitmap);
+	}
+
+	if ((old_val ^ new_val) & SECONDARY_EXEC_SHADOW_VMCS) {
+		FIELD64_BIT_CHANGE(vmread_bitmap, bitmap);
+		FIELD64_BIT_CHANGE(vmwrite_bitmap, bitmap);
+	}
+
+	if ((old_val ^ new_val) & SECONDARY_EXEC_XSAVES)
+		FIELD64_BIT_CHANGE(xss_exit_bitmap, bitmap);
+
+	if ((old_val ^ new_val) & SECONDARY_EXEC_ENCLS_EXITING)
+		FIELD64_BIT_CHANGE(encls_exiting_bitmap, bitmap);
+
+	if ((old_val ^ new_val) & SECONDARY_EXEC_TSC_SCALING)
+		FIELD64_BIT_CHANGE(tsc_multiplier, bitmap);
+
+	if ((old_val ^ new_val) & SECONDARY_EXEC_PAUSE_LOOP_EXITING) {
+		FIELD64_BIT_CHANGE(vmread_bitmap, bitmap);
+		FIELD64_BIT_CHANGE(vmwrite_bitmap, bitmap);
+	}
+}
+
+void vmcs12_field_update_by_vmentry_ctrl(u32 vm_exit_ctrl, u32 old_val,
+					 u32 new_val, unsigned long *bitmap)
+{
+	if (unlikely(bitmap == NULL)) {
+		pr_err_once("%s: NULL bitmap", __func__);
+		return;
+	}
+	if (!(old_val ^ new_val))
+		return;
+
+	if ((old_val ^ new_val) & VM_ENTRY_LOAD_IA32_PAT) {
+		if ((new_val & VM_ENTRY_LOAD_IA32_PAT) ||
+		    (vm_exit_ctrl & VM_EXIT_SAVE_IA32_PAT))
+			FIELD64_BIT_SET(guest_ia32_pat, bitmap);
+		else
+			FIELD64_BIT_CLEAR(guest_ia32_pat, bitmap);
+	}
+
+	if ((old_val ^ new_val) & VM_ENTRY_LOAD_IA32_EFER) {
+		if ((new_val & VM_ENTRY_LOAD_IA32_EFER) ||
+		    (vm_exit_ctrl & VM_EXIT_SAVE_IA32_EFER))
+			FIELD64_BIT_SET(guest_ia32_efer, bitmap);
+		else
+			FIELD64_BIT_CLEAR(guest_ia32_efer, bitmap);
+	}
+
+	if ((old_val ^ new_val) & VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL)
+		FIELD64_BIT_CHANGE(guest_ia32_perf_global_ctrl, bitmap);
+
+	if ((old_val ^ new_val) & VM_ENTRY_LOAD_BNDCFGS) {
+		if ((new_val & VM_ENTRY_LOAD_BNDCFGS) ||
+		    (vm_exit_ctrl & VM_EXIT_CLEAR_BNDCFGS))
+			FIELD64_BIT_SET(guest_bndcfgs, bitmap);
+		else
+			FIELD64_BIT_CLEAR(guest_bndcfgs, bitmap);
+	}
+}
+
+void vmcs12_field_update_by_vmexit_ctrl(u32 vm_entry_ctrl, u32 old_val,
+					u32 new_val, unsigned long *bitmap)
+{
+	if (unlikely(bitmap == NULL)) {
+		pr_err_once("%s: NULL bitmap", __func__);
+		return;
+	}
+	if (!(old_val ^ new_val))
+		return;
+
+	if ((old_val ^ new_val) & VM_EXIT_LOAD_IA32_PAT)
+		FIELD64_BIT_CHANGE(host_ia32_pat, bitmap);
+
+	if ((old_val ^ new_val) & VM_EXIT_LOAD_IA32_EFER)
+		FIELD64_BIT_CHANGE(host_ia32_efer, bitmap);
+
+	if ((old_val ^ new_val) & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL)
+		FIELD64_BIT_CHANGE(host_ia32_perf_global_ctrl, bitmap);
+
+	if ((old_val ^ new_val) & VM_EXIT_SAVE_IA32_PAT) {
+		if ((new_val & VM_EXIT_SAVE_IA32_PAT) ||
+		    (vm_entry_ctrl & VM_ENTRY_LOAD_IA32_PAT))
+			FIELD64_BIT_SET(guest_ia32_pat, bitmap);
+		else
+			FIELD64_BIT_CLEAR(guest_ia32_pat, bitmap);
+	}
+
+	if ((old_val ^ new_val) & VM_EXIT_SAVE_IA32_EFER) {
+		if ((new_val & VM_EXIT_SAVE_IA32_EFER) ||
+		    (vm_entry_ctrl & VM_ENTRY_LOAD_IA32_EFER))
+			FIELD64_BIT_SET(guest_ia32_efer, bitmap);
+		else
+			FIELD64_BIT_CLEAR(guest_ia32_efer, bitmap);
+	}
+
+	if ((old_val ^ new_val) & VM_EXIT_CLEAR_BNDCFGS) {
+		if ((new_val & VM_EXIT_CLEAR_BNDCFGS) ||
+		    (vm_entry_ctrl & VM_ENTRY_LOAD_BNDCFGS))
+			FIELD64_BIT_SET(guest_bndcfgs, bitmap);
+		else
+			FIELD64_BIT_CLEAR(guest_bndcfgs, bitmap);
+	}
+}
+
+void vmcs12_field_update_by_vm_func(u64 old_val, u64 new_val,
+				    unsigned long *bitmap)
+{
+	if (unlikely(bitmap == NULL)) {
+		pr_err_once("%s: NULL bitmap", __func__);
+		return;
+	}
+
+	if (!(old_val ^ new_val))
+		return;
+
+	if ((old_val ^ new_val) & VMFUNC_CONTROL_BIT(EPTP_SWITCHING))
+		FIELD64_BIT_CHANGE(eptp_list_address, bitmap);
+}
diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h
index 5e0e1b39f495..5c39370dff3c 100644
--- a/arch/x86/kvm/vmx/vmcs12.h
+++ b/arch/x86/kvm/vmx/vmcs12.h
@@ -187,6 +187,32 @@ struct __packed vmcs12 {
 	u16 guest_pml_index;
 };
 
+/*
+ * Each vmcs12 field's offset, in units of u16.
+ * Used to index the field's position in the bitmap.
+ */
+#define f_pos(x)	(offsetof(struct vmcs12, x) / sizeof(u16))
+#define VMCS12_FIELD_BITMAP_SIZE	\
+	(sizeof(struct vmcs12) / sizeof(u16) / BITS_PER_BYTE)
+void vmcs12_field_fixed_init(unsigned long *bitmap);
+void vmcs12_field_dynamic_init(struct nested_vmx_msrs *vmx_msrs,
+					unsigned long *bitmap);
+void vmcs12_field_update_by_pinbased_ctrl(u32 old_val, u32 new_val,
+							unsigned long *bitmap);
+void vmcs12_field_update_by_procbased_ctrl(u32 old_val, u32 new_val,
+							unsigned long *bitmap);
+void vmcs12_field_update_by_procbased_ctrl2(u32 old_val, u32 new_val,
+							unsigned long *bitmap);
+void vmcs12_field_update_by_vmentry_ctrl(u32 vm_exit_ctrl, u32 old_val,
+						      u32 new_val,
+						      unsigned long *bitmap);
+void vmcs12_field_update_by_vmexit_ctrl(u32 vm_entry_ctrl, u32 old_val,
+						     u32 new_val,
+						     unsigned long *bitmap);
+void vmcs12_field_update_by_vm_func(u64 old_val, u64 new_val,
+						unsigned long *bitmap);
+
+
 /*
  * VMCS12_REVISION is an arbitrary id that should be changed if the content or
  * layout of struct vmcs12 is changed. MSR_IA32_VMX_BASIC returns this id, and
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ae8e62df16dd..6ab37e1d04c9 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6844,8 +6844,17 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 			goto free_vmcs;
 	}
 
-	if (nested)
+	if (nested) {
 		memcpy(&vmx->nested.msrs, &vmcs_config.nested, sizeof(vmx->nested.msrs));
+
+		vmx->nested.vmcs12_field_existence_bitmap = (unsigned long *)
+			kzalloc(VMCS12_FIELD_BITMAP_SIZE, GFP_KERNEL_ACCOUNT);
+		if (!vmx->nested.vmcs12_field_existence_bitmap)
+			goto free_vmcs;
+		vmcs12_field_fixed_init(vmx->nested.vmcs12_field_existence_bitmap);
+		vmcs12_field_dynamic_init(&vmx->nested.msrs,
+					  vmx->nested.vmcs12_field_existence_bitmap);
+	}
 	else
 		memset(&vmx->nested.msrs, 0, sizeof(vmx->nested.msrs));
 
@@ -6867,6 +6876,7 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 
 	return 0;
 
+	kfree(vmx->nested.cached_shadow_vmcs12);
 free_vmcs:
 	free_loaded_vmcs(vmx->loaded_vmcs);
 free_pml:
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 0ecc41189292..c34f1310aa36 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -141,6 +141,9 @@ struct nested_vmx {
 	 */
 	struct vmcs12 *cached_shadow_vmcs12;
 
+	/* VMCS12 field existence bitmap */
+	unsigned long *vmcs12_field_existence_bitmap;
+
 	/*
 	 * Indicates if the shadow vmcs or enlightened vmcs must be updated
 	 * with the data held by struct vmcs12.
-- 
2.27.0



* [PATCH v1 2/5] KVM: x86: nVMX: Update VMCS12 fields existence when nVMX MSRs are set
  2021-08-17  9:31 [PATCH v1 0/5] KVM/x86/nVMX: Add field existence support in VMCS12 Robert Hoo
  2021-08-17  9:31 ` [PATCH v1 1/5] KVM: x86: nVMX: Add vmcs12 field existence bitmap in nested_vmx Robert Hoo
@ 2021-08-17  9:31 ` Robert Hoo
  2021-10-20 15:11   ` Paolo Bonzini
  2021-08-17  9:31 ` [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap Robert Hoo
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 37+ messages in thread
From: Robert Hoo @ 2021-08-17  9:31 UTC (permalink / raw)
  To: seanjc, pbonzini, vkuznets, wanpengli, jmattson, joro
  Cc: kvm, yu.c.zhang, Robert Hoo
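
When user space sets the nested VMX capability MSRs, propagate the
change to the vmcs12 field existence bitmap, using the
vmcs12_field_update_by_xxx() helpers introduced in the previous patch.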

Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
---
 arch/x86/kvm/vmx/nested.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 125b94dc3cf1..b8121f8f6d96 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1262,6 +1262,34 @@ vmx_restore_control_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
 
 	*lowp = data;
 	*highp = data >> 32;
+
+	switch (msr_index) {
+	case MSR_IA32_VMX_TRUE_PINBASED_CTLS:
+		vmcs12_field_update_by_pinbased_ctrl(*highp,
+				data >> 32,
+				vmx->nested.vmcs12_field_existence_bitmap);
+		break;
+	case MSR_IA32_VMX_TRUE_PROCBASED_CTLS:
+		vmcs12_field_update_by_procbased_ctrl(*highp,
+				data >> 32,
+				vmx->nested.vmcs12_field_existence_bitmap);
+		break;
+	case MSR_IA32_VMX_PROCBASED_CTLS2:
+		vmcs12_field_update_by_procbased_ctrl2(*highp,
+				data >> 32,
+				vmx->nested.vmcs12_field_existence_bitmap);
+		break;
+	case MSR_IA32_VMX_TRUE_EXIT_CTLS:
+		vmcs12_field_update_by_vmexit_ctrl(vmx->nested.msrs.entry_ctls_high,
+				*highp, data >> 32,
+				vmx->nested.vmcs12_field_existence_bitmap);
+		break;
+	case MSR_IA32_VMX_TRUE_ENTRY_CTLS:
+		vmcs12_field_update_by_vmentry_ctrl(vmx->nested.msrs.exit_ctls_high,
+				*highp, data >> 32,
+				vmx->nested.vmcs12_field_existence_bitmap);
+		break;
+	}
 	return 0;
 }
 
@@ -1403,6 +1431,8 @@ int vmx_set_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
 	case MSR_IA32_VMX_VMFUNC:
 		if (data & ~vmx->nested.msrs.vmfunc_controls)
 			return -EINVAL;
+		vmcs12_field_update_by_vm_func(vmx->nested.msrs.vmfunc_controls,
+				data, vmx->nested.vmcs12_field_existence_bitmap);
 		vmx->nested.msrs.vmfunc_controls = data;
 		return 0;
 	default:
-- 
2.27.0



* [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-08-17  9:31 [PATCH v1 0/5] KVM/x86/nVMX: Add field existence support in VMCS12 Robert Hoo
  2021-08-17  9:31 ` [PATCH v1 1/5] KVM: x86: nVMX: Add vmcs12 field existence bitmap in nested_vmx Robert Hoo
  2021-08-17  9:31 ` [PATCH v1 2/5] KVM: x86: nVMX: Update VMCS12 fields existence when nVMX MSRs are set Robert Hoo
@ 2021-08-17  9:31 ` Robert Hoo
  2021-08-17 15:54   ` Sean Christopherson
  2021-08-17  9:31 ` [PATCH v1 4/5] KVM: x86: nVMX: Respect vmcs12 field existence when calc vmx_vmcs_enum_msr Robert Hoo
  2021-08-17  9:31 ` [PATCH v1 5/5] KVM: x86: nVMX: Ignore user space set value to MSR_IA32_VMX_VMCS_ENUM Robert Hoo
  4 siblings, 1 reply; 37+ messages in thread
From: Robert Hoo @ 2021-08-17  9:31 UTC (permalink / raw)
  To: seanjc, pbonzini, vkuznets, wanpengli, jmattson, joro
  Cc: kvm, yu.c.zhang, Robert Hoo

In vmcs12_{read,write}_any(), check whether the field exists; if not,
return failure. Their function prototypes change slightly as a result.
In handle_vm{read,write}(), the callers of the above functions, check
the return value and, on failure, emulate a nested VMX failure with the
instruction error VMXERR_UNSUPPORTED_VMCS_COMPONENT.

Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
---
 arch/x86/kvm/vmx/nested.c | 20 ++++++++++++------
 arch/x86/kvm/vmx/vmcs12.h | 43 ++++++++++++++++++++++++++++++---------
 2 files changed, 47 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index b8121f8f6d96..9a35953ede22 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1547,7 +1547,8 @@ static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx)
 	for (i = 0; i < max_shadow_read_write_fields; i++) {
 		field = shadow_read_write_fields[i];
 		val = __vmcs_readl(field.encoding);
-		vmcs12_write_any(vmcs12, field.encoding, field.offset, val);
+		vmcs12_write_any(vmcs12, field.encoding, field.offset, val,
+				 vmx->nested.vmcs12_field_existence_bitmap);
 	}
 
 	vmcs_clear(shadow_vmcs);
@@ -1580,8 +1581,9 @@ static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx)
 	for (q = 0; q < ARRAY_SIZE(fields); q++) {
 		for (i = 0; i < max_fields[q]; i++) {
 			field = fields[q][i];
-			val = vmcs12_read_any(vmcs12, field.encoding,
-					      field.offset);
+			vmcs12_read_any(vmcs12, field.encoding,
+					      field.offset, &val,
+					      vmx->nested.vmcs12_field_existence_bitmap);
 			__vmcs_writel(field.encoding, val);
 		}
 	}
@@ -5070,7 +5072,7 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	struct x86_exception e;
 	unsigned long field;
-	u64 value;
+	unsigned long value;
 	gva_t gva = 0;
 	short offset;
 	int len, r;
@@ -5098,7 +5100,10 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
 		copy_vmcs02_to_vmcs12_rare(vcpu, vmcs12);
 
 	/* Read the field, zero-extended to a u64 value */
-	value = vmcs12_read_any(vmcs12, field, offset);
+	r = vmcs12_read_any(vmcs12, field, offset, &value,
+				vmx->nested.vmcs12_field_existence_bitmap);
+	if (r < 0)
+		return nested_vmx_fail(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT);
 
 	/*
 	 * Now copy part of this value to register or memory, as requested.
@@ -5223,7 +5228,10 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)
 	if (field >= GUEST_ES_AR_BYTES && field <= GUEST_TR_AR_BYTES)
 		value &= 0x1f0ff;
 
-	vmcs12_write_any(vmcs12, field, offset, value);
+	r = vmcs12_write_any(vmcs12, field, offset, value,
+			 vmx->nested.vmcs12_field_existence_bitmap);
+	if (r < 0)
+		return nested_vmx_fail(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT);
 
 	/*
 	 * Do not track vmcs12 dirty-state if in guest-mode as we actually
diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h
index 5c39370dff3c..9ac3d6ac1b6b 100644
--- a/arch/x86/kvm/vmx/vmcs12.h
+++ b/arch/x86/kvm/vmx/vmcs12.h
@@ -413,31 +413,51 @@ static inline short vmcs_field_to_offset(unsigned long field)
 
 #undef ROL16
 
-static inline u64 vmcs12_read_any(struct vmcs12 *vmcs12, unsigned long field,
-				  u16 offset)
+static inline int vmcs12_read_any(struct vmcs12 *vmcs12, unsigned long field,
+				  u16 offset, unsigned long *value, unsigned long *bitmap)
 {
 	char *p = (char *)vmcs12 + offset;
 
+	if (unlikely(bitmap == NULL)) {
+		pr_err_once("vmcs12 read: NULL bitmap");
+		return -EINVAL;
+	}
+	if (!test_bit(offset / sizeof(u16), bitmap))
+		return -ENOENT;
+
 	switch (vmcs_field_width(field)) {
 	case VMCS_FIELD_WIDTH_NATURAL_WIDTH:
-		return *((natural_width *)p);
+		*value = *((natural_width *)p);
+		break;
 	case VMCS_FIELD_WIDTH_U16:
-		return *((u16 *)p);
+		*value = *((u16 *)p);
+		break;
 	case VMCS_FIELD_WIDTH_U32:
-		return *((u32 *)p);
+		*value = *((u32 *)p);
+		break;
 	case VMCS_FIELD_WIDTH_U64:
-		return *((u64 *)p);
+		*value = *((u64 *)p);
+		break;
 	default:
 		WARN_ON_ONCE(1);
-		return -1;
+		return -ENOENT;
 	}
+
+	return 0;
 }
 
-static inline void vmcs12_write_any(struct vmcs12 *vmcs12, unsigned long field,
-				    u16 offset, u64 field_value)
+static inline int vmcs12_write_any(struct vmcs12 *vmcs12, unsigned long field,
+				    u16 offset, u64 field_value, unsigned long *bitmap)
 {
 	char *p = (char *)vmcs12 + offset;
 
+	if (unlikely(bitmap == NULL)) {
+		pr_err_once("%s: NULL bitmap", __func__);
+		return -EINVAL;
+	}
+	if (!test_bit(offset / sizeof(u16), bitmap))
+		return -ENOENT;
+
 	switch (vmcs_field_width(field)) {
 	case VMCS_FIELD_WIDTH_U16:
 		*(u16 *)p = field_value;
@@ -453,8 +473,11 @@ static inline void vmcs12_write_any(struct vmcs12 *vmcs12, unsigned long field,
 		break;
 	default:
 		WARN_ON_ONCE(1);
-		break;
+		return -ENOENT;
 	}
+
+	return 0;
 }
 
+
 #endif /* __KVM_X86_VMX_VMCS12_H */
-- 
2.27.0



* [PATCH v1 4/5] KVM: x86: nVMX: Respect vmcs12 field existence when calc vmx_vmcs_enum_msr
  2021-08-17  9:31 [PATCH v1 0/5] KVM/x86/nVMX: Add field existence support in VMCS12 Robert Hoo
                   ` (2 preceding siblings ...)
  2021-08-17  9:31 ` [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap Robert Hoo
@ 2021-08-17  9:31 ` Robert Hoo
  2021-08-17  9:31 ` [PATCH v1 5/5] KVM: x86: nVMX: Ignore user space set value to MSR_IA32_VMX_VMCS_ENUM Robert Hoo
  4 siblings, 0 replies; 37+ messages in thread
From: Robert Hoo @ 2021-08-17  9:31 UTC (permalink / raw)
  To: seanjc, pbonzini, vkuznets, wanpengli, jmattson, joro
  Cc: kvm, yu.c.zhang, Robert Hoo

Check each field's existence when calculating vmx_vmcs_enum_msr.
Note: during the initial nested VMX control MSR setup, the early stage
before the VM is created, we have no idea which VMX features user space
will set, so use the raw physical MSR value for user space's reference.
After the vCPU's features are settled, update the dynamic fields'
existence.

Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
---
 arch/x86/kvm/vmx/nested.c | 15 ++++++++++++---
 arch/x86/kvm/vmx/nested.h |  1 +
 arch/x86/kvm/vmx/vmx.c    |  5 ++++-
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 9a35953ede22..9a733c703662 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6421,8 +6421,7 @@ void nested_vmx_set_vmcs_shadowing_bitmap(void)
  * that madness to get the encoding for comparison.
  */
 #define VMCS12_IDX_TO_ENC(idx) ((u16)(((u16)(idx) >> 6) | ((u16)(idx) << 10)))
-
-static u64 nested_vmx_calc_vmcs_enum_msr(void)
+u64 nested_vmx_calc_vmcs_enum_msr(struct nested_vmx *nvmx)
 {
 	/*
 	 * Note these are the so called "index" of the VMCS field encoding, not
@@ -6442,6 +6441,15 @@ static u64 nested_vmx_calc_vmcs_enum_msr(void)
 		if (!vmcs_field_to_offset_table[i])
 			continue;
 
+		if (unlikely(!nvmx->vmcs12_field_existence_bitmap)) {
+			WARN_ON(1);
+			break;
+		}
+
+		if (!test_bit(vmcs_field_to_offset_table[i] / sizeof(u16),
+		    nvmx->vmcs12_field_existence_bitmap))
+			continue;
+
 		idx = vmcs_field_index(VMCS12_IDX_TO_ENC(i));
 		if (idx > max_idx)
 			max_idx = idx;
@@ -6695,7 +6703,8 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 	rdmsrl(MSR_IA32_VMX_CR0_FIXED1, msrs->cr0_fixed1);
 	rdmsrl(MSR_IA32_VMX_CR4_FIXED1, msrs->cr4_fixed1);
 
-	msrs->vmcs_enum = nested_vmx_calc_vmcs_enum_msr();
+	/* In initial setup, simply read HW value for reference */
+	rdmsrl(MSR_IA32_VMX_VMCS_ENUM, msrs->vmcs_enum);
 }
 
 void nested_vmx_hardware_unsetup(void)
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index b69a80f43b37..34235d276aad 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -36,6 +36,7 @@ void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu);
 void nested_mark_vmcs12_pages_dirty(struct kvm_vcpu *vcpu);
 bool nested_vmx_check_io_bitmaps(struct kvm_vcpu *vcpu, unsigned int port,
 				 int size);
+u64 nested_vmx_calc_vmcs_enum_msr(struct nested_vmx *nvmx);
 
 static inline struct vmcs12 *get_vmcs12(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6ab37e1d04c9..f44a4971cc8d 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7156,10 +7156,13 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 		vmcs_set_secondary_exec_control(vmx);
 	}
 
-	if (nested_vmx_allowed(vcpu))
+	if (nested_vmx_allowed(vcpu)) {
 		to_vmx(vcpu)->msr_ia32_feature_control_valid_bits |=
 			FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
 			FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
+		to_vmx(vcpu)->nested.msrs.vmcs_enum =
+			nested_vmx_calc_vmcs_enum_msr(&to_vmx(vcpu)->nested);
+	}
 	else
 		to_vmx(vcpu)->msr_ia32_feature_control_valid_bits &=
 			~(FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
-- 
2.27.0



* [PATCH v1 5/5] KVM: x86: nVMX: Ignore user space set value to MSR_IA32_VMX_VMCS_ENUM
  2021-08-17  9:31 [PATCH v1 0/5] KVM/x86/nVMX: Add field existence support in VMCS12 Robert Hoo
                   ` (3 preceding siblings ...)
  2021-08-17  9:31 ` [PATCH v1 4/5] KVM: x86: nVMX: Respect vmcs12 field existence when calc vmx_vmcs_enum_msr Robert Hoo
@ 2021-08-17  9:31 ` Robert Hoo
  4 siblings, 0 replies; 37+ messages in thread
From: Robert Hoo @ 2021-08-17  9:31 UTC (permalink / raw)
  To: seanjc, pbonzini, vkuznets, wanpengli, jmattson, joro
  Cc: kvm, yu.c.zhang, Robert Hoo

This is actually a "read-only" MSR; its value is determined by VMCS12
field existence.

Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
---
 arch/x86/kvm/vmx/nested.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 9a733c703662..4d13e19d7677 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1426,7 +1426,6 @@ int vmx_set_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
 	case MSR_IA32_VMX_EPT_VPID_CAP:
 		return vmx_restore_vmx_ept_vpid_cap(vmx, data);
 	case MSR_IA32_VMX_VMCS_ENUM:
-		vmx->nested.msrs.vmcs_enum = data;
 		return 0;
 	case MSR_IA32_VMX_VMFUNC:
 		if (data & ~vmx->nested.msrs.vmfunc_controls)
-- 
2.27.0



* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-08-17  9:31 ` [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap Robert Hoo
@ 2021-08-17 15:54   ` Sean Christopherson
  2021-08-18  5:50     ` Robert Hoo
  0 siblings, 1 reply; 37+ messages in thread
From: Sean Christopherson @ 2021-08-17 15:54 UTC (permalink / raw)
  To: Robert Hoo; +Cc: pbonzini, vkuznets, wanpengli, jmattson, joro, kvm, yu.c.zhang

On Tue, Aug 17, 2021, Robert Hoo wrote:
> In vmcs12_{read,write}_any(), check whether the field exists; if not,
> return failure. Their function prototypes change slightly as a result.
> In handle_vm{read,write}(), the callers of the above functions, check
> the return value and, on failure, emulate a nested VMX failure with the
> instruction error VMXERR_UNSUPPORTED_VMCS_COMPONENT.
> 
> Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>

Assuming Yu is a co-author, this needs to be:

  Co-developed-by: Yu Zhang <yu.c.zhang@linux.intel.com>
  Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
  Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>

See "When to use Acked-by:, Cc:, and Co-developed-by:" in
Documentation/process/submitting-patches.rst.

> ---
>  arch/x86/kvm/vmx/nested.c | 20 ++++++++++++------
>  arch/x86/kvm/vmx/vmcs12.h | 43 ++++++++++++++++++++++++++++++---------
>  2 files changed, 47 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index b8121f8f6d96..9a35953ede22 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -1547,7 +1547,8 @@ static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx)
>  	for (i = 0; i < max_shadow_read_write_fields; i++) {
>  		field = shadow_read_write_fields[i];
>  		val = __vmcs_readl(field.encoding);
> -		vmcs12_write_any(vmcs12, field.encoding, field.offset, val);
> +		vmcs12_write_any(vmcs12, field.encoding, field.offset, val,
> +				 vmx->nested.vmcs12_field_existence_bitmap);

There is no need to perform existence checks when KVM is copying to/from vmcs12,
the checks are only needed for VMREAD and VMWRITE.  Architecturally, the VMCS is
an opaque blob, software cannot rely on any assumptions about its layout or data,
i.e. KVM is free to read/write whatever it wants.   VMREAD and VMWRITE need to be
enforced because architecturally they are defined to fail if the field does not exist.

Limiting this to VMREAD/VMWRITE means we shouldn't need a bitmap and can use a
more static lookup, e.g. a switch statement.  And an idea to optimize for fields
that unconditionally exist would be to use bit 0 in the field->offset table to
denote conditional fields, e.g. the VMREAD/VMWRITE lookups could be something like:

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index bc6327950657..ef8c48f80d1a 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -5064,7 +5064,7 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
        /* Decode instruction info and find the field to read */
        field = kvm_register_read(vcpu, (((instr_info) >> 28) & 0xf));

-       offset = vmcs_field_to_offset(field);
+       offset = vmcs_field_to_offset(vmx, field);
        if (offset < 0)
                return nested_vmx_fail(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT);

@@ -5167,7 +5167,7 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)

        field = kvm_register_read(vcpu, (((instr_info) >> 28) & 0xf));

-       offset = vmcs_field_to_offset(field);
+       offset = vmcs_field_to_offset(vmx, field);
        if (offset < 0)
                return nested_vmx_fail(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT);

diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h
index 2a45f026ee11..3c27631e0119 100644
--- a/arch/x86/kvm/vmx/vmcs12.h
+++ b/arch/x86/kvm/vmx/vmcs12.h
@@ -364,7 +364,8 @@ static inline void vmx_check_vmcs12_offsets(void)
 extern const unsigned short vmcs_field_to_offset_table[];
 extern const unsigned int nr_vmcs12_fields;

-static inline short vmcs_field_to_offset(unsigned long field)
+static inline short vmcs_field_to_offset(struct vcpu_vmx *vmx,
+                                        unsigned long field)
 {
        unsigned short offset;
        unsigned int index;
@@ -378,9 +379,10 @@ static inline short vmcs_field_to_offset(unsigned long field)

        index = array_index_nospec(index, nr_vmcs12_fields);
        offset = vmcs_field_to_offset_table[index];
-       if (offset == 0)
+       if (offset == 0 ||
+           ((offset & 1) && !vmcs12_field_exists(vmx, field)))
                return -ENOENT;
-       return offset;
+       return offset & ~1;
 }

 static inline u64 vmcs12_read_any(struct vmcs12 *vmcs12, unsigned long field,


* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-08-17 15:54   ` Sean Christopherson
@ 2021-08-18  5:50     ` Robert Hoo
  2021-08-18 23:10       ` Sean Christopherson
  0 siblings, 1 reply; 37+ messages in thread
From: Robert Hoo @ 2021-08-18  5:50 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: pbonzini, vkuznets, wanpengli, jmattson, joro, kvm, yu.c.zhang

On Tue, 2021-08-17 at 15:54 +0000, Sean Christopherson wrote:
> On Tue, Aug 17, 2021, Robert Hoo wrote:
> > In vmcs12_{read,write}_any(), check whether the field exists; if not,
> > return failure. Their function prototypes change slightly as a result.
> > In handle_vm{read,write}(), the callers of the above functions, check
> > the return value and, on failure, emulate a nested VMX failure with
> > the instruction error VMXERR_UNSUPPORTED_VMCS_COMPONENT.
> > 
> > Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
> > Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
> 
> Assuming Yu is a co-author, this needs to be:
> 
>   Co-developed-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>   Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>   Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
> 
> See "When to use Acked-by:, Cc:, and Co-developed-by:" in
> Documentation/process/submitting-patches.rst.
OK, thanks.

> 
> > ---
> >  arch/x86/kvm/vmx/nested.c | 20 ++++++++++++------
> >  arch/x86/kvm/vmx/vmcs12.h | 43 ++++++++++++++++++++++++++++++---------
> >  2 files changed, 47 insertions(+), 16 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index b8121f8f6d96..9a35953ede22 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -1547,7 +1547,8 @@ static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx)
> >  	for (i = 0; i < max_shadow_read_write_fields; i++) {
> >  		field = shadow_read_write_fields[i];
> >  		val = __vmcs_readl(field.encoding);
> > -		vmcs12_write_any(vmcs12, field.encoding, field.offset, val);
> > +		vmcs12_write_any(vmcs12, field.encoding, field.offset, val,
> > +				 vmx->nested.vmcs12_field_existence_bitmap);
> 
> There is no need to perform existence checks when KVM is copying to/from
> vmcs12, the checks are only needed for VMREAD and VMWRITE.  Architecturally,
> the VMCS is an opaque blob, software cannot rely on any assumptions about
> its layout or data, i.e. KVM is free to read/write whatever it wants.
> VMREAD and VMWRITE need to be enforced because architecturally they are
> defined to fail if the field does not exist.
OK, agree.

> 
> Limiting this to VMREAD/VMWRITE means we shouldn't need a bitmap and can
> use a more static lookup, e.g. a switch statement.
Emm, hard for me to choose:

Your approach sounds more CPU-efficient: once a VMX MSR is updated,
there is no need to update the bitmap, and each field existence check
directly consults the related VMX MSR. But the switch statement will be
long...

With my implementation, once a VMX MSR is updated, the update has to be
propagated to the bitmap, which is one extra step compared with the
above. But later, when querying field existence, especially for the
fields that consult vm{entry,exit}_ctrl, we would usually have to read
both MSRs if there were no bitmap, and we cannot guarantee that more
complicated dependencies won't appear in the future. With the bitmap,
the query is a single bit read; without it, several MSR reads and
compares happen.
Also, the VMX MSR --> bitmap update usually happens only once, when the
vCPU model is settled, whereas VMREAD/VMWRITE may happen frequently,
depending on the guest itself. I'd rather keep the complicated
comparison in the former than in the latter.
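
If I understand the suggestion correctly, the existence check would look
roughly like the sketch below (heavily abbreviated and only a sketch;
vmcs12_field_exists() is the helper your diff further down refers to):

static bool vmcs12_field_exists(struct vcpu_vmx *vmx, unsigned long field)
{
	struct nested_vmx_msrs *msrs = &vmx->nested.msrs;

	switch (field) {
	case POSTED_INTR_NV:
	case POSTED_INTR_DESC_ADDR:
	case POSTED_INTR_DESC_ADDR_HIGH:
		return msrs->pinbased_ctls_high & PIN_BASED_POSTED_INTR;
	case VMX_PREEMPTION_TIMER_VALUE:
		return msrs->pinbased_ctls_high & PIN_BASED_VMX_PREEMPTION_TIMER;
	case GUEST_IA32_PAT:
	case GUEST_IA32_PAT_HIGH:
		/* Exists if either VM-entry loads or VM-exit saves PAT */
		return (msrs->entry_ctls_high & VM_ENTRY_LOAD_IA32_PAT) ||
		       (msrs->exit_ctls_high & VM_EXIT_SAVE_IA32_PAT);
	/* ... roughly 50 more conditional fields ... */
	default:
		return true;
	}
}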


> And an idea to optimize for fields that unconditionally exist would be
> to use bit 0 in the field->offset table to denote conditional fields,
> e.g. the VMREAD/VMWRITE lookups could be something like:
Though all field offsets are even today, can we be sure that no
odd-offset field will ever be added? And what if, some day, a field's
conditional/unconditional existence depends on the CPU model?

> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index bc6327950657..ef8c48f80d1a 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -5064,7 +5064,7 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
>         /* Decode instruction info and find the field to read */
>         field = kvm_register_read(vcpu, (((instr_info) >> 28) & 0xf));
> 
> -       offset = vmcs_field_to_offset(field);
> +       offset = vmcs_field_to_offset(vmx, field);
>         if (offset < 0)
>                 return nested_vmx_fail(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT);
> 
> @@ -5167,7 +5167,7 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)
> 
>         field = kvm_register_read(vcpu, (((instr_info) >> 28) & 0xf));
> 
> -       offset = vmcs_field_to_offset(field);
> +       offset = vmcs_field_to_offset(vmx, field);
>         if (offset < 0)
>                 return nested_vmx_fail(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT);
> 
> diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h
> index 2a45f026ee11..3c27631e0119 100644
> --- a/arch/x86/kvm/vmx/vmcs12.h
> +++ b/arch/x86/kvm/vmx/vmcs12.h
> @@ -364,7 +364,8 @@ static inline void vmx_check_vmcs12_offsets(void)
>  extern const unsigned short vmcs_field_to_offset_table[];
>  extern const unsigned int nr_vmcs12_fields;
> 
> -static inline short vmcs_field_to_offset(unsigned long field)
> +static inline short vmcs_field_to_offset(struct vcpu_vmx *vmx,
> +                                        unsigned long field)
>  {
>         unsigned short offset;
>         unsigned int index;
> @@ -378,9 +379,10 @@ static inline short vmcs_field_to_offset(unsigned long field)
> 
>         index = array_index_nospec(index, nr_vmcs12_fields);
>         offset = vmcs_field_to_offset_table[index];
> -       if (offset == 0)
> +       if (offset == 0 ||
> +           ((offset & 1) && !vmcs12_field_exists(vmx, field)))
>                 return -ENOENT;
> -       return offset;
> +       return offset & ~1;
>  }
> 
>  static inline u64 vmcs12_read_any(struct vmcs12 *vmcs12, unsigned long field,



* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-08-18  5:50     ` Robert Hoo
@ 2021-08-18 23:10       ` Sean Christopherson
  2021-08-18 23:45         ` Jim Mattson
  2021-08-19  9:58         ` Robert Hoo
  0 siblings, 2 replies; 37+ messages in thread
From: Sean Christopherson @ 2021-08-18 23:10 UTC (permalink / raw)
  To: Robert Hoo; +Cc: pbonzini, vkuznets, wanpengli, jmattson, joro, kvm, yu.c.zhang

On Wed, Aug 18, 2021, Robert Hoo wrote:
> > Limiting this to VMREAD/VMWRITE means we shouldn't need a bitmap and
> > can use a more static lookup, e.g. a switch statement.  
> Emm, hard for me to choose:
> 
> Your approach sounds more efficient for CPU: Once VMX MSR's updated, no
> bother to update the bitmap. Each field's existence check will directly
> consult related VMX MSR. Well, the switch statement will be long...

How long?  Honest question, off the top of my head I don't have a feel for how
many fields conditionally exist.

> My this implementation: once VMX MSR's updated, the update needs to be
> passed to bitmap, this is 1 extra step comparing to aforementioned
> above. But, later, when query field existence, especially the those
> consulting vm{entry,exit}_ctrl, they usually would have to consult both
> MSRs if otherwise no bitmap, and we cannot guarantee if in the future
> there's no more complicated dependencies. If using bitmap, this consult
> is just 1-bit reading. If no bitmap, several MSR's read and compare
> happen.

Yes, but the bitmap is per-VM and likely may or may not be cache-hot for back-to-back
VMREAD/VMWRITE to different fields, whereas the shadow controls are much more likely
to reside somewhere in the caches.

> And, VMX MSR --> bitmap, usually happens only once when vCPU model is
> settled. But VMRead/VMWrite might happen frequently, depends on guest
> itself. I'd rather leave complicated comparison in former than in
> later.

I'm not terribly concerned about the runtime performance, it's the extra per-VM
allocation for something that's not thaaat interesting that I don't like.

And for performance, most of the frequently accessed VMCS fields will be shadowed
anyways, i.e. won't VM-Exit in the first place.

And that brings up another wrinkle.  The shadow VMCS bitmaps are global across
all VMs, e.g. if the preemption timer is supported in hardware but hidden from
L1, then a misbehaving L1 can VMREAD/VMWRITE the field even with this patch.
If it was just the preemption timer we could consider disabling shadow VMCS for
the VM if the timer exists but is hidden from L1, but GUEST_PML_INDEX and
GUEST_INTR_STATUS are also conditional :-(

Maybe there's a middle ground, e.g. let userspace tell KVM which fields it plans
on exposing to L1, use that to build the bitmaps, and disable shadow VMCS if
userspace creates VMs that don't match the specified configuration.  Burning
three more pages per VM isn't very enticing...

This is quite the complicated mess for something I'm guessing no one actually
cares about.  At what point do we chalk this up as a virtualization hole and
sweep it under the rug?


* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-08-18 23:10       ` Sean Christopherson
@ 2021-08-18 23:45         ` Jim Mattson
  2021-08-18 23:49           ` Sean Christopherson
  2021-08-19  9:58         ` Robert Hoo
  1 sibling, 1 reply; 37+ messages in thread
From: Jim Mattson @ 2021-08-18 23:45 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Robert Hoo, pbonzini, vkuznets, wanpengli, joro, kvm, yu.c.zhang

On Wed, Aug 18, 2021 at 4:11 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Aug 18, 2021, Robert Hoo wrote:
> > > Limiting this to VMREAD/VMWRITE means we shouldn't need a bitmap and
> > > can use a more static lookup, e.g. a switch statement.
> > Emm, hard for me to choose:
> >
> > Your approach sounds more efficient for CPU: Once VMX MSR's updated, no
> > bother to update the bitmap. Each field's existence check will directly
> > consult related VMX MSR. Well, the switch statement will be long...
>
> How long?  Honest question, off the top of my head I don't have a feel for how
> many fields conditionally exist.
>
> > My this implementation: once VMX MSR's updated, the update needs to be
> > passed to bitmap, this is 1 extra step comparing to aforementioned
> > above. But, later, when query field existence, especially the those
> > consulting vm{entry,exit}_ctrl, they usually would have to consult both
> > MSRs if otherwise no bitmap, and we cannot guarantee if in the future
> > there's no more complicated dependencies. If using bitmap, this consult
> > is just 1-bit reading. If no bitmap, several MSR's read and compare
> > happen.
>
> Yes, but the bitmap is per-VM and likely may or may not be cache-hot for back-to-back
> VMREAD/VMWRITE to different fields, whereas the shadow controls are much more likely
> to reside somewhere in the caches.
>
> > And, VMX MSR --> bitmap, usually happens only once when vCPU model is
> > settled. But VMRead/VMWrite might happen frequently, depends on guest
> > itself. I'd rather leave complicated comparison in former than in
> > later.
>
> I'm not terribly concerned about the runtime performance, it's the extra per-VM
> allocation for something that's not thaaat interesting that I don't like.
>
> And for performance, most of the frequently accessed VMCS fields will be shadowed
> anyways, i.e. won't VM-Exit in the first place.
>
> And that brings up another wrinkle.  The shadow VMCS bitmaps are global across
> all VMs, e.g. if the preemption timer is supported in hardware but hidden from
> L1, then a misbehaving L1 can VMREAD/VMWRITE the field even with this patch.
> If it was just the preemption timer we could consider disabling shadow VMCS for
> the VM ifthe timer exists but is hidden from L1, but GUEST_PML_INDEX and
> GUEST_INTR_STATUS are also conditional :-(
>
> Maybe there's a middle ground, e.g. let userspace tell KVM which fields it plans
> on exposing to L1, use that to build the bitmaps, and disable shadow VMCS if
> userspace creates VMs that don't match the specified configuration.  Burning
> three more pages per VM isn't very enticing...
>
> This is quite the complicated mess for something I'm guessing no one actually
> cares about.  At what point do we chalk this up as a virtualization hole and
> sweep it under the rug?

Good point! Note that hardware doesn't even get this right. See
erratum CF77 in
https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-e7-v2-spec-update.pdf.
I'd cut and paste the text here, but Intel won't allow that.


* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-08-18 23:45         ` Jim Mattson
@ 2021-08-18 23:49           ` Sean Christopherson
  0 siblings, 0 replies; 37+ messages in thread
From: Sean Christopherson @ 2021-08-18 23:49 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Robert Hoo, pbonzini, vkuznets, wanpengli, joro, kvm, yu.c.zhang

On Wed, Aug 18, 2021, Jim Mattson wrote:
> On Wed, Aug 18, 2021 at 4:11 PM Sean Christopherson <seanjc@google.com> wrote:
> > This is quite the complicated mess for something I'm guessing no one actually
> > cares about.  At what point do we chalk this up as a virtualization hole and
> > sweep it under the rug?
> 
> Good point! Note that hardware doesn't even get this right. See
> erratum CF77 in
> https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-e7-v2-spec-update.pdf.
> I'd cut and paste the text here, but Intel won't allow that.

Ha!  KVM's behavior is a feature, not a bug, we're just matching hardware! ;-)


* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-08-18 23:10       ` Sean Christopherson
  2021-08-18 23:45         ` Jim Mattson
@ 2021-08-19  9:58         ` Robert Hoo
  2021-09-01 20:42           ` Sean Christopherson
  1 sibling, 1 reply; 37+ messages in thread
From: Robert Hoo @ 2021-08-19  9:58 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: pbonzini, vkuznets, wanpengli, jmattson, joro, kvm, yu.c.zhang

On Wed, 2021-08-18 at 23:10 +0000, Sean Christopherson wrote:
> On Wed, Aug 18, 2021, Robert Hoo wrote:
> > > Limiting this to VMREAD/VMWRITE means we shouldn't need a bitmap and
> > > can use a more static lookup, e.g. a switch statement.
> > 
> > Emm, hard for me to choose:
> > 
> > Your approach sounds more efficient for CPU: Once VMX MSR's updated, no
> > bother to update the bitmap. Each field's existence check will directly
> > consult related VMX MSR. Well, the switch statement will be long...
> 
> How long?  Honest question, off the top of my head I don't have a feel
> for how many fields conditionally exist.

By my quick manual count, ~51 fields as of today.
> 
> > My this implementation: once VMX MSR's updated, the update needs to be
> > passed to bitmap, this is 1 extra step comparing to aforementioned
> > above. But, later, when query field existence, especially the those
> > consulting vm{entry,exit}_ctrl, they usually would have to consult both
> > MSRs if otherwise no bitmap, and we cannot guarantee if in the future
> > there's no more complicated dependencies. If using bitmap, this consult
> > is just 1-bit reading. If no bitmap, several MSR's read and compare
> > happen.
> 
> Yes, but the bitmap is per-VM and likely may or may not be cache-hot for
> back-to-back VMREAD/VMWRITE to different fields, whereas the shadow
> controls are much more likely to reside somewhere in the caches.

Sorry, I don't quite understand the "shadow controls" here. Do you mean
the shadow VMCS? What does field existence have to do with the shadow
VMCS? Hmm, you do remind me of a question: what if L1 VMREADs/VMWRITEs
a shadowed field that doesn't exist?

If the "shadow controls" here means nested_vmx.nested_vmx_msrs, they
are, like the bitmap, per-vCPU; I think there is no essential difference
in their cache-hit likelihood. BTW, at the current VMCS12 size, the
bitmap can be contained in a cache line.
> 
> > And, VMX MSR --> bitmap, usually happens only once when vCPU model is
> > settled. But VMRead/VMWrite might happen frequently, depends on guest
> > itself. I'd rather leave complicated comparison in former than in
> > later.
> 
> I'm not terribly concerned about the runtime performance, it's the extra
> per-VM allocation for something that's not thaaat interesting that I
> don't like.

OK, it's even finer-grained than that: per-vCPU/vmx ;)
> 
> And for performance, most of the frequently accessed VMCS fields will be
> shadowed anyways, i.e. won't VM-Exit in the first place.
> 
> And that brings up another wrinkle.  The shadow VMCS bitmaps are global
> across all VMs,
OK, that's the problem. Ideally, it should be per-VM or per-vCPU, but
that means each VM/vCPU would consume 2 more pages for the
vm{read,write} bitmaps.

> e.g. if the preemption timer is supported in hardware but hidden from
> L1, then a misbehaving L1 can VMREAD/VMWRITE the field even with this
> patch.  If it was just the preemption timer we could consider disabling
> shadow VMCS for the VM if the timer exists but is hidden from L1, but
> GUEST_PML_INDEX and GUEST_INTR_STATUS are also conditional :-(

Yes, if the vm{read,write} bitmaps are KVM-global, we cannot implement
field existence checks while shadow VMCS is functioning. I don't think
that's right; it just didn't cause any trouble until we considered
today's field existence implementation.

If we stringently implement this per the spec, i.e. each VMCS has its
own vm{read,write} bitmaps, or at least each VM has its own, then it's
doable.
> 
> Maybe there's a middle ground, e.g. let userspace tell KVM which
> fields it plans
> on exposing to L1, use that to build the bitmaps, and disable shadow
> VMCS if
> userspace creates VMs that don't match the specified configuration.  

Here "specific configuration" means: if KVM vm{write,read}-bitmap
enables some L1 non-exist field shadow read/write, we turn of shadow
VMCS for that VM, right? I guess user would rather abandon this field
existence check for VMCS shadowing.


> Burning
> three more pages per VM isn't very enticing...

Why 3 more? I count 2 more pages, i.e. the vm{read,write} bitmaps.
And just 2 pages (8KB) per VM isn't huge consumption, is it? ;)

> 
> This is quite the complicated mess for something I'm guessing no one
> actually
> cares about.  At what point do we chalk this up as a virtualization
> hole and
> sweep it under the rug?
Yes, too complicated, far beyond what I first imagined a vmcs12 field
existence implementation would be. I guess the person who originally
hard-coded nested_msr.vmcs_enum had tried this before ;)



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-08-19  9:58         ` Robert Hoo
@ 2021-09-01 20:42           ` Sean Christopherson
  2021-09-03  8:51             ` Robert Hoo
  0 siblings, 1 reply; 37+ messages in thread
From: Sean Christopherson @ 2021-09-01 20:42 UTC (permalink / raw)
  To: Robert Hoo; +Cc: pbonzini, vkuznets, wanpengli, jmattson, joro, kvm, yu.c.zhang

On Thu, Aug 19, 2021, Robert Hoo wrote:
> On Wed, 2021-08-18 at 23:10 +0000, Sean Christopherson wrote:
> > > My this implementation: once VMX MSR's updated, the update needs to be
> > > passed to bitmap, this is 1 extra step comparing to aforementioned above.
> > > But, later, when query field existence, especially the those consulting
> > > vm{entry,exit}_ctrl, they usually would have to consult both MSRs if
> > > otherwise no bitmap, and we cannot guarantee if in the future there's no
> > > more complicated dependencies. If using bitmap, this consult is just
> > > 1-bit reading. If no bitmap, several MSR's read and compare happen.
> > 
> > Yes, but the bitmap is per-VM and likely may or may not be cache-hot for
> > back-to-back VMREAD/VMWRITE to different fields, whereas the shadow
> > controls are much more likely to reside somewhere in the caches.
> 
> Sorry I don't quite understand the "shadow controls" here. Do you mean
> shadow VMCS? what does field existence to do with shadow VMCS?

vmcs->controls_shadow.*

> emm, here you indeed remind me a questions: what if L1 VMREAD/VMWRITE a
> shadow field that doesn't exist?

Doesn't exist in hardware?  KVM will intercept the access by leaving the
corresponding bit set in the VMREAD/VMWRITE bitmaps.  This is handled by
init_vmcs_shadow_fields().  Note, KVM will still incorrectly emulate the access,
but on the plus side that means L2 will see consistent behavior regardless of
underlying hardware.
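
Roughly, the idea is (simplified sketch of init_vmcs_shadow_fields(),
not the verbatim code; field_supported_in_hardware() stands in for the
real per-field checks):

	int i;

	/* All-1s bitmaps intercept every VMREAD/VMWRITE by default. */
	memset(vmx_vmread_bitmap, 0xff, PAGE_SIZE);
	memset(vmx_vmwrite_bitmap, 0xff, PAGE_SIZE);

	for (i = 0; i < max_shadow_read_write_fields; i++) {
		u16 field = shadow_read_write_fields[i].encoding;

		/* Unsupported fields keep their bit set => exit + emulate. */
		if (!field_supported_in_hardware(field))
			continue;

		clear_bit(field, vmx_vmread_bitmap);
		clear_bit(field, vmx_vmwrite_bitmap);
	}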

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-09-01 20:42           ` Sean Christopherson
@ 2021-09-03  8:51             ` Robert Hoo
  2021-09-03 15:11               ` Sean Christopherson
  0 siblings, 1 reply; 37+ messages in thread
From: Robert Hoo @ 2021-09-03  8:51 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: pbonzini, vkuznets, wanpengli, jmattson, joro, kvm, yu.c.zhang

On Wed, 2021-09-01 at 20:42 +0000, Sean Christopherson wrote:
> On Thu, Aug 19, 2021, Robert Hoo wrote:
> > On Wed, 2021-08-18 at 23:10 +0000, Sean Christopherson wrote:
> > > > My this implementation: once VMX MSR's updated, the update
> > > > needs to be
> > > > passed to bitmap, this is 1 extra step comparing to
> > > > aforementioned above.
> > > > But, later, when query field existence, especially the those
> > > > consulting
> > > > vm{entry,exit}_ctrl, they usually would have to consult both
> > > > MSRs if
> > > > otherwise no bitmap, and we cannot guarantee if in the future
> > > > there's no
> > > > more complicated dependencies. If using bitmap, this consult is
> > > > just
> > > > 1-bit reading. If no bitmap, several MSR's read and compare
> > > > happen.
> > > 
> > > Yes, but the bitmap is per-VM and likely may or may not be cache-
> > > hot for
> > > back-to-back VMREAD/VMWRITE to different fields, whereas the
> > > shadow
> > > controls are much more likely to reside somewhere in the caches.
> > 
> > Sorry I don't quite understand the "shadow controls" here. Do you
> > mean
> > shadow VMCS? what does field existence to do with shadow VMCS?
> 
> vmcs->controls_shadow.*

OK, I see now. But I still don't understand why these shadow controls
are related to field existence. They're not in the
handle_vm{read,write}() path. Would you shed more light? Thanks.
> 
> > emm, here you indeed remind me a questions: what if L1
> > VMREAD/VMWRITE a
> > shadow field that doesn't exist?
> 
> Doesn't exist in hardware?  
I mean it doesn't exist in VMCS12, per the bitmap. In this case, when L1
reads/writes the field, the access is shadowed and won't be trapped by
VMX.

> KVM will intercept the access by leaving the
> corresponding bit set in the VMREAD/VMWRITE bitmaps.  This is handled
> by
> init_vmcs_shadow_fields().  Note, KVM will still incorrectly emulate
> the access,
> but on the plus side that means L2 will see consistent behavior
> regardless of
> underlying hardware.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-09-03  8:51             ` Robert Hoo
@ 2021-09-03 15:11               ` Sean Christopherson
  2021-09-28 10:05                 ` Robert Hoo
  0 siblings, 1 reply; 37+ messages in thread
From: Sean Christopherson @ 2021-09-03 15:11 UTC (permalink / raw)
  To: Robert Hoo; +Cc: pbonzini, vkuznets, wanpengli, jmattson, joro, kvm, yu.c.zhang

On Fri, Sep 03, 2021, Robert Hoo wrote:
> On Wed, 2021-09-01 at 20:42 +0000, Sean Christopherson wrote:
> > On Thu, Aug 19, 2021, Robert Hoo wrote:
> > > On Wed, 2021-08-18 at 23:10 +0000, Sean Christopherson wrote:
> > > > > My this implementation: once VMX MSR's updated, the update needs to
> > > > > be passed to bitmap, this is 1 extra step comparing to aforementioned
> > > > > above.  But, later, when query field existence, especially the those
> > > > > consulting vm{entry,exit}_ctrl, they usually would have to consult
> > > > > both MSRs if otherwise no bitmap, and we cannot guarantee if in the
> > > > > future there's no more complicated dependencies. If using bitmap,
> > > > > this consult is just 1-bit reading. If no bitmap, several MSR's read
> > > > > and compare happen.
> > > > 
> > > > Yes, but the bitmap is per-VM and likely may or may not be cache- hot
> > > > for back-to-back VMREAD/VMWRITE to different fields, whereas the shadow
> > > > controls are much more likely to reside somewhere in the caches.
> > > 
> > > Sorry I don't quite understand the "shadow controls" here. Do you mean
> > > shadow VMCS? what does field existence to do with shadow VMCS?
> > 
> > vmcs->controls_shadow.*
> 
> OK, I see now. But I still don't understand why is these shadow
> controls related to field existence. They not in
> handle_vm{read,write}() path. Would you shed more light? Thanks.

Hmm, you're confused because my comment about the controls shadows is nonsensical.
I conflated the vmcs->controls_shadow.* with vmx->nested.msrs.*.  Sorry for the
confusion :-/

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-09-03 15:11               ` Sean Christopherson
@ 2021-09-28 10:05                 ` Robert Hoo
  2021-10-05 16:15                   ` Sean Christopherson
  0 siblings, 1 reply; 37+ messages in thread
From: Robert Hoo @ 2021-09-28 10:05 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: pbonzini, vkuznets, wanpengli, jmattson, joro, kvm, yu.c.zhang

On Fri, 2021-09-03 at 15:11 +0000, Sean Christopherson wrote:
> ...

Hi Sean,

Sorry for the late reply. Multi-tasking, you know ;-)

The discussion about this patch has run for quite a while and has
diverged, actually. Let me summarize our previous discussions, so we can
converge and settle on a direction.


* Copying to/from the shadow vmcs needs no field existence validation.
-- I agree.

* Now that only VMREAD/VMWRITE need to validate field existence, a
static check can be used instead of the bitmap.
* And bit 0 of the field->offset table entries can be borrowed to
denote conditional fields.

Because:
	The shadow controls have a better chance of being cache-hot
than the bitmap.
	The bitmap is per-VMX; the additional memory allocation is not
interesting.

Robert argued:
	I still prefer the bitmap to denote conditional fields.
	With a static switch...case check instead of the bitmap, the
switch...case would be very long: ~51 conditional fields as of today.
	Though very unlikely, we cannot guarantee there is no future
use of bit 0 of the field->offset table entries.
	From the perspective of runtime efficiency, reading the bitmap
beats redoing the static checks on every access.
	From the perspective of cache-hit chances, the shadow controls
(or nested_vmx_msrs) and the bitmap both live in the nested structure;
I don't think there's an essential difference.
	The bitmap is just 62 bytes long now; I think that's
tolerable. :)


* Interaction with shadow VMCS -- for shadowed fields, we cannot trap
their reads/writes, and therefore cannot check their existence against
the VMX configuration L0 set for L1.

	This last point is the messiest one.

	If we'd like to solve this, you proposed as a middle ground to
disable shadow VMCS entirely when the user space setting conflicts with
what KVM figured out.

	You also said, "This is quite the complicated mess for
something I'm guessing no one actually cares about.  At what point do
we chalk this up as a virtualization hole and sweep it under the rug?"
-- I couldn't agree more.

	We think disabling shadow VMCS entirely is not good under any
circumstances, for the sake of nested performance, etc.
	We think there are 2 ways ahead:
	1) Leave it as it is nowadays, i.e. discard this patch set.
Perhaps we can add some build check to force an update of that
hard-coded vmcs-enum-max assignment when necessary.

	2) Make the shadow vmcs bitmaps per-VM. This will have to
allocate 2 (or 3?) more pages (the shadow read/write bitmaps) per VM.
Then we can configure the shadow vmcs bitmaps per the user space
configuration, i.e. don't shadow the conditional VMCS fields, forcing
their reads/writes through the handle_vm{read,write}() gate, roughly as
sketched below.
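
	A rough sketch of option 2, assuming per-VM bitmap pages (the
kvm_vmx->vm{read,write}_bitmap fields and per_vm_field_exists() here
are hypothetical):

static void init_per_vm_shadow_bitmaps(struct kvm_vmx *kvm_vmx)
{
	int i;

	/* Default: intercept everything. */
	memset(kvm_vmx->vmread_bitmap, 0xff, PAGE_SIZE);
	memset(kvm_vmx->vmwrite_bitmap, 0xff, PAGE_SIZE);

	for (i = 0; i < max_shadow_read_write_fields; i++) {
		u16 field = shadow_read_write_fields[i].encoding;

		/*
		 * A conditional field hidden from this L1 keeps its bit
		 * set, so the access exits and goes through
		 * handle_vm{read,write}() and the existence check.
		 */
		if (!per_vm_field_exists(kvm_vmx, field))
			continue;

		clear_bit(field, kvm_vmx->vmread_bitmap);
		clear_bit(field, kvm_vmx->vmwrite_bitmap);
	}
}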


So, Sean, can you help converge our discussion and settle on the next
step? Thanks. :-)



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-09-28 10:05                 ` Robert Hoo
@ 2021-10-05 16:15                   ` Sean Christopherson
  2021-10-05 17:32                     ` Jim Mattson
  0 siblings, 1 reply; 37+ messages in thread
From: Sean Christopherson @ 2021-10-05 16:15 UTC (permalink / raw)
  To: Robert Hoo; +Cc: pbonzini, vkuznets, wanpengli, jmattson, joro, kvm, yu.c.zhang

On Tue, Sep 28, 2021, Robert Hoo wrote:
> On Fri, 2021-09-03 at 15:11 +0000, Sean Christopherson wrote:
> 	You also said, "This is quite the complicated mess for
> something I'm guessing no one actually cares about.  At what point do
> we chalk this up as a virtualization hole and sweep it under the rug?"
> -- I couldn't agree more.

...
 
> So, Sean, can you help converge our discussion and settle next step?

Any objection to simply keeping KVM's current behavior, i.e. sweeping this under
the proverbial rug?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-10-05 16:15                   ` Sean Christopherson
@ 2021-10-05 17:32                     ` Jim Mattson
  2021-10-05 17:59                       ` Sean Christopherson
  0 siblings, 1 reply; 37+ messages in thread
From: Jim Mattson @ 2021-10-05 17:32 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Robert Hoo, pbonzini, vkuznets, wanpengli, joro, kvm, yu.c.zhang

On Tue, Oct 5, 2021 at 9:16 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Sep 28, 2021, Robert Hoo wrote:
> > On Fri, 2021-09-03 at 15:11 +0000, Sean Christopherson wrote:
> >       You also said, "This is quite the complicated mess for
> > something I'm guessing no one actually cares about.  At what point do
> > we chalk this up as a virtualization hole and sweep it under the rug?"
> > -- I couldn't agree more.
>
> ...
>
> > So, Sean, can you help converge our discussion and settle next step?
>
> Any objection to simply keeping KVM's current behavior, i.e. sweeping this under
> the proverbial rug?

Adding 8 KiB per vCPU seems like no big deal to me, but, on the other
hand, Paolo recently argued that slightly less than 1 KiB per vCPU was
unreasonable for VM-exit statistics, so maybe I've got a warped
perspective. I'm all for pedantic adherence to the specification, but
I have to admit that no actual hypervisor is likely to care (or ever
will).

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-10-05 17:32                     ` Jim Mattson
@ 2021-10-05 17:59                       ` Sean Christopherson
  2021-10-05 20:42                         ` Jim Mattson
  0 siblings, 1 reply; 37+ messages in thread
From: Sean Christopherson @ 2021-10-05 17:59 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Robert Hoo, pbonzini, vkuznets, wanpengli, joro, kvm, yu.c.zhang

On Tue, Oct 05, 2021, Jim Mattson wrote:
> On Tue, Oct 5, 2021 at 9:16 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Tue, Sep 28, 2021, Robert Hoo wrote:
> > > On Fri, 2021-09-03 at 15:11 +0000, Sean Christopherson wrote:
> > >       You also said, "This is quite the complicated mess for
> > > something I'm guessing no one actually cares about.  At what point do
> > > we chalk this up as a virtualization hole and sweep it under the rug?"
> > > -- I couldn't agree more.
> >
> > ...
> >
> > > So, Sean, can you help converge our discussion and settle next step?
> >
> > Any objection to simply keeping KVM's current behavior, i.e. sweeping this under
> > the proverbial rug?
> 
> Adding 8 KiB per vCPU seems like no big deal to me, but, on the other
> hand, Paolo recently argued that slightly less than 1 KiB per vCPU was
> unreasonable for VM-exit statistics, so maybe I've got a warped
> perspective. I'm all for pedantic adherence to the specification, but
> I have to admit that no actual hypervisor is likely to care (or ever
> will).

It's not just the memory, it's also the complexity, e.g. to get VMCS shadowing
working correctly, both now and in the future.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-10-05 17:59                       ` Sean Christopherson
@ 2021-10-05 20:42                         ` Jim Mattson
  2021-10-05 20:50                           ` Sean Christopherson
  0 siblings, 1 reply; 37+ messages in thread
From: Jim Mattson @ 2021-10-05 20:42 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Robert Hoo, pbonzini, vkuznets, wanpengli, joro, kvm, yu.c.zhang

On Tue, Oct 5, 2021 at 10:59 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Oct 05, 2021, Jim Mattson wrote:
> > On Tue, Oct 5, 2021 at 9:16 AM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Tue, Sep 28, 2021, Robert Hoo wrote:
> > > > On Fri, 2021-09-03 at 15:11 +0000, Sean Christopherson wrote:
> > > >       You also said, "This is quite the complicated mess for
> > > > something I'm guessing no one actually cares about.  At what point do
> > > > we chalk this up as a virtualization hole and sweep it under the rug?"
> > > > -- I couldn't agree more.
> > >
> > > ...
> > >
> > > > So, Sean, can you help converge our discussion and settle next step?
> > >
> > > Any objection to simply keeping KVM's current behavior, i.e. sweeping this under
> > > the proverbial rug?
> >
> > Adding 8 KiB per vCPU seems like no big deal to me, but, on the other
> > hand, Paolo recently argued that slightly less than 1 KiB per vCPU was
> > unreasonable for VM-exit statistics, so maybe I've got a warped
> > perspective. I'm all for pedantic adherence to the specification, but
> > I have to admit that no actual hypervisor is likely to care (or ever
> > will).
>
> It's not just the memory, it's also the complexity, e.g. to get VMCS shadowing
> working correctly, both now and in the future.

As far as CPU feature virtualization goes, this one doesn't seem that
complex to me. It's not anywhere near as complex as virtualizing MTF,
for instance, and KVM *claims* to do that! :-)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-10-05 20:42                         ` Jim Mattson
@ 2021-10-05 20:50                           ` Sean Christopherson
  2021-10-05 22:40                             ` Jim Mattson
  0 siblings, 1 reply; 37+ messages in thread
From: Sean Christopherson @ 2021-10-05 20:50 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Robert Hoo, pbonzini, vkuznets, wanpengli, joro, kvm, yu.c.zhang

On Tue, Oct 05, 2021, Jim Mattson wrote:
> On Tue, Oct 5, 2021 at 10:59 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Tue, Oct 05, 2021, Jim Mattson wrote:
> > > On Tue, Oct 5, 2021 at 9:16 AM Sean Christopherson <seanjc@google.com> wrote:
> > > >
> > > > On Tue, Sep 28, 2021, Robert Hoo wrote:
> > > > > On Fri, 2021-09-03 at 15:11 +0000, Sean Christopherson wrote:
> > > > >       You also said, "This is quite the complicated mess for
> > > > > something I'm guessing no one actually cares about.  At what point do
> > > > > we chalk this up as a virtualization hole and sweep it under the rug?"
> > > > > -- I couldn't agree more.
> > > >
> > > > ...
> > > >
> > > > > So, Sean, can you help converge our discussion and settle next step?
> > > >
> > > > Any objection to simply keeping KVM's current behavior, i.e. sweeping this under
> > > > the proverbial rug?
> > >
> > > Adding 8 KiB per vCPU seems like no big deal to me, but, on the other
> > > hand, Paolo recently argued that slightly less than 1 KiB per vCPU was
> > > unreasonable for VM-exit statistics, so maybe I've got a warped
> > > perspective. I'm all for pedantic adherence to the specification, but
> > > I have to admit that no actual hypervisor is likely to care (or ever
> > > will).
> >
> > It's not just the memory, it's also the complexity, e.g. to get VMCS shadowing
> > working correctly, both now and in the future.
> 
> As far as CPU feature virtualization goes, this one doesn't seem that
> complex to me. It's not anywhere near as complex as virtualizing MTF,
> for instance, and KVM *claims* to do that! :-)

There aren't many things as complex as MTF.  But unlike MTF, this behavior doesn't
have a concrete use case to justify the risk vs. reward.  IMO the odds of us breaking
something in KVM for "normal" use cases are higher than the odds of an L1 VMM breaking
because a VMREAD/VMWRITE didn't fail when it technically should have failed.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-10-05 20:50                           ` Sean Christopherson
@ 2021-10-05 22:40                             ` Jim Mattson
  2021-10-05 23:22                               ` Sean Christopherson
  0 siblings, 1 reply; 37+ messages in thread
From: Jim Mattson @ 2021-10-05 22:40 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Robert Hoo, pbonzini, vkuznets, wanpengli, joro, kvm, yu.c.zhang

On Tue, Oct 5, 2021 at 1:50 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Oct 05, 2021, Jim Mattson wrote:
> > On Tue, Oct 5, 2021 at 10:59 AM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Tue, Oct 05, 2021, Jim Mattson wrote:
> > > > On Tue, Oct 5, 2021 at 9:16 AM Sean Christopherson <seanjc@google.com> wrote:
> > > > >
> > > > > On Tue, Sep 28, 2021, Robert Hoo wrote:
> > > > > > On Fri, 2021-09-03 at 15:11 +0000, Sean Christopherson wrote:
> > > > > >       You also said, "This is quite the complicated mess for
> > > > > > something I'm guessing no one actually cares about.  At what point do
> > > > > > we chalk this up as a virtualization hole and sweep it under the rug?"
> > > > > > -- I couldn't agree more.
> > > > >
> > > > > ...
> > > > >
> > > > > > So, Sean, can you help converge our discussion and settle next step?
> > > > >
> > > > > Any objection to simply keeping KVM's current behavior, i.e. sweeping this under
> > > > > the proverbial rug?
> > > >
> > > > Adding 8 KiB per vCPU seems like no big deal to me, but, on the other
> > > > hand, Paolo recently argued that slightly less than 1 KiB per vCPU was
> > > > unreasonable for VM-exit statistics, so maybe I've got a warped
> > > > perspective. I'm all for pedantic adherence to the specification, but
> > > > I have to admit that no actual hypervisor is likely to care (or ever
> > > > will).
> > >
> > > It's not just the memory, it's also the complexity, e.g. to get VMCS shadowing
> > > working correctly, both now and in the future.
> >
> > As far as CPU feature virtualization goes, this one doesn't seem that
> > complex to me. It's not anywhere near as complex as virtualizing MTF,
> > for instance, and KVM *claims* to do that! :-)
>
> There aren't many things as complex as MTF.  But unlike MTF, this behavior doesn't
> have a concrete use case to justify the risk vs. reward.  IMO the odds of us breaking
> something in KVM for "normal" use cases are higher than the odds of an L1 VMM breaking
> because a VMREAD/VMWRITE didn't fail when it technically should have failed.

Playing devil's advocate here, because I totally agree with you...

Who's to say what's "normal"? It's a slippery slope when we start
making personal value judgments about which parts of the architectural
specification are important and which aren't.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-10-05 22:40                             ` Jim Mattson
@ 2021-10-05 23:22                               ` Sean Christopherson
  2021-10-08  8:23                                 ` Yu Zhang
  0 siblings, 1 reply; 37+ messages in thread
From: Sean Christopherson @ 2021-10-05 23:22 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Robert Hoo, pbonzini, vkuznets, wanpengli, joro, kvm, yu.c.zhang

On Tue, Oct 05, 2021, Jim Mattson wrote:
> On Tue, Oct 5, 2021 at 1:50 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Tue, Oct 05, 2021, Jim Mattson wrote:
> > > On Tue, Oct 5, 2021 at 10:59 AM Sean Christopherson <seanjc@google.com> wrote:
> > > >
> > > > On Tue, Oct 05, 2021, Jim Mattson wrote:
> > > > > On Tue, Oct 5, 2021 at 9:16 AM Sean Christopherson <seanjc@google.com> wrote:
> > > > > >
> > > > > > On Tue, Sep 28, 2021, Robert Hoo wrote:
> > > > > > > On Fri, 2021-09-03 at 15:11 +0000, Sean Christopherson wrote:
> > > > > > >       You also said, "This is quite the complicated mess for
> > > > > > > something I'm guessing no one actually cares about.  At what point do
> > > > > > > we chalk this up as a virtualization hole and sweep it under the rug?"
> > > > > > > -- I couldn't agree more.
> > > > > >
> > > > > > ...
> > > > > >
> > > > > > > So, Sean, can you help converge our discussion and settle next step?
> > > > > >
> > > > > > Any objection to simply keeping KVM's current behavior, i.e. sweeping this under
> > > > > > the proverbial rug?
> > > > >
> > > > > Adding 8 KiB per vCPU seems like no big deal to me, but, on the other
> > > > > hand, Paolo recently argued that slightly less than 1 KiB per vCPU was
> > > > > unreasonable for VM-exit statistics, so maybe I've got a warped
> > > > > perspective. I'm all for pedantic adherence to the specification, but
> > > > > I have to admit that no actual hypervisor is likely to care (or ever
> > > > > will).
> > > >
> > > > It's not just the memory, it's also the complexity, e.g. to get VMCS shadowing
> > > > working correctly, both now and in the future.
> > >
> > > As far as CPU feature virtualization goes, this one doesn't seem that
> > > complex to me. It's not anywhere near as complex as virtualizing MTF,
> > > for instance, and KVM *claims* to do that! :-)
> >
> > There aren't many things as complex as MTF.  But unlike MTF, this behavior doesn't
> > have a concrete use case to justify the risk vs. reward.  IMO the odds of us breaking
> > something in KVM for "normal" use cases are higher than the odds of an L1 VMM breaking
> > because a VMREAD/VMWRITE didn't fail when it technically should have failed.
> 
> Playing devil's advocate here, because I totally agree with you...
> 
> Who's to say what's "normal"? It's a slippery slope when we start
> making personal value judgments about which parts of the architectural
> specification are important and which aren't.

I agree, but in a very similar case Intel chose to take an erratum instead of
fixing what was in all likelihood a microcode bug, i.e. could have been patched
in the field.  So it's not _just_ personal value judgment, though it's definitely
that too :-)

I'm not saying I'd actively oppose support for strict VMREAD/VMWRITE adherence
to the vCPU model, but I'm also not going to advise anyone to go spend their time
implementing a non-trivial fix for behavior that, AFAIK, doesn't adversely affect
any real world use cases.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-10-05 23:22                               ` Sean Christopherson
@ 2021-10-08  8:23                                 ` Yu Zhang
  2021-10-08 15:09                                   ` Robert Hoo
  0 siblings, 1 reply; 37+ messages in thread
From: Yu Zhang @ 2021-10-08  8:23 UTC (permalink / raw)
  To: Sean Christopherson, Jim Mattson
  Cc: Robert Hoo, pbonzini, vkuznets, wanpengli, joro, kvm

On Tue, Oct 05, 2021 at 11:22:15PM +0000, Sean Christopherson wrote:
> On Tue, Oct 05, 2021, Jim Mattson wrote:
> > On Tue, Oct 5, 2021 at 1:50 PM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Tue, Oct 05, 2021, Jim Mattson wrote:
> > > > On Tue, Oct 5, 2021 at 10:59 AM Sean Christopherson <seanjc@google.com> wrote:
> > > > >
> > > > > On Tue, Oct 05, 2021, Jim Mattson wrote:
> > > > > > On Tue, Oct 5, 2021 at 9:16 AM Sean Christopherson <seanjc@google.com> wrote:
> > > > > > >
> > > > > > > On Tue, Sep 28, 2021, Robert Hoo wrote:
> > > > > > > > On Fri, 2021-09-03 at 15:11 +0000, Sean Christopherson wrote:
> > > > > > > >       You also said, "This is quite the complicated mess for
> > > > > > > > something I'm guessing no one actually cares about.  At what point do
> > > > > > > > we chalk this up as a virtualization hole and sweep it under the rug?"
> > > > > > > > -- I couldn't agree more.
> > > > > > >
> > > > > > > ...
> > > > > > >
> > > > > > > > So, Sean, can you help converge our discussion and settle next step?
> > > > > > >
> > > > > > > Any objection to simply keeping KVM's current behavior, i.e. sweeping this under
> > > > > > > the proverbial rug?
> > > > > >
> > > > > > Adding 8 KiB per vCPU seems like no big deal to me, but, on the other
> > > > > > hand, Paolo recently argued that slightly less than 1 KiB per vCPU was
> > > > > > unreasonable for VM-exit statistics, so maybe I've got a warped
> > > > > > perspective. I'm all for pedantic adherence to the specification, but
> > > > > > I have to admit that no actual hypervisor is likely to care (or ever
> > > > > > will).
> > > > >
> > > > > It's not just the memory, it's also the complexity, e.g. to get VMCS shadowing
> > > > > working correctly, both now and in the future.
> > > >
> > > > As far as CPU feature virtualization goes, this one doesn't seem that
> > > > complex to me. It's not anywhere near as complex as virtualizing MTF,
> > > > for instance, and KVM *claims* to do that! :-)
> > >
> > > There aren't many things as complex as MTF.  But unlike MTF, this behavior doesn't
> > > have a concrete use case to justify the risk vs. reward.  IMO the odds of us breaking
> > > something in KVM for "normal" use cases are higher than the odds of an L1 VMM breaking
> > > because a VMREAD/VMWRITE didn't fail when it technically should have failed.
> > 
> > Playing devil's advocate here, because I totally agree with you...
> > 
> > Who's to say what's "normal"? It's a slippery slope when we start
> > making personal value judgments about which parts of the architectural
> > specification are important and which aren't.
> 
> I agree, but in a very similar case Intel chose to take an erratum instead of
> fixing what was in all likelihood a microcode bug, i.e. could have been patched
> in the field.  So it's not _just_ personal value judgment, though it's definitely
> that too :-)
> 
> I'm not saying I'd actively oppose support for strict VMREAD/VMWRITE adherence
> to the vCPU model, but I'm also not going to advise anyone to go spend their time
> implementing a non-trivial fix for behavior that, AFAIK, doesn't adversely affect
> any real world use cases.
> 

Thank you all for the discussion, Sean & Jim.

Could we draw a conclusion to just keep KVM as it is now? If yes, how about we
deprecate the check against the max index value from MSR_IA32_VMX_VMCS_ENUM in
x86/vmx.c of kvm-unit-tests?

After all, we have not witnessed any real system doing so.

E.g.,

diff --git a/x86/vmx.c b/x86/vmx.c
index f0b853a..63623e5 100644
--- a/x86/vmx.c
+++ b/x86/vmx.c
@@ -380,8 +380,7 @@ static void test_vmwrite_vmread(void)
        vmcs_enum_max = (rdmsr(MSR_IA32_VMX_VMCS_ENUM) & VMCS_FIELD_INDEX_MASK)
                        >> VMCS_FIELD_INDEX_SHIFT;
        max_index = find_vmcs_max_index();
-       report(vmcs_enum_max == max_index,
-              "VMX_VMCS_ENUM.MAX_INDEX expected: %x, actual: %x",
+       printf("VMX_VMCS_ENUM.MAX_INDEX expected: %x, actual: %x",
               max_index, vmcs_enum_max);

        assert(!vmcs_clear(vmcs));

B.R.
Yu

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-10-08  8:23                                 ` Yu Zhang
@ 2021-10-08 15:09                                   ` Robert Hoo
  2021-10-08 23:49                                     ` Jim Mattson
  0 siblings, 1 reply; 37+ messages in thread
From: Robert Hoo @ 2021-10-08 15:09 UTC (permalink / raw)
  To: Yu Zhang, Sean Christopherson, Jim Mattson
  Cc: pbonzini, vkuznets, wanpengli, joro, kvm

On Fri, 2021-10-08 at 16:23 +0800, Yu Zhang wrote:
> On Tue, Oct 05, 2021 at 11:22:15PM +0000, Sean Christopherson wrote:
> > On Tue, Oct 05, 2021, Jim Mattson wrote:
> > > On Tue, Oct 5, 2021 at 1:50 PM Sean Christopherson <
> > > seanjc@google.com> wrote:
> > > > 
> > > > On Tue, Oct 05, 2021, Jim Mattson wrote:
> > > > > On Tue, Oct 5, 2021 at 10:59 AM Sean Christopherson <
> > > > > seanjc@google.com> wrote:
> > > > > > 
> > > > > > On Tue, Oct 05, 2021, Jim Mattson wrote:
> > > > > > > On Tue, Oct 5, 2021 at 9:16 AM Sean Christopherson <
> > > > > > > seanjc@google.com> wrote:
> > > > > > > > 
> > > > > > > > On Tue, Sep 28, 2021, Robert Hoo wrote:
> > > > > > > > > On Fri, 2021-09-03 at 15:11 +0000, Sean
> > > > > > > > > Christopherson wrote:
> > > > > > > > >       You also said, "This is quite the complicated
> > > > > > > > > mess for
> > > > > > > > > something I'm guessing no one actually cares
> > > > > > > > > about.  At what point do
> > > > > > > > > we chalk this up as a virtualization hole and sweep
> > > > > > > > > it under the rug?"
> > > > > > > > > -- I couldn't agree more.
> > > > > > > > 
> > > > > > > > ...
> > > > > > > > 
> > > > > > > > > So, Sean, can you help converge our discussion and
> > > > > > > > > settle next step?
> > > > > > > > 
> > > > > > > > Any objection to simply keeping KVM's current behavior,
> > > > > > > > i.e. sweeping this under
> > > > > > > > the proverbial rug?
> > > > > > > 
> > > > > > > Adding 8 KiB per vCPU seems like no big deal to me, but,
> > > > > > > on the other
> > > > > > > hand, Paolo recently argued that slightly less than 1 KiB
> > > > > > > per vCPU was
> > > > > > > unreasonable for VM-exit statistics, so maybe I've got a
> > > > > > > warped
> > > > > > > perspective. I'm all for pedantic adherence to the
> > > > > > > specification, but
> > > > > > > I have to admit that no actual hypervisor is likely to
> > > > > > > care (or ever
> > > > > > > will).
> > > > > > 
> > > > > > It's not just the memory, it's also the complexity, e.g. to
> > > > > > get VMCS shadowing
> > > > > > working correctly, both now and in the future.
> > > > > 
> > > > > As far as CPU feature virtualization goes, this one doesn't
> > > > > seem that
> > > > > complex to me. It's not anywhere near as complex as
> > > > > virtualizing MTF,
> > > > > for instance, and KVM *claims* to do that! :-)
> > > > 
> > > > There aren't many things as complex as MTF.  But unlike MTF,
> > > > this behavior doesn't
> > > > have a concrete use case to justify the risk vs. reward.  IMO
> > > > the odds of us breaking
> > > > something in KVM for "normal" use cases are higher than the
> > > > odds of an L1 VMM breaking
> > > > because a VMREAD/VMWRITE didn't fail when it technically should
> > > > have failed.
> > > 
> > > Playing devil's advocate here, because I totally agree with
> > > you...
> > > 
> > > Who's to say what's "normal"? It's a slippery slope when we start
> > > making personal value judgments about which parts of the
> > > architectural
> > > specification are important and which aren't.
> > 
> > I agree, but in a very similar case Intel chose to take an erratum
> > instead of
> > fixing what was in all likelihood a microcode bug, i.e. could have
> > been patched
> > in the field.  So it's not _just_ personal value judgment, though
> > it's definitely
> > that too :-)
> > 
> > I'm not saying I'd actively oppose support for strict
> > VMREAD/VMWRITE adherence
> > to the vCPU model, but I'm also not going to advise anyone to go
> > spend their time
> > implementing a non-trivial fix for behavior that, AFAIK, doesn't
> > adversely affect
> > any real world use cases.
> > 
> 
> Thank you all for the discussion, Sean & Jim.
> 
> Could we draw a conclusion to just keep KVM as it is now? If yes, how
> about we
> depricate the check against max index value from
> MSR_IA32_VMX_VMCS_ENUM in vmx.c 
> of the kvm-unit-test?
> 
> After all, we have not witnessed any real system doing so.
> 
> E.g.,
> 
> diff --git a/x86/vmx.c b/x86/vmx.c
> index f0b853a..63623e5 100644
> --- a/x86/vmx.c
> +++ b/x86/vmx.c
> @@ -380,8 +380,7 @@ static void test_vmwrite_vmread(void)
>         vmcs_enum_max = (rdmsr(MSR_IA32_VMX_VMCS_ENUM) &
> VMCS_FIELD_INDEX_MASK)
>                         >> VMCS_FIELD_INDEX_SHIFT;
>         max_index = find_vmcs_max_index();
> -       report(vmcs_enum_max == max_index,
> -              "VMX_VMCS_ENUM.MAX_INDEX expected: %x, actual: %x",
> +       printf("VMX_VMCS_ENUM.MAX_INDEX expected: %x, actual: %x",
>                max_index, vmcs_enum_max);
> 
>         assert(!vmcs_clear(vmcs));
> 
> B.R.
> Yu

I think this patch series still has value in fixing the forced
hard-coding of VMX_VMCS_ENUM.
My understanding of Sean's "simply keeping KVM's current behavior, i.e.
sweeping this under the proverbial rug" is that VMCS shadowing will fail
some VMCS field validation. Of course, this in turn will (theoretically)
fail some cases of this KVM unit test, though we haven't hit that yet.



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-10-08 15:09                                   ` Robert Hoo
@ 2021-10-08 23:49                                     ` Jim Mattson
  2021-10-09  0:05                                       ` Robert Hoo
  0 siblings, 1 reply; 37+ messages in thread
From: Jim Mattson @ 2021-10-08 23:49 UTC (permalink / raw)
  To: Robert Hoo
  Cc: Yu Zhang, Sean Christopherson, pbonzini, vkuznets, wanpengli, joro, kvm

We have some internal patches for virtualizing VMCS shadowing which
may break if there is a guest VMCS field with index greater than
VMX_VMCS_ENUM.MAX_INDEX. I plan to upstream them soon.

On Fri, Oct 8, 2021 at 8:09 AM Robert Hoo <robert.hu@linux.intel.com> wrote:
>
> On Fri, 2021-10-08 at 16:23 +0800, Yu Zhang wrote:
> > On Tue, Oct 05, 2021 at 11:22:15PM +0000, Sean Christopherson wrote:
> > > On Tue, Oct 05, 2021, Jim Mattson wrote:
> > > > On Tue, Oct 5, 2021 at 1:50 PM Sean Christopherson <
> > > > seanjc@google.com> wrote:
> > > > >
> > > > > On Tue, Oct 05, 2021, Jim Mattson wrote:
> > > > > > On Tue, Oct 5, 2021 at 10:59 AM Sean Christopherson <
> > > > > > seanjc@google.com> wrote:
> > > > > > >
> > > > > > > On Tue, Oct 05, 2021, Jim Mattson wrote:
> > > > > > > > On Tue, Oct 5, 2021 at 9:16 AM Sean Christopherson <
> > > > > > > > seanjc@google.com> wrote:
> > > > > > > > >
> > > > > > > > > On Tue, Sep 28, 2021, Robert Hoo wrote:
> > > > > > > > > > On Fri, 2021-09-03 at 15:11 +0000, Sean
> > > > > > > > > > Christopherson wrote:
> > > > > > > > > >       You also said, "This is quite the complicated
> > > > > > > > > > mess for
> > > > > > > > > > something I'm guessing no one actually cares
> > > > > > > > > > about.  At what point do
> > > > > > > > > > we chalk this up as a virtualization hole and sweep
> > > > > > > > > > it under the rug?"
> > > > > > > > > > -- I couldn't agree more.
> > > > > > > > >
> > > > > > > > > ...
> > > > > > > > >
> > > > > > > > > > So, Sean, can you help converge our discussion and
> > > > > > > > > > settle next step?
> > > > > > > > >
> > > > > > > > > Any objection to simply keeping KVM's current behavior,
> > > > > > > > > i.e. sweeping this under
> > > > > > > > > the proverbial rug?
> > > > > > > >
> > > > > > > > Adding 8 KiB per vCPU seems like no big deal to me, but,
> > > > > > > > on the other
> > > > > > > > hand, Paolo recently argued that slightly less than 1 KiB
> > > > > > > > per vCPU was
> > > > > > > > unreasonable for VM-exit statistics, so maybe I've got a
> > > > > > > > warped
> > > > > > > > perspective. I'm all for pedantic adherence to the
> > > > > > > > specification, but
> > > > > > > > I have to admit that no actual hypervisor is likely to
> > > > > > > > care (or ever
> > > > > > > > will).
> > > > > > >
> > > > > > > It's not just the memory, it's also the complexity, e.g. to
> > > > > > > get VMCS shadowing
> > > > > > > working correctly, both now and in the future.
> > > > > >
> > > > > > As far as CPU feature virtualization goes, this one doesn't
> > > > > > seem that
> > > > > > complex to me. It's not anywhere near as complex as
> > > > > > virtualizing MTF,
> > > > > > for instance, and KVM *claims* to do that! :-)
> > > > >
> > > > > There aren't many things as complex as MTF.  But unlike MTF,
> > > > > this behavior doesn't
> > > > > have a concrete use case to justify the risk vs. reward.  IMO
> > > > > the odds of us breaking
> > > > > something in KVM for "normal" use cases are higher than the
> > > > > odds of an L1 VMM breaking
> > > > > because a VMREAD/VMWRITE didn't fail when it technically should
> > > > > have failed.
> > > >
> > > > Playing devil's advocate here, because I totally agree with
> > > > you...
> > > >
> > > > Who's to say what's "normal"? It's a slippery slope when we start
> > > > making personal value judgments about which parts of the
> > > > architectural
> > > > specification are important and which aren't.
> > >
> > > I agree, but in a very similar case Intel chose to take an erratum
> > > instead of
> > > fixing what was in all likelihood a microcode bug, i.e. could have
> > > been patched
> > > in the field.  So it's not _just_ personal value judgment, though
> > > it's definitely
> > > that too :-)
> > >
> > > I'm not saying I'd actively oppose support for strict
> > > VMREAD/VMWRITE adherence
> > > to the vCPU model, but I'm also not going to advise anyone to go
> > > spend their time
> > > implementing a non-trivial fix for behavior that, AFAIK, doesn't
> > > adversely affect
> > > any real world use cases.
> > >
> >
> > Thank you all for the discussion, Sean & Jim.
> >
> > Could we draw a conclusion to just keep KVM as it is now? If yes, how
> > about we
> > depricate the check against max index value from
> > MSR_IA32_VMX_VMCS_ENUM in vmx.c
> > of the kvm-unit-test?
> >
> > After all, we have not witnessed any real system doing so.
> >
> > E.g.,
> >
> > diff --git a/x86/vmx.c b/x86/vmx.c
> > index f0b853a..63623e5 100644
> > --- a/x86/vmx.c
> > +++ b/x86/vmx.c
> > @@ -380,8 +380,7 @@ static void test_vmwrite_vmread(void)
> >         vmcs_enum_max = (rdmsr(MSR_IA32_VMX_VMCS_ENUM) &
> > VMCS_FIELD_INDEX_MASK)
> >                         >> VMCS_FIELD_INDEX_SHIFT;
> >         max_index = find_vmcs_max_index();
> > -       report(vmcs_enum_max == max_index,
> > -              "VMX_VMCS_ENUM.MAX_INDEX expected: %x, actual: %x",
> > +       printf("VMX_VMCS_ENUM.MAX_INDEX expected: %x, actual: %x",
> >                max_index, vmcs_enum_max);
> >
> >         assert(!vmcs_clear(vmcs));
> >
> > B.R.
> > Yu
>
> I think this patch series has its value of fixing the be-forced hard-
> code VMX_VMCS_ENUM.
> My understanding of Sean's "simply keeping KVM's current behavior, i.e.
> sweeping this under the proverbial rug", is about vmcs shadowing will
> fail some VMCS field validation. Of course, this in turn will fail some
> case of this KVM unit test case (theoretically), though we haven't met
> yet.
>
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-10-08 23:49                                     ` Jim Mattson
@ 2021-10-09  0:05                                       ` Robert Hoo
  2021-10-29 19:53                                         ` Jim Mattson
  0 siblings, 1 reply; 37+ messages in thread
From: Robert Hoo @ 2021-10-09  0:05 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Yu Zhang, Sean Christopherson, pbonzini, vkuznets, wanpengli, joro, kvm

On Fri, 2021-10-08 at 16:49 -0700, Jim Mattson wrote:
> We have some internal patches for virtualizing VMCS shadowing which
> may break if there is a guest VMCS field with index greater than
> VMX_VMCS_ENUM.MAX_INDEX. I plan to upstream them soon.

OK, thanks for letting us know.:-)
> 
> On Fri, Oct 8, 2021 at 8:09 AM Robert Hoo <robert.hu@linux.intel.com>
> wrote:
> > 
> > On Fri, 2021-10-08 at 16:23 +0800, Yu Zhang wrote:
> > > On Tue, Oct 05, 2021 at 11:22:15PM +0000, Sean Christopherson
> > > wrote:
> > > > On Tue, Oct 05, 2021, Jim Mattson wrote:
> > > > > On Tue, Oct 5, 2021 at 1:50 PM Sean Christopherson <
> > > > > seanjc@google.com> wrote:
> > > > > > 
> > > > > > On Tue, Oct 05, 2021, Jim Mattson wrote:
> > > > > > > On Tue, Oct 5, 2021 at 10:59 AM Sean Christopherson <
> > > > > > > seanjc@google.com> wrote:
> > > > > > > > 
> > > > > > > > On Tue, Oct 05, 2021, Jim Mattson wrote:
> > > > > > > > > On Tue, Oct 5, 2021 at 9:16 AM Sean Christopherson <
> > > > > > > > > seanjc@google.com> wrote:
> > > > > > > > > > 
> > > > > > > > > > On Tue, Sep 28, 2021, Robert Hoo wrote:
> > > > > > > > > > > On Fri, 2021-09-03 at 15:11 +0000, Sean
> > > > > > > > > > > Christopherson wrote:
> > > > > > > > > > >       You also said, "This is quite the
> > > > > > > > > > > complicated
> > > > > > > > > > > mess for
> > > > > > > > > > > something I'm guessing no one actually cares
> > > > > > > > > > > about.  At what point do
> > > > > > > > > > > we chalk this up as a virtualization hole and
> > > > > > > > > > > sweep
> > > > > > > > > > > it under the rug?"
> > > > > > > > > > > -- I couldn't agree more.
> > > > > > > > > > 
> > > > > > > > > > ...
> > > > > > > > > > 
> > > > > > > > > > > So, Sean, can you help converge our discussion
> > > > > > > > > > > and
> > > > > > > > > > > settle next step?
> > > > > > > > > > 
> > > > > > > > > > Any objection to simply keeping KVM's current
> > > > > > > > > > behavior,
> > > > > > > > > > i.e. sweeping this under
> > > > > > > > > > the proverbial rug?
> > > > > > > > > 
> > > > > > > > > Adding 8 KiB per vCPU seems like no big deal to me,
> > > > > > > > > but,
> > > > > > > > > on the other
> > > > > > > > > hand, Paolo recently argued that slightly less than 1
> > > > > > > > > KiB
> > > > > > > > > per vCPU was
> > > > > > > > > unreasonable for VM-exit statistics, so maybe I've
> > > > > > > > > got a
> > > > > > > > > warped
> > > > > > > > > perspective. I'm all for pedantic adherence to the
> > > > > > > > > specification, but
> > > > > > > > > I have to admit that no actual hypervisor is likely
> > > > > > > > > to
> > > > > > > > > care (or ever
> > > > > > > > > will).
> > > > > > > > 
> > > > > > > > It's not just the memory, it's also the complexity,
> > > > > > > > e.g. to
> > > > > > > > get VMCS shadowing
> > > > > > > > working correctly, both now and in the future.
> > > > > > > 
> > > > > > > As far as CPU feature virtualization goes, this one
> > > > > > > doesn't
> > > > > > > seem that
> > > > > > > complex to me. It's not anywhere near as complex as
> > > > > > > virtualizing MTF,
> > > > > > > for instance, and KVM *claims* to do that! :-)
> > > > > > 
> > > > > > There aren't many things as complex as MTF.  But unlike
> > > > > > MTF,
> > > > > > this behavior doesn't
> > > > > > have a concrete use case to justify the risk vs.
> > > > > > reward.  IMO
> > > > > > the odds of us breaking
> > > > > > something in KVM for "normal" use cases are higher than the
> > > > > > odds of an L1 VMM breaking
> > > > > > because a VMREAD/VMWRITE didn't fail when it technically
> > > > > > should
> > > > > > have failed.
> > > > > 
> > > > > Playing devil's advocate here, because I totally agree with
> > > > > you...
> > > > > 
> > > > > Who's to say what's "normal"? It's a slippery slope when we
> > > > > start
> > > > > making personal value judgments about which parts of the
> > > > > architectural
> > > > > specification are important and which aren't.
> > > > 
> > > > I agree, but in a very similar case Intel chose to take an
> > > > erratum
> > > > instead of
> > > > fixing what was in all likelihood a microcode bug, i.e. could
> > > > have
> > > > been patched
> > > > in the field.  So it's not _just_ personal value judgment,
> > > > though
> > > > it's definitely
> > > > that too :-)
> > > > 
> > > > I'm not saying I'd actively oppose support for strict
> > > > VMREAD/VMWRITE adherence
> > > > to the vCPU model, but I'm also not going to advise anyone to
> > > > go
> > > > spend their time
> > > > implementing a non-trivial fix for behavior that, AFAIK,
> > > > doesn't
> > > > adversely affect
> > > > any real world use cases.
> > > > 
> > > 
> > > Thank you all for the discussion, Sean & Jim.
> > > 
> > > Could we draw a conclusion to just keep KVM as it is now? If yes,
> > > how
> > > about we
> > > depricate the check against max index value from
> > > MSR_IA32_VMX_VMCS_ENUM in vmx.c
> > > of the kvm-unit-test?
> > > 
> > > After all, we have not witnessed any real system doing so.
> > > 
> > > E.g.,
> > > 
> > > diff --git a/x86/vmx.c b/x86/vmx.c
> > > index f0b853a..63623e5 100644
> > > --- a/x86/vmx.c
> > > +++ b/x86/vmx.c
> > > @@ -380,8 +380,7 @@ static void test_vmwrite_vmread(void)
> > >         vmcs_enum_max = (rdmsr(MSR_IA32_VMX_VMCS_ENUM) &
> > > VMCS_FIELD_INDEX_MASK)
> > >                         >> VMCS_FIELD_INDEX_SHIFT;
> > >         max_index = find_vmcs_max_index();
> > > -       report(vmcs_enum_max == max_index,
> > > -              "VMX_VMCS_ENUM.MAX_INDEX expected: %x, actual:
> > > %x",
> > > +       printf("VMX_VMCS_ENUM.MAX_INDEX expected: %x, actual:
> > > %x",
> > >                max_index, vmcs_enum_max);
> > > 
> > >         assert(!vmcs_clear(vmcs));
> > > 
> > > B.R.
> > > Yu
> > 
> > I think this patch series has its value of fixing the be-forced
> > hard-
> > code VMX_VMCS_ENUM.
> > My understanding of Sean's "simply keeping KVM's current behavior,
> > i.e.
> > sweeping this under the proverbial rug", is about vmcs shadowing
> > will
> > fail some VMCS field validation. Of course, this in turn will fail
> > some
> > case of this KVM unit test case (theoretically), though we haven't
> > met
> > yet.
> > 
> > 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 1/5] KVM: x86: nVMX: Add vmcs12 field existence bitmap in nested_vmx
  2021-08-17  9:31 ` [PATCH v1 1/5] KVM: x86: nVMX: Add vmcs12 field existence bitmap in nested_vmx Robert Hoo
@ 2021-10-20 15:10   ` Paolo Bonzini
  2021-10-21 12:41     ` Robert Hoo
  0 siblings, 1 reply; 37+ messages in thread
From: Paolo Bonzini @ 2021-10-20 15:10 UTC (permalink / raw)
  To: Robert Hoo, seanjc, vkuznets, wanpengli, jmattson, joro; +Cc: kvm, yu.c.zhang

On 17/08/21 11:31, Robert Hoo wrote:
> +#define FIELD_BIT_SET(name, bitmap) set_bit(f_pos(name), bitmap)
> +#define FIELD64_BIT_SET(name, bitmap)	\
> +	do {set_bit(f_pos(name), bitmap);	\
> +	    set_bit(f_pos(name) + (sizeof(u32) / sizeof(u16)), bitmap);\
> +	} while (0)
> +
> +#define FIELD_BIT_CLEAR(name, bitmap) clear_bit(f_pos(name), bitmap)
> +#define FIELD64_BIT_CLEAR(name, bitmap)	\
> +	do {clear_bit(f_pos(name), bitmap);	\
> +	    clear_bit(f_pos(name) + (sizeof(u32) / sizeof(u16)), bitmap);\
> +	} while (0)
> +
> +#define FIELD_BIT_CHANGE(name, bitmap) change_bit(f_pos(name), bitmap)
> +#define FIELD64_BIT_CHANGE(name, bitmap)	\
> +	do {change_bit(f_pos(name), bitmap);	\
> +	    change_bit(f_pos(name) + (sizeof(u32) / sizeof(u16)), bitmap);\
> +	} while (0)
> +
> +/*

Hi Robert,

I'd rather not have FIELD_BIT_CHANGE, and instead have something like

#define FIELD_BIT_ASSIGN(name, bitmap, value) \
	if (value) \
		FIELD_BIT_SET(name, bitmap); \
	else \
		FIELD_BIT_CLEAR(name, bitmap)

Also, these set_bit/clear_bit can use the non-atomic variants __set_bit 
and __clear_bit, because the bitmaps are protected by the vCPU mutex.

> +		FIELD64_BIT_CHANGE(posted_intr_desc_addr, bitmap);

Many of the fields you mark as 64-bit are actually natural sized.

> +	if ((old_val ^ new_val) &
> +	    CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) {
> +		FIELD_BIT_CHANGE(secondary_vm_exec_control, bitmap);
> +	}
> +}

If secondary controls are not available, you should treat the 
corresponding MSR as if it were all zeroes.  Likewise if VMFUNC is disabled.
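
I.e., something along these lines (sketch only; `cpu_based_ctls` is the
new primary-controls value being set, `old_effective` the previously
computed one, and the update helper name approximates this series'
naming):

	/*
	 * Compute the effective secondary controls, treating the MSR as
	 * all zeroes when the gating primary control is clear, then run
	 * the existence update against the effective value.
	 */
	u32 effective = 0;

	if (cpu_based_ctls & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS)
		effective = vmx->nested.msrs.secondary_ctls_high;

	vmcs12_field_update_by_secondary_exec_ctrl(old_effective, effective,
						   bitmap);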

> +	if ((old_val ^ new_val) & SECONDARY_EXEC_PAUSE_LOOP_EXITING) {
> +		FIELD64_BIT_CHANGE(vmread_bitmap, bitmap);
> +		FIELD64_BIT_CHANGE(vmwrite_bitmap, bitmap);

This seems wrong.

Paolo


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 2/5] KVM: x86: nVMX: Update VMCS12 fields existence when nVMX MSRs are set
  2021-08-17  9:31 ` [PATCH v1 2/5] KVM: x86: nVMX: Update VMCS12 fields existence when nVMX MSRs are set Robert Hoo
@ 2021-10-20 15:11   ` Paolo Bonzini
  2021-10-21 13:08     ` Robert Hoo
  0 siblings, 1 reply; 37+ messages in thread
From: Paolo Bonzini @ 2021-10-20 15:11 UTC (permalink / raw)
  To: Robert Hoo, seanjc, vkuznets, wanpengli, jmattson, joro; +Cc: kvm, yu.c.zhang

On 17/08/21 11:31, Robert Hoo wrote:
> +		vmcs12_field_update_by_vmexit_ctrl(vmx->nested.msrs.entry_ctls_high,
> +				*highp, data >> 32,
> +				vmx->nested.vmcs12_field_existence_bitmap);
> +		break;
> +	case MSR_IA32_VMX_TRUE_ENTRY_CTLS:
> +		vmcs12_field_update_by_vmentry_ctrl(vmx->nested.msrs.exit_ctls_high,
> +				*highp, data >> 32,
> +				vmx->nested.vmcs12_field_existence_bitmap);

These two functions maybe could be merged into just one, since there are 
going to be duplicate checks.

Paolo


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 1/5] KVM: x86: nVMX: Add vmcs12 field existence bitmap in nested_vmx
  2021-10-20 15:10   ` Paolo Bonzini
@ 2021-10-21 12:41     ` Robert Hoo
  0 siblings, 0 replies; 37+ messages in thread
From: Robert Hoo @ 2021-10-21 12:41 UTC (permalink / raw)
  To: Paolo Bonzini, seanjc, vkuznets, wanpengli, jmattson, joro
  Cc: kvm, yu.c.zhang

On Wed, 2021-10-20 at 17:10 +0200, Paolo Bonzini wrote:
> > +#define FIELD_BIT_CHANGE(name, bitmap) change_bit(f_pos(name),
> > bitmap)
> > +#define FIELD64_BIT_CHANGE(name, bitmap)	\
> > +	do {change_bit(f_pos(name), bitmap);	\
> > +	    change_bit(f_pos(name) + (sizeof(u32) / sizeof(u16)),
> > bitmap);\
> > +	} while (0)
> > +
> > +/*
> 
> Hi Robert,
> 
> I'd rather not have FIELD_BIT_CHANGE, and instead have something like
> 
> #define FIELD_BIT_ASSIGN(name, bitmap, value) \
> 	if (value) \
> 		FIELD_BIT_SET(name, bitmap); \
> 	else
> 		FIELD_BIT_CLEAR(name, bitmap);
> 
Sure. I had also hesitated over whether we could assert that no other
routine accidentally/unknowingly flips those bits.

> Also, these set_bit/clear_bit can use the non-atomic variants
> __set_bit 
> and __clear_bit, because the bitmaps are protected by the vCPU mutex.

OK, thanks.

> 
> > +		FIELD64_BIT_CHANGE(posted_intr_desc_addr, bitmap);
> 
> Many of the fields you mark as 64-bit are actually natural sized.

Could you point me to some examples? I just compared against the
natural-width fields in struct vmcs12{}, and I don't find mistakes.

e.g. in the line above, "posted_intr_desc_addr" is indeed a 64-bit field
per SDM Vol.3 Appendix B.
 
> 
> > +	if ((old_val ^ new_val) &
> > +	    CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) {
> > +		FIELD_BIT_CHANGE(secondary_vm_exec_control, bitmap);
> > +	}
> > +}
> 
> If secondary controls are not available, you should treat the 
> corresponding MSR as if it was all zeroes.  Likewise if VMFUNC is
> disabled.

OK. So you mean, if the secondary_vm_exec_control bit is cleared, I
shall set vmx->nested.msrs.secondary_ctls_{low,high} = 0, right?

> 
> > +	if ((old_val ^ new_val) & SECONDARY_EXEC_PAUSE_LOOP_EXITING) {
> > +		FIELD64_BIT_CHANGE(vmread_bitmap, bitmap);
> > +		FIELD64_BIT_CHANGE(vmwrite_bitmap, bitmap);
> 
> This seems wrong.

You're right. These should be bit changes for PLE_{GAP,WINDOW}, but
since those fields aren't defined in vmcs12{} yet, I'm going to remove
these lines.
> 
> Paolo
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 2/5] KVM: x86: nVMX: Update VMCS12 fields existence when nVMX MSRs are set
  2021-10-20 15:11   ` Paolo Bonzini
@ 2021-10-21 13:08     ` Robert Hoo
  0 siblings, 0 replies; 37+ messages in thread
From: Robert Hoo @ 2021-10-21 13:08 UTC (permalink / raw)
  To: Paolo Bonzini, seanjc, vkuznets, wanpengli, jmattson, joro
  Cc: kvm, yu.c.zhang

On Wed, 2021-10-20 at 17:11 +0200, Paolo Bonzini wrote:
> On 17/08/21 11:31, Robert Hoo wrote:
> > +		vmcs12_field_update_by_vmexit_ctrl(vmx-
> > >nested.msrs.entry_ctls_high,
> > +				*highp, data >> 32,
> > +				vmx-
> > >nested.vmcs12_field_existence_bitmap);
> > +		break;
> > +	case MSR_IA32_VMX_TRUE_ENTRY_CTLS:
> > +		vmcs12_field_update_by_vmentry_ctrl(vmx-
> > >nested.msrs.exit_ctls_high,
> > +				*highp, data >> 32,
> > +				vmx-
> > >nested.vmcs12_field_existence_bitmap);
> 
> These two functions maybe could be merged into just one, since there
> are 
> going to be duplicate checks.

Can I keep them? I think this is trivial, and keeping them separate
looks clearer from a logical perspective. :-)

A summary question:
should I send a v2? I'm not sure about Sean's and Jim's decision on
whether to implement the interaction with shadow VMCS (which would
consume 8KiB more memory for each vmx).
Also, Jim mentioned they have some patches virtualizing shadow VMCS
that are going to be sent to the community. Should I wait for their
patches?
> 
> Paolo
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-10-09  0:05                                       ` Robert Hoo
@ 2021-10-29 19:53                                         ` Jim Mattson
  2021-11-03  1:31                                           ` Robert Hoo
  0 siblings, 1 reply; 37+ messages in thread
From: Jim Mattson @ 2021-10-29 19:53 UTC (permalink / raw)
  To: Robert Hoo
  Cc: Yu Zhang, Sean Christopherson, pbonzini, vkuznets, wanpengli, joro, kvm

On Fri, Oct 8, 2021 at 5:05 PM Robert Hoo <robert.hu@linux.intel.com> wrote:
>
> On Fri, 2021-10-08 at 16:49 -0700, Jim Mattson wrote:
> > We have some internal patches for virtualizing VMCS shadowing which
> > may break if there is a guest VMCS field with index greater than
> > VMX_VMCS_ENUM.MAX_INDEX. I plan to upstream them soon.
>
> OK, thanks for letting us know.:-)

After careful consideration, we're actually going to drop these
patches rather than sending them upstream.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-10-29 19:53                                         ` Jim Mattson
@ 2021-11-03  1:31                                           ` Robert Hoo
  2021-11-09 22:33                                             ` Sean Christopherson
  0 siblings, 1 reply; 37+ messages in thread
From: Robert Hoo @ 2021-11-03  1:31 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Yu Zhang, Sean Christopherson, pbonzini, vkuznets, wanpengli, joro, kvm

On Fri, 2021-10-29 at 12:53 -0700, Jim Mattson wrote:
> On Fri, Oct 8, 2021 at 5:05 PM Robert Hoo <robert.hu@linux.intel.com>
> wrote:
> > 
> > On Fri, 2021-10-08 at 16:49 -0700, Jim Mattson wrote:
> > > We have some internal patches for virtualizing VMCS shadowing
> > > which
> > > may break if there is a guest VMCS field with index greater than
> > > VMX_VMCS_ENUM.MAX_INDEX. I plan to upstream them soon.
> > 
> > OK, thanks for letting us know.:-)
> 
> After careful consideration, we're actually going to drop these
> patches rather than sending them upstream.

OK.

Hi, Paolo, Sean and Jim,

Do you think this patch series is still needed, or can it be dropped
as well?

Thanks.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-11-03  1:31                                           ` Robert Hoo
@ 2021-11-09 22:33                                             ` Sean Christopherson
  2021-11-10  5:35                                               ` Yu Zhang
  0 siblings, 1 reply; 37+ messages in thread
From: Sean Christopherson @ 2021-11-09 22:33 UTC (permalink / raw)
  To: Robert Hoo
  Cc: Jim Mattson, Yu Zhang, pbonzini, vkuznets, wanpengli, joro, kvm

On Wed, Nov 03, 2021, Robert Hoo wrote:
> On Fri, 2021-10-29 at 12:53 -0700, Jim Mattson wrote:
> > On Fri, Oct 8, 2021 at 5:05 PM Robert Hoo <robert.hu@linux.intel.com>
> > wrote:
> > > 
> > > On Fri, 2021-10-08 at 16:49 -0700, Jim Mattson wrote:
> > > > We have some internal patches for virtualizing VMCS shadowing
> > > > which
> > > > may break if there is a guest VMCS field with index greater than
> > > > VMX_VMCS_ENUM.MAX_INDEX. I plan to upstream them soon.
> > > 
> > > OK, thanks for letting us know.:-)
> > 
> > After careful consideration, we're actually going to drop these
> > patches rather than sending them upstream.
> 
> OK.
> 
> Hi, Paolo, Sean and Jim,
> 
> Do you think this patch series is still needed, or can it be dropped
> as well?

IMO we should drop this series and take our own erratum.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-11-09 22:33                                             ` Sean Christopherson
@ 2021-11-10  5:35                                               ` Yu Zhang
  2021-11-18  1:19                                                 ` Sean Christopherson
  0 siblings, 1 reply; 37+ messages in thread
From: Yu Zhang @ 2021-11-10  5:35 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Robert Hoo, Jim Mattson, pbonzini, vkuznets, wanpengli, joro, kvm

On Tue, Nov 09, 2021 at 10:33:43PM +0000, Sean Christopherson wrote:
> On Wed, Nov 03, 2021, Robert Hoo wrote:
> > On Fri, 2021-10-29 at 12:53 -0700, Jim Mattson wrote:
> > > On Fri, Oct 8, 2021 at 5:05 PM Robert Hoo <robert.hu@linux.intel.com>
> > > wrote:
> > > > 
> > > > On Fri, 2021-10-08 at 16:49 -0700, Jim Mattson wrote:
> > > > > We have some internal patches for virtualizing VMCS shadowing
> > > > > which
> > > > > may break if there is a guest VMCS field with index greater than
> > > > > VMX_VMCS_ENUM.MAX_INDEX. I plan to upstream them soon.
> > > > 
> > > > OK, thanks for letting us know.:-)
> > > 
> > > After careful consideration, we're actually going to drop these
> > > patches rather than sending them upstream.
> > 
> > OK.
> > 
> > Hi, Paolo, Sean and Jim,
> > 
> > Do you think this patch series is still needed, or can it be dropped
> > as well?
> 
> IMO we should drop this series and take our own erratum.
> 

Thanks, Sean.

Do we need a patch in kvm-unit-tests to deprecate the check against
the max index from MSR_IA32_VMX_VMCS_ENUM?

B.R.
Yu

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-11-10  5:35                                               ` Yu Zhang
@ 2021-11-18  1:19                                                 ` Sean Christopherson
  2021-11-19  7:32                                                   ` Robert Hoo
  0 siblings, 1 reply; 37+ messages in thread
From: Sean Christopherson @ 2021-11-18  1:19 UTC (permalink / raw)
  To: Yu Zhang
  Cc: Robert Hoo, Jim Mattson, pbonzini, vkuznets, wanpengli, joro, kvm

On Wed, Nov 10, 2021, Yu Zhang wrote:
> On Tue, Nov 09, 2021 at 10:33:43PM +0000, Sean Christopherson wrote:
> > On Wed, Nov 03, 2021, Robert Hoo wrote:
> > > On Fri, 2021-10-29 at 12:53 -0700, Jim Mattson wrote:
> > > > On Fri, Oct 8, 2021 at 5:05 PM Robert Hoo <robert.hu@linux.intel.com>
> > > > wrote:
> > > > > 
> > > > > On Fri, 2021-10-08 at 16:49 -0700, Jim Mattson wrote:
> > > > > > We have some internal patches for virtualizing VMCS shadowing
> > > > > > which
> > > > > > may break if there is a guest VMCS field with index greater than
> > > > > > VMX_VMCS_ENUM.MAX_INDEX. I plan to upstream them soon.
> > > > > 
> > > > > OK, thanks for letting us know.:-)
> > > > 
> > > > After careful consideration, we're actually going to drop these
> > > > patches rather than sending them upstream.
> > > 
> > > OK.
> > > 
> > > Hi, Paolo, Sean and Jim,
> > > 
> > > Do you think this patch series is still needed, or can it be dropped
> > > as well?
> > 
> > IMO we should drop this series and take our own erratum.
> > 
> 
> Thanks, Sean.
> 
> Do we need a patch in kvm-unit-tests to deprecate the check against
> the max index from MSR_IA32_VMX_VMCS_ENUM?

Hmm, yes, unless there's an easy way to tell QEMU to not override the VMX MSRs.
I don't see any point in fighting too hard with QEMU.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap
  2021-11-18  1:19                                                 ` Sean Christopherson
@ 2021-11-19  7:32                                                   ` Robert Hoo
  0 siblings, 0 replies; 37+ messages in thread
From: Robert Hoo @ 2021-11-19  7:32 UTC (permalink / raw)
  To: Sean Christopherson, Yu Zhang
  Cc: Jim Mattson, pbonzini, vkuznets, wanpengli, joro, kvm

On Thu, 2021-11-18 at 01:19 +0000, Sean Christopherson wrote:
> On Wed, Nov 10, 2021, Yu Zhang wrote:
> > On Tue, Nov 09, 2021 at 10:33:43PM +0000, Sean Christopherson
> > wrote:
> > > On Wed, Nov 03, 2021, Robert Hoo wrote:
> > > > On Fri, 2021-10-29 at 12:53 -0700, Jim Mattson wrote:
> > > > > On Fri, Oct 8, 2021 at 5:05 PM Robert Hoo <
> > > > > robert.hu@linux.intel.com>
> > > > > wrote:
> > > > > > 
> > > > > > On Fri, 2021-10-08 at 16:49 -0700, Jim Mattson wrote:
> > > > > > > We have some internal patches for virtualizing VMCS
> > > > > > > shadowing
> > > > > > > which
> > > > > > > may break if there is a guest VMCS field with index
> > > > > > > greater than
> > > > > > > VMX_VMCS_ENUM.MAX_INDEX. I plan to upstream them soon.
> > > > > > 
> > > > > > OK, thanks for letting us know.:-)
> > > > > 
> > > > > After careful consideration, we're actually going to drop
> > > > > these
> > > > > patches rather than sending them upstream.
> > > > 
> > > > OK.
> > > > 
> > > > Hi, Paolo, Sean and Jim,
> > > > 
> > > > Do you think this patch series is still needed, or can it be
> > > > dropped as well?
> > > 
> > > IMO we should drop this series and take our own erratum.
> > > 
> > 
> > Thanks, Sean.
> > 
> > Do we need a patch in kvm-unit-tests to deprecate the check against
> > the max index from MSR_IA32_VMX_VMCS_ENUM?
> 
> Hmm, yes, unless there's an easy way to tell QEMU to not override the
> VMX MSRs.
> I don't see any point in fighting too hard with QEMU.

OK. I just sent out the kvm-unit-tests patch, copying the recipient
list from Yu's last mail.
https://lore.kernel.org/kvm/1637306107-92967-1-git-send-email-robert.hu@linux.intel.com/T/#u


^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2021-11-19  7:32 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-17  9:31 [PATCH v1 0/5] KVM/x86/nVMX: Add field existence support in VMCS12 Robert Hoo
2021-08-17  9:31 ` [PATCH v1 1/5] KVM: x86: nVMX: Add vmcs12 field existence bitmap in nested_vmx Robert Hoo
2021-10-20 15:10   ` Paolo Bonzini
2021-10-21 12:41     ` Robert Hoo
2021-08-17  9:31 ` [PATCH v1 2/5] KVM: x86: nVMX: Update VMCS12 fields existence when nVMX MSRs are set Robert Hoo
2021-10-20 15:11   ` Paolo Bonzini
2021-10-21 13:08     ` Robert Hoo
2021-08-17  9:31 ` [PATCH v1 3/5] KVM: x86: nVMX: VMCS12 field's read/write respects field existence bitmap Robert Hoo
2021-08-17 15:54   ` Sean Christopherson
2021-08-18  5:50     ` Robert Hoo
2021-08-18 23:10       ` Sean Christopherson
2021-08-18 23:45         ` Jim Mattson
2021-08-18 23:49           ` Sean Christopherson
2021-08-19  9:58         ` Robert Hoo
2021-09-01 20:42           ` Sean Christopherson
2021-09-03  8:51             ` Robert Hoo
2021-09-03 15:11               ` Sean Christopherson
2021-09-28 10:05                 ` Robert Hoo
2021-10-05 16:15                   ` Sean Christopherson
2021-10-05 17:32                     ` Jim Mattson
2021-10-05 17:59                       ` Sean Christopherson
2021-10-05 20:42                         ` Jim Mattson
2021-10-05 20:50                           ` Sean Christopherson
2021-10-05 22:40                             ` Jim Mattson
2021-10-05 23:22                               ` Sean Christopherson
2021-10-08  8:23                                 ` Yu Zhang
2021-10-08 15:09                                   ` Robert Hoo
2021-10-08 23:49                                     ` Jim Mattson
2021-10-09  0:05                                       ` Robert Hoo
2021-10-29 19:53                                         ` Jim Mattson
2021-11-03  1:31                                           ` Robert Hoo
2021-11-09 22:33                                             ` Sean Christopherson
2021-11-10  5:35                                               ` Yu Zhang
2021-11-18  1:19                                                 ` Sean Christopherson
2021-11-19  7:32                                                   ` Robert Hoo
2021-08-17  9:31 ` [PATCH v1 4/5] KVM: x86: nVMX: Respect vmcs12 field existence when calc vmx_vmcs_enum_msr Robert Hoo
2021-08-17  9:31 ` [PATCH v1 5/5] KVM: x86: nVMX: Ignore user space set value to MSR_IA32_VMX_VMCS_ENUM Robert Hoo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).