KVM Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/7]  KVM: nVMX: Optimize VMCS data copying
@ 2019-05-07 15:36 Sean Christopherson
  2019-05-07 15:36 ` [PATCH 1/7] KVM: nVMX: Intercept VMWRITEs to read-only shadow VMCS fields Sean Christopherson
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Sean Christopherson @ 2019-05-07 15:36 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář; +Cc: kvm, Jim Mattson, Liran Alon

KVM spends a lot of time copying data between VMCSes, especially when
utilizing a shadow VMCS as data needs to moved between vmcs12 and the
shadow VMCS.

This series is comprised of three mostly unrelated optimizations that
happen to modify the same code and would cause non-trivial conflicts:

  - Expose read-frequently write-rarely fields to L1 for VMREAD only.
    Exposing fields to L1 for both VMWRITE and VMREAD means KVM needs
    to copy data from the shadow VMCS to vmcs12 on nested VM-Entry.
    For fields that are almost never written by L1, copying those
    fields on every nested VM-Entry is pure overhead.

  - Track the vmcs12 offsets for shadowed fields.  All offsets are
    known at compile time (HIGH fields complicate this slightly), but
    KVM currently does a runtime lookup to get the offset, which adds
    measurable latency to copying to/from the shadow VMCS.

  - Sync rarely accessed guest fields from vmcs02 to vmcs12 only when
    necessary.  A non-trivial number of guest fields are infrequently
    accessed by VMMs, e.g. most segment descriptor fields.  Avoid
    copying the fields from vmcs02 (30+ VMREADs) on every nested VM-Exit
    to L1, and instead pull them from vmcs02 when read by L1, or when
    they may be consumed by KVM, e.g. for consistency checks.


Sean Christopherson (7):
  KVM: nVMX: Intercept VMWRITEs to read-only shadow VMCS fields
  KVM: nVMX: Intercept VMWRITEs to GUEST_{CS,SS}_AR_BYTES
  KVM: nVMX: Track vmcs12 offsets for shadowed VMCS fields
  KVM: nVMX: Lift sync_vmcs12() out of prepare_vmcs12()
  KVM: nVMX: Use descriptive names for VMCS sync functions and flags
  KVM: nVMX: Add helpers to identify shadowed VMCS fields
  KVM: nVMX: Sync rarely accessed guest fields only when needed

 arch/x86/kvm/vmx/nested.c             | 384 +++++++++++++++++---------
 arch/x86/kvm/vmx/nested.h             |   2 +-
 arch/x86/kvm/vmx/vmcs12.h             |  57 ++--
 arch/x86/kvm/vmx/vmcs_shadow_fields.h |  78 +++---
 arch/x86/kvm/vmx/vmx.c                |   4 +-
 arch/x86/kvm/vmx/vmx.h                |   8 +-
 6 files changed, 320 insertions(+), 213 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/7] KVM: nVMX: Intercept VMWRITEs to read-only shadow VMCS fields
  2019-05-07 15:36 [PATCH 0/7] KVM: nVMX: Optimize VMCS data copying Sean Christopherson
@ 2019-05-07 15:36 ` Sean Christopherson
  2019-06-06 13:26   ` Paolo Bonzini
  2019-06-13 17:02   ` Jim Mattson
  2019-05-07 15:36 ` [PATCH 2/7] KVM: nVMX: Intercept VMWRITEs to GUEST_{CS,SS}_AR_BYTES Sean Christopherson
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 15+ messages in thread
From: Sean Christopherson @ 2019-05-07 15:36 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář; +Cc: kvm, Jim Mattson, Liran Alon

Allowing L1 to VMWRITE read-only fields is only beneficial in a double
nesting scenario, e.g. no sane VMM will VMWRITE VM_EXIT_REASON in normal
non-nested operation.  Intercepting RO fields means KVM doesn't need to
sync them from the shadow VMCS to vmcs12 when running L2.  The obvious
downside is that L1 will VM-Exit more often when running L3, but it's
likely safe to assume most folks would happily sacrifice a bit of L3
performance, which may not even be noticeable in the grande scheme, to
improve L2 performance across the board.

Not intercepting fields tagged read-only also allows for additional
optimizations, e.g. marking GUEST_{CS,SS}_AR_BYTES as SHADOW_FIELD_RO
since those fields are rarely written by a VMMs, but read frequently.

When utilizing a shadow VMCS with asymmetric R/W and R/O bitmaps, fields
that cause VM-Exit on VMWRITE but not VMREAD need to be propagated to
the shadow VMCS during VMWRITE emulation, otherwise a subsequence VMREAD
from L1 will consume a stale value.

Note, KVM currently utilizes asymmetric bitmaps when "VMWRITE any field"
is not exposed to L1, but only so that it can reject the VMWRITE, i.e.
propagating the VMWRITE to the shadow VMCS is a new requirement, not a
bug fix.

Eliminating the copying of RO fields reduces the latency of nested
VM-Entry (copy_shadow_to_vmcs12()) by ~100 cycles (plus 40-50 cycles
if/when the AR_BYTES fields are exposed RO).

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/nested.c | 72 +++++++++++++++++----------------------
 1 file changed, 31 insertions(+), 41 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 04b40a98f60b..1f7c4af70903 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1102,14 +1102,6 @@ static int vmx_restore_vmx_misc(struct vcpu_vmx *vmx, u64 data)
 	vmx->nested.msrs.misc_low = data;
 	vmx->nested.msrs.misc_high = data >> 32;
 
-	/*
-	 * If L1 has read-only VM-exit information fields, use the
-	 * less permissive vmx_vmwrite_bitmap to specify write
-	 * permissions for the shadow VMCS.
-	 */
-	if (enable_shadow_vmcs && !nested_cpu_has_vmwrite_any_field(&vmx->vcpu))
-		vmcs_write64(VMWRITE_BITMAP, __pa(vmx_vmwrite_bitmap));
-
 	return 0;
 }
 
@@ -1298,41 +1290,27 @@ int vmx_get_vmx_msr(struct nested_vmx_msrs *msrs, u32 msr_index, u64 *pdata)
 }
 
 /*
- * Copy the writable VMCS shadow fields back to the VMCS12, in case
- * they have been modified by the L1 guest. Note that the "read-only"
- * VM-exit information fields are actually writable if the vCPU is
- * configured to support "VMWRITE to any supported field in the VMCS."
+ * Copy the writable VMCS shadow fields back to the VMCS12, in case they have
+ * been modified by the L1 guest.  Note, "writable" in this context means
+ * "writable by the guest", i.e. tagged SHADOW_FIELD_RW.  Note #2, the set of
+ * fields tagged SHADOW_FIELD_RO may or may not align with the "read-only"
+ * VM-exit information fields (which are actually writable if the vCPU is
+ * configured to support "VMWRITE to any supported field in the VMCS").
  */
 static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx)
 {
-	const u16 *fields[] = {
-		shadow_read_write_fields,
-		shadow_read_only_fields
-	};
-	const int max_fields[] = {
-		max_shadow_read_write_fields,
-		max_shadow_read_only_fields
-	};
-	int i, q;
-	unsigned long field;
-	u64 field_value;
 	struct vmcs *shadow_vmcs = vmx->vmcs01.shadow_vmcs;
+	struct vmcs12 *vmcs12 = get_vmcs12(&vmx->vcpu);
+	unsigned long field;
+	int i;
 
 	preempt_disable();
 
 	vmcs_load(shadow_vmcs);
 
-	for (q = 0; q < ARRAY_SIZE(fields); q++) {
-		for (i = 0; i < max_fields[q]; i++) {
-			field = fields[q][i];
-			field_value = __vmcs_readl(field);
-			vmcs12_write_any(get_vmcs12(&vmx->vcpu), field, field_value);
-		}
-		/*
-		 * Skip the VM-exit information fields if they are read-only.
-		 */
-		if (!nested_cpu_has_vmwrite_any_field(&vmx->vcpu))
-			break;
+	for (i = 0; i < max_shadow_read_write_fields; i++) {
+		field = shadow_read_write_fields[i];
+		vmcs12_write_any(vmcs12, field, __vmcs_readl(field));
 	}
 
 	vmcs_clear(shadow_vmcs);
@@ -4511,6 +4489,24 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)
 			 * path of prepare_vmcs02.
 			 */
 			break;
+
+#define SHADOW_FIELD_RO(x) case x:
+#include "vmcs_shadow_fields.h"
+			/*
+			 * L1 can read these fields without exiting, ensure the
+			 * shadow VMCS is up-to-date.
+			 */
+			if (enable_shadow_vmcs) {
+				preempt_disable();
+				vmcs_load(vmx->vmcs01.shadow_vmcs);
+
+				__vmcs_writel(field, field_value);
+
+				vmcs_clear(vmx->vmcs01.shadow_vmcs);
+				vmcs_load(vmx->loaded_vmcs->vmcs);
+				preempt_enable();
+			}
+			/* fall through */
 		default:
 			vmx->nested.dirty_vmcs12 = true;
 			break;
@@ -5456,14 +5452,8 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
 void nested_vmx_vcpu_setup(void)
 {
 	if (enable_shadow_vmcs) {
-		/*
-		 * At vCPU creation, "VMWRITE to any supported field
-		 * in the VMCS" is supported, so use the more
-		 * permissive vmx_vmread_bitmap to specify both read
-		 * and write permissions for the shadow VMCS.
-		 */
 		vmcs_write64(VMREAD_BITMAP, __pa(vmx_vmread_bitmap));
-		vmcs_write64(VMWRITE_BITMAP, __pa(vmx_vmread_bitmap));
+		vmcs_write64(VMWRITE_BITMAP, __pa(vmx_vmwrite_bitmap));
 	}
 }
 
-- 
2.21.0


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 2/7] KVM: nVMX: Intercept VMWRITEs to GUEST_{CS,SS}_AR_BYTES
  2019-05-07 15:36 [PATCH 0/7] KVM: nVMX: Optimize VMCS data copying Sean Christopherson
  2019-05-07 15:36 ` [PATCH 1/7] KVM: nVMX: Intercept VMWRITEs to read-only shadow VMCS fields Sean Christopherson
@ 2019-05-07 15:36 ` Sean Christopherson
  2019-06-06 13:31   ` Paolo Bonzini
  2019-05-07 15:36 ` [PATCH 3/7] KVM: nVMX: Track vmcs12 offsets for shadowed VMCS fields Sean Christopherson
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Sean Christopherson @ 2019-05-07 15:36 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář; +Cc: kvm, Jim Mattson, Liran Alon

VMMs frequently read the guest's CS and SS AR bytes to detect 64-bit
mode and CPL respectively, but effectively never write said fields once
the VM is initialized.  Intercepting VMWRITEs for the two fields saves
~55 cycles in copy_shadow_to_vmcs12().

Because some Intel CPUs, e.g. Haswell, drop the reserved bits of the
guest access rights fields on VMWRITE, exposing the fields to L1 for
VMREAD but not VMWRITE leads to inconsistent behavior between L1 and L2.
On hardware that drops the bits, L1 will see the stripped down value due
to reading the value from hardware, while L2 will see the full original
value as stored by KVM.  To avoid such an inconsistency, emulate the
behavior on all CPUS, but only for intercepted VMWRITEs so as to avoid
introducing pointless latency into copy_shadow_to_vmcs12(), e.g. if the
emulation were added to vmcs12_write_any().

Since the AR_BYTES emulation is done only for intercepted VMWRITE, if a
future patch (re)exposed AR_BYTES for both VMWRITE and VMREAD, then KVM
would end up with incosistent behavior on pre-Haswell hardware, e.g. KVM
would drop the reserved bits on intercepted VMWRITE, but direct VMWRITE
to the shadow VMCS would not drop the bits.  Add a WARN in the shadow
field initialization to detect any attempt to expose an AR_BYTES field
without updating vmcs12_write_any().

Note, emulation of the AR_BYTES reserved bit behavior is based on a
patch[1] from Jim Mattson that applied the emulation to all writes to
vmcs12 so that live migration across different generations of hardware
would not introduce divergent behavior.  But given that live migration
of nested state has already been enabled, that ship has sailed (not to
mention that no sane VMM will be affected by this behavior).

[1] https://patchwork.kernel.org/patch/10483321/

Cc: Jim Mattson <jmattson@google.com>
Cc: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/nested.c             | 15 +++++++++++++++
 arch/x86/kvm/vmx/vmcs_shadow_fields.h |  4 ++--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 1f7c4af70903..9310c6db9e80 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -91,6 +91,10 @@ static void init_vmcs_shadow_fields(void)
 			pr_err("Missing field from shadow_read_write_field %x\n",
 			       field + 1);
 
+		WARN_ONCE(field >= GUEST_ES_AR_BYTES &&
+			  field <= GUEST_TR_AR_BYTES,
+			  "Update vmcs12_write_any() to expose AR_BYTES RW");
+
 		/*
 		 * PML and the preemption timer can be emulated, but the
 		 * processor cannot vmwrite to fields that don't exist
@@ -4471,6 +4475,17 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)
 		vmcs12 = get_shadow_vmcs12(vcpu);
 	}
 
+	/*
+	 * Some Intel CPUs intentionally drop the reserved bits of the AR byte
+	 * fields on VMWRITE.  Emulate this behavior to ensure consistent KVM
+	 * behavior regardless of the underlying hardware, e.g. if an AR_BYTE
+	 * field is intercepted for VMWRITE but not VMREAD (in L1), then VMREAD
+	 * from L1 will return a different value than VMREAD from L2 (L1 sees
+	 * the stripped down value, L2 sees the full value as stored by KVM).
+	 */
+	if (field >= GUEST_ES_AR_BYTES && field <= GUEST_TR_AR_BYTES)
+		field_value &= 0x1f0ff;
+
 	if (vmcs12_write_any(vmcs12, field, field_value) < 0)
 		return nested_vmx_failValid(vcpu,
 			VMXERR_UNSUPPORTED_VMCS_COMPONENT);
diff --git a/arch/x86/kvm/vmx/vmcs_shadow_fields.h b/arch/x86/kvm/vmx/vmcs_shadow_fields.h
index 132432f375c2..97dd5295be31 100644
--- a/arch/x86/kvm/vmx/vmcs_shadow_fields.h
+++ b/arch/x86/kvm/vmx/vmcs_shadow_fields.h
@@ -40,14 +40,14 @@ SHADOW_FIELD_RO(VM_EXIT_INSTRUCTION_LEN)
 SHADOW_FIELD_RO(IDT_VECTORING_INFO_FIELD)
 SHADOW_FIELD_RO(IDT_VECTORING_ERROR_CODE)
 SHADOW_FIELD_RO(VM_EXIT_INTR_ERROR_CODE)
+SHADOW_FIELD_RO(GUEST_CS_AR_BYTES)
+SHADOW_FIELD_RO(GUEST_SS_AR_BYTES)
 SHADOW_FIELD_RW(CPU_BASED_VM_EXEC_CONTROL)
 SHADOW_FIELD_RW(EXCEPTION_BITMAP)
 SHADOW_FIELD_RW(VM_ENTRY_EXCEPTION_ERROR_CODE)
 SHADOW_FIELD_RW(VM_ENTRY_INTR_INFO_FIELD)
 SHADOW_FIELD_RW(VM_ENTRY_INSTRUCTION_LEN)
 SHADOW_FIELD_RW(TPR_THRESHOLD)
-SHADOW_FIELD_RW(GUEST_CS_AR_BYTES)
-SHADOW_FIELD_RW(GUEST_SS_AR_BYTES)
 SHADOW_FIELD_RW(GUEST_INTERRUPTIBILITY_INFO)
 SHADOW_FIELD_RW(VMX_PREEMPTION_TIMER_VALUE)
 
-- 
2.21.0


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 3/7] KVM: nVMX: Track vmcs12 offsets for shadowed VMCS fields
  2019-05-07 15:36 [PATCH 0/7] KVM: nVMX: Optimize VMCS data copying Sean Christopherson
  2019-05-07 15:36 ` [PATCH 1/7] KVM: nVMX: Intercept VMWRITEs to read-only shadow VMCS fields Sean Christopherson
  2019-05-07 15:36 ` [PATCH 2/7] KVM: nVMX: Intercept VMWRITEs to GUEST_{CS,SS}_AR_BYTES Sean Christopherson
@ 2019-05-07 15:36 ` Sean Christopherson
  2019-05-07 15:36 ` [PATCH 4/7] KVM: nVMX: Lift sync_vmcs12() out of prepare_vmcs12() Sean Christopherson
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Sean Christopherson @ 2019-05-07 15:36 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář; +Cc: kvm, Jim Mattson, Liran Alon

The vmcs12 fields offsets are constant and known at compile time.  Store
the associated offset for each shadowed field to avoid the costly lookup
in vmcs_field_to_offset() when copying between vmcs12 and the shadow
VMCS.  Avoiding the costly lookup reduces the latency of copying by
~100 cycles in each direction.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/nested.c             | 96 +++++++++++++++------------
 arch/x86/kvm/vmx/vmcs12.h             | 57 +++++-----------
 arch/x86/kvm/vmx/vmcs_shadow_fields.h | 74 ++++++++++-----------
 3 files changed, 108 insertions(+), 119 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 9310c6db9e80..4528c8ff3c9c 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -41,15 +41,19 @@ static unsigned long *vmx_bitmap[VMX_BITMAP_NR];
 #define vmx_vmread_bitmap                    (vmx_bitmap[VMX_VMREAD_BITMAP])
 #define vmx_vmwrite_bitmap                   (vmx_bitmap[VMX_VMWRITE_BITMAP])
 
-static u16 shadow_read_only_fields[] = {
-#define SHADOW_FIELD_RO(x) x,
+struct shadow_vmcs_field {
+	u16	encoding;
+	u16	offset;
+};
+static struct shadow_vmcs_field shadow_read_only_fields[] = {
+#define SHADOW_FIELD_RO(x, y) { x, offsetof(struct vmcs12, y) },
 #include "vmcs_shadow_fields.h"
 };
 static int max_shadow_read_only_fields =
 	ARRAY_SIZE(shadow_read_only_fields);
 
-static u16 shadow_read_write_fields[] = {
-#define SHADOW_FIELD_RW(x) x,
+static struct shadow_vmcs_field shadow_read_write_fields[] = {
+#define SHADOW_FIELD_RW(x, y) { x, offsetof(struct vmcs12, y) },
 #include "vmcs_shadow_fields.h"
 };
 static int max_shadow_read_write_fields =
@@ -63,31 +67,33 @@ static void init_vmcs_shadow_fields(void)
 	memset(vmx_vmwrite_bitmap, 0xff, PAGE_SIZE);
 
 	for (i = j = 0; i < max_shadow_read_only_fields; i++) {
-		u16 field = shadow_read_only_fields[i];
+		struct shadow_vmcs_field entry = shadow_read_only_fields[i];
+		u16 field = entry.encoding;
 
 		if (vmcs_field_width(field) == VMCS_FIELD_WIDTH_U64 &&
 		    (i + 1 == max_shadow_read_only_fields ||
-		     shadow_read_only_fields[i + 1] != field + 1))
+		     shadow_read_only_fields[i + 1].encoding != field + 1))
 			pr_err("Missing field from shadow_read_only_field %x\n",
 			       field + 1);
 
 		clear_bit(field, vmx_vmread_bitmap);
-#ifdef CONFIG_X86_64
 		if (field & 1)
+#ifdef CONFIG_X86_64
 			continue;
+#else
+			entry.offset += sizeof(u32);
 #endif
-		if (j < i)
-			shadow_read_only_fields[j] = field;
-		j++;
+		shadow_read_only_fields[j++] = entry;
 	}
 	max_shadow_read_only_fields = j;
 
 	for (i = j = 0; i < max_shadow_read_write_fields; i++) {
-		u16 field = shadow_read_write_fields[i];
+		struct shadow_vmcs_field entry = shadow_read_write_fields[i];
+		u16 field = entry.encoding;
 
 		if (vmcs_field_width(field) == VMCS_FIELD_WIDTH_U64 &&
 		    (i + 1 == max_shadow_read_write_fields ||
-		     shadow_read_write_fields[i + 1] != field + 1))
+		     shadow_read_write_fields[i + 1].encoding != field + 1))
 			pr_err("Missing field from shadow_read_write_field %x\n",
 			       field + 1);
 
@@ -119,13 +125,13 @@ static void init_vmcs_shadow_fields(void)
 
 		clear_bit(field, vmx_vmwrite_bitmap);
 		clear_bit(field, vmx_vmread_bitmap);
-#ifdef CONFIG_X86_64
 		if (field & 1)
+#ifdef CONFIG_X86_64
 			continue;
+#else
+			entry.offset += sizeof(u32);
 #endif
-		if (j < i)
-			shadow_read_write_fields[j] = field;
-		j++;
+		shadow_read_write_fields[j++] = entry;
 	}
 	max_shadow_read_write_fields = j;
 }
@@ -1305,7 +1311,8 @@ static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx)
 {
 	struct vmcs *shadow_vmcs = vmx->vmcs01.shadow_vmcs;
 	struct vmcs12 *vmcs12 = get_vmcs12(&vmx->vcpu);
-	unsigned long field;
+	struct shadow_vmcs_field field;
+	unsigned long val;
 	int i;
 
 	preempt_disable();
@@ -1314,7 +1321,8 @@ static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx)
 
 	for (i = 0; i < max_shadow_read_write_fields; i++) {
 		field = shadow_read_write_fields[i];
-		vmcs12_write_any(vmcs12, field, __vmcs_readl(field));
+		val = __vmcs_readl(field.encoding);
+		vmcs12_write_any(vmcs12, field.encoding, field.offset, val);
 	}
 
 	vmcs_clear(shadow_vmcs);
@@ -1325,7 +1333,7 @@ static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx)
 
 static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx)
 {
-	const u16 *fields[] = {
+	const struct shadow_vmcs_field *fields[] = {
 		shadow_read_write_fields,
 		shadow_read_only_fields
 	};
@@ -1333,18 +1341,20 @@ static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx)
 		max_shadow_read_write_fields,
 		max_shadow_read_only_fields
 	};
+	struct vmcs *shadow_vmcs = vmx->vmcs01.shadow_vmcs;
+	struct vmcs12 *vmcs12 = get_vmcs12(&vmx->vcpu);
+	struct shadow_vmcs_field field;
+	unsigned long val;
 	int i, q;
-	unsigned long field;
-	u64 field_value = 0;
-	struct vmcs *shadow_vmcs = vmx->vmcs01.shadow_vmcs;
 
 	vmcs_load(shadow_vmcs);
 
 	for (q = 0; q < ARRAY_SIZE(fields); q++) {
 		for (i = 0; i < max_fields[q]; i++) {
 			field = fields[q][i];
-			vmcs12_read_any(get_vmcs12(&vmx->vcpu), field, &field_value);
-			__vmcs_writel(field, field_value);
+			val = vmcs12_read_any(vmcs12, field.encoding,
+					      field.offset);
+			__vmcs_writel(field.encoding, val);
 		}
 	}
 
@@ -2141,6 +2151,8 @@ static void prepare_vmcs02_full(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
 		vmcs_write32(GUEST_TR_LIMIT, vmcs12->guest_tr_limit);
 		vmcs_write32(GUEST_GDTR_LIMIT, vmcs12->guest_gdtr_limit);
 		vmcs_write32(GUEST_IDTR_LIMIT, vmcs12->guest_idtr_limit);
+		vmcs_write32(GUEST_CS_AR_BYTES, vmcs12->guest_cs_ar_bytes);
+		vmcs_write32(GUEST_SS_AR_BYTES, vmcs12->guest_ss_ar_bytes);
 		vmcs_write32(GUEST_ES_AR_BYTES, vmcs12->guest_es_ar_bytes);
 		vmcs_write32(GUEST_DS_AR_BYTES, vmcs12->guest_ds_ar_bytes);
 		vmcs_write32(GUEST_FS_AR_BYTES, vmcs12->guest_fs_ar_bytes);
@@ -2237,23 +2249,12 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
 			  u32 *entry_failure_code)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
-	struct hv_enlightened_vmcs *hv_evmcs = vmx->nested.hv_evmcs;
 
 	if (vmx->nested.dirty_vmcs12 || vmx->nested.hv_evmcs) {
 		prepare_vmcs02_full(vmx, vmcs12);
 		vmx->nested.dirty_vmcs12 = false;
 	}
 
-	/*
-	 * First, the fields that are shadowed.  This must be kept in sync
-	 * with vmcs_shadow_fields.h.
-	 */
-	if (!hv_evmcs || !(hv_evmcs->hv_clean_fields &
-			   HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP2)) {
-		vmcs_write32(GUEST_CS_AR_BYTES, vmcs12->guest_cs_ar_bytes);
-		vmcs_write32(GUEST_SS_AR_BYTES, vmcs12->guest_ss_ar_bytes);
-	}
-
 	if (vmx->nested.nested_run_pending &&
 	    (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS)) {
 		kvm_set_dr(vcpu, 7, vmcs12->guest_dr7);
@@ -4367,6 +4368,7 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
 	u32 vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO);
 	gva_t gva = 0;
 	struct vmcs12 *vmcs12;
+	short offset;
 
 	if (!nested_vmx_check_permission(vcpu))
 		return 1;
@@ -4388,10 +4390,14 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
 
 	/* Decode instruction info and find the field to read */
 	field = kvm_register_readl(vcpu, (((vmx_instruction_info) >> 28) & 0xf));
+
+	offset = vmcs_field_to_offset(field);
+	if (offset < 0)
+		return nested_vmx_failValid(vcpu,
+			VMXERR_UNSUPPORTED_VMCS_COMPONENT);
+
 	/* Read the field, zero-extended to a u64 field_value */
-	if (vmcs12_read_any(vmcs12, field, &field_value) < 0)
-		return nested_vmx_failValid(vcpu,
-			VMXERR_UNSUPPORTED_VMCS_COMPONENT);
+	field_value = vmcs12_read_any(vmcs12, field, offset);
 
 	/*
 	 * Now copy part of this value to register or memory, as requested.
@@ -4431,6 +4437,7 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)
 	u64 field_value = 0;
 	struct x86_exception e;
 	struct vmcs12 *vmcs12;
+	short offset;
 
 	if (!nested_vmx_check_permission(vcpu))
 		return 1;
@@ -4475,6 +4482,11 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)
 		vmcs12 = get_shadow_vmcs12(vcpu);
 	}
 
+	offset = vmcs_field_to_offset(field);
+	if (offset < 0)
+		return nested_vmx_failValid(vcpu,
+			VMXERR_UNSUPPORTED_VMCS_COMPONENT);
+
 	/*
 	 * Some Intel CPUs intentionally drop the reserved bits of the AR byte
 	 * fields on VMWRITE.  Emulate this behavior to ensure consistent KVM
@@ -4486,9 +4498,7 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)
 	if (field >= GUEST_ES_AR_BYTES && field <= GUEST_TR_AR_BYTES)
 		field_value &= 0x1f0ff;
 
-	if (vmcs12_write_any(vmcs12, field, field_value) < 0)
-		return nested_vmx_failValid(vcpu,
-			VMXERR_UNSUPPORTED_VMCS_COMPONENT);
+	vmcs12_write_any(vmcs12, field, offset, field_value);
 
 	/*
 	 * Do not track vmcs12 dirty-state if in guest-mode
@@ -4496,7 +4506,7 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)
 	 */
 	if (!is_guest_mode(vcpu)) {
 		switch (field) {
-#define SHADOW_FIELD_RW(x) case x:
+#define SHADOW_FIELD_RW(x, y) case x:
 #include "vmcs_shadow_fields.h"
 			/*
 			 * The fields that can be updated by L1 without a vmexit are
@@ -4505,7 +4515,7 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)
 			 */
 			break;
 
-#define SHADOW_FIELD_RO(x) case x:
+#define SHADOW_FIELD_RO(x, y) case x:
 #include "vmcs_shadow_fields.h"
 			/*
 			 * L1 can read these fields without exiting, ensure the
diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h
index 3a742428ad17..9cd26099fcc0 100644
--- a/arch/x86/kvm/vmx/vmcs12.h
+++ b/arch/x86/kvm/vmx/vmcs12.h
@@ -394,69 +394,48 @@ static inline short vmcs_field_to_offset(unsigned long field)
 
 #undef ROL16
 
-/*
- * Read a vmcs12 field. Since these can have varying lengths and we return
- * one type, we chose the biggest type (u64) and zero-extend the return value
- * to that size. Note that the caller, handle_vmread, might need to use only
- * some of the bits we return here (e.g., on 32-bit guests, only 32 bits of
- * 64-bit fields are to be returned).
- */
-static inline int vmcs12_read_any(struct vmcs12 *vmcs12,
-				  unsigned long field, u64 *ret)
+static inline u64 vmcs12_read_any(struct vmcs12 *vmcs12, unsigned long field,
+				  u16 offset)
 {
-	short offset = vmcs_field_to_offset(field);
-	char *p;
-
-	if (offset < 0)
-		return offset;
-
-	p = (char *)vmcs12 + offset;
+	char *p = (char *)vmcs12 + offset;
 
 	switch (vmcs_field_width(field)) {
 	case VMCS_FIELD_WIDTH_NATURAL_WIDTH:
-		*ret = *((natural_width *)p);
-		return 0;
+		return *((natural_width *)p);
 	case VMCS_FIELD_WIDTH_U16:
-		*ret = *((u16 *)p);
-		return 0;
+		return *((u16 *)p);
 	case VMCS_FIELD_WIDTH_U32:
-		*ret = *((u32 *)p);
-		return 0;
+		return *((u32 *)p);
 	case VMCS_FIELD_WIDTH_U64:
-		*ret = *((u64 *)p);
-		return 0;
+		return *((u64 *)p);
 	default:
-		WARN_ON(1);
-		return -ENOENT;
+		WARN_ON_ONCE(1);
+		return -1;
 	}
 }
 
-static inline int vmcs12_write_any(struct vmcs12 *vmcs12,
-				   unsigned long field, u64 field_value){
-	short offset = vmcs_field_to_offset(field);
+static inline void vmcs12_write_any(struct vmcs12 *vmcs12, unsigned long field,
+				    u16 offset, u64 field_value)
+{
 	char *p = (char *)vmcs12 + offset;
 
-	if (offset < 0)
-		return offset;
-
 	switch (vmcs_field_width(field)) {
 	case VMCS_FIELD_WIDTH_U16:
 		*(u16 *)p = field_value;
-		return 0;
+		break;
 	case VMCS_FIELD_WIDTH_U32:
 		*(u32 *)p = field_value;
-		return 0;
+		break;
 	case VMCS_FIELD_WIDTH_U64:
 		*(u64 *)p = field_value;
-		return 0;
+		break;
 	case VMCS_FIELD_WIDTH_NATURAL_WIDTH:
 		*(natural_width *)p = field_value;
-		return 0;
+		break;
 	default:
-		WARN_ON(1);
-		return -ENOENT;
+		WARN_ON_ONCE(1);
+		break;
 	}
-
 }
 
 #endif /* __KVM_X86_VMX_VMCS12_H */
diff --git a/arch/x86/kvm/vmx/vmcs_shadow_fields.h b/arch/x86/kvm/vmx/vmcs_shadow_fields.h
index 97dd5295be31..2cfa19ca158e 100644
--- a/arch/x86/kvm/vmx/vmcs_shadow_fields.h
+++ b/arch/x86/kvm/vmx/vmcs_shadow_fields.h
@@ -1,8 +1,8 @@
 #ifndef SHADOW_FIELD_RO
-#define SHADOW_FIELD_RO(x)
+#define SHADOW_FIELD_RO(x, y)
 #endif
 #ifndef SHADOW_FIELD_RW
-#define SHADOW_FIELD_RW(x)
+#define SHADOW_FIELD_RW(x, y)
 #endif
 
 /*
@@ -28,47 +28,47 @@
  */
 
 /* 16-bits */
-SHADOW_FIELD_RW(GUEST_INTR_STATUS)
-SHADOW_FIELD_RW(GUEST_PML_INDEX)
-SHADOW_FIELD_RW(HOST_FS_SELECTOR)
-SHADOW_FIELD_RW(HOST_GS_SELECTOR)
+SHADOW_FIELD_RW(GUEST_INTR_STATUS, guest_intr_status)
+SHADOW_FIELD_RW(GUEST_PML_INDEX, guest_pml_index)
+SHADOW_FIELD_RW(HOST_FS_SELECTOR, host_fs_selector)
+SHADOW_FIELD_RW(HOST_GS_SELECTOR, host_gs_selector)
 
 /* 32-bits */
-SHADOW_FIELD_RO(VM_EXIT_REASON)
-SHADOW_FIELD_RO(VM_EXIT_INTR_INFO)
-SHADOW_FIELD_RO(VM_EXIT_INSTRUCTION_LEN)
-SHADOW_FIELD_RO(IDT_VECTORING_INFO_FIELD)
-SHADOW_FIELD_RO(IDT_VECTORING_ERROR_CODE)
-SHADOW_FIELD_RO(VM_EXIT_INTR_ERROR_CODE)
-SHADOW_FIELD_RO(GUEST_CS_AR_BYTES)
-SHADOW_FIELD_RO(GUEST_SS_AR_BYTES)
-SHADOW_FIELD_RW(CPU_BASED_VM_EXEC_CONTROL)
-SHADOW_FIELD_RW(EXCEPTION_BITMAP)
-SHADOW_FIELD_RW(VM_ENTRY_EXCEPTION_ERROR_CODE)
-SHADOW_FIELD_RW(VM_ENTRY_INTR_INFO_FIELD)
-SHADOW_FIELD_RW(VM_ENTRY_INSTRUCTION_LEN)
-SHADOW_FIELD_RW(TPR_THRESHOLD)
-SHADOW_FIELD_RW(GUEST_INTERRUPTIBILITY_INFO)
-SHADOW_FIELD_RW(VMX_PREEMPTION_TIMER_VALUE)
+SHADOW_FIELD_RO(VM_EXIT_REASON, vm_exit_reason)
+SHADOW_FIELD_RO(VM_EXIT_INTR_INFO, vm_exit_intr_info)
+SHADOW_FIELD_RO(VM_EXIT_INSTRUCTION_LEN, vm_exit_instruction_len)
+SHADOW_FIELD_RO(IDT_VECTORING_INFO_FIELD, idt_vectoring_info_field)
+SHADOW_FIELD_RO(IDT_VECTORING_ERROR_CODE, idt_vectoring_error_code)
+SHADOW_FIELD_RO(VM_EXIT_INTR_ERROR_CODE, vm_exit_intr_error_code)
+SHADOW_FIELD_RO(GUEST_CS_AR_BYTES, guest_cs_ar_bytes)
+SHADOW_FIELD_RO(GUEST_SS_AR_BYTES, guest_ss_ar_bytes)
+SHADOW_FIELD_RW(CPU_BASED_VM_EXEC_CONTROL, cpu_based_vm_exec_control)
+SHADOW_FIELD_RW(EXCEPTION_BITMAP, exception_bitmap)
+SHADOW_FIELD_RW(VM_ENTRY_EXCEPTION_ERROR_CODE, vm_entry_exception_error_code)
+SHADOW_FIELD_RW(VM_ENTRY_INTR_INFO_FIELD, vm_entry_intr_info_field)
+SHADOW_FIELD_RW(VM_ENTRY_INSTRUCTION_LEN, vm_entry_instruction_len)
+SHADOW_FIELD_RW(TPR_THRESHOLD, tpr_threshold)
+SHADOW_FIELD_RW(GUEST_INTERRUPTIBILITY_INFO, guest_interruptibility_info)
+SHADOW_FIELD_RW(VMX_PREEMPTION_TIMER_VALUE, vmx_preemption_timer_value)
 
 /* Natural width */
-SHADOW_FIELD_RO(EXIT_QUALIFICATION)
-SHADOW_FIELD_RO(GUEST_LINEAR_ADDRESS)
-SHADOW_FIELD_RW(GUEST_RIP)
-SHADOW_FIELD_RW(GUEST_RSP)
-SHADOW_FIELD_RW(GUEST_CR0)
-SHADOW_FIELD_RW(GUEST_CR3)
-SHADOW_FIELD_RW(GUEST_CR4)
-SHADOW_FIELD_RW(GUEST_RFLAGS)
-SHADOW_FIELD_RW(CR0_GUEST_HOST_MASK)
-SHADOW_FIELD_RW(CR0_READ_SHADOW)
-SHADOW_FIELD_RW(CR4_READ_SHADOW)
-SHADOW_FIELD_RW(HOST_FS_BASE)
-SHADOW_FIELD_RW(HOST_GS_BASE)
+SHADOW_FIELD_RO(EXIT_QUALIFICATION, exit_qualification)
+SHADOW_FIELD_RO(GUEST_LINEAR_ADDRESS, guest_linear_address)
+SHADOW_FIELD_RW(GUEST_RIP, guest_rip)
+SHADOW_FIELD_RW(GUEST_RSP, guest_rsp)
+SHADOW_FIELD_RW(GUEST_CR0, guest_cr0)
+SHADOW_FIELD_RW(GUEST_CR3, guest_cr3)
+SHADOW_FIELD_RW(GUEST_CR4, guest_cr4)
+SHADOW_FIELD_RW(GUEST_RFLAGS, guest_rflags)
+SHADOW_FIELD_RW(CR0_GUEST_HOST_MASK, cr0_guest_host_mask)
+SHADOW_FIELD_RW(CR0_READ_SHADOW, cr0_read_shadow)
+SHADOW_FIELD_RW(CR4_READ_SHADOW, cr4_read_shadow)
+SHADOW_FIELD_RW(HOST_FS_BASE, host_fs_base)
+SHADOW_FIELD_RW(HOST_GS_BASE, host_gs_base)
 
 /* 64-bit */
-SHADOW_FIELD_RO(GUEST_PHYSICAL_ADDRESS)
-SHADOW_FIELD_RO(GUEST_PHYSICAL_ADDRESS_HIGH)
+SHADOW_FIELD_RO(GUEST_PHYSICAL_ADDRESS, guest_physical_address)
+SHADOW_FIELD_RO(GUEST_PHYSICAL_ADDRESS_HIGH, guest_physical_address)
 
 #undef SHADOW_FIELD_RO
 #undef SHADOW_FIELD_RW
-- 
2.21.0


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 4/7] KVM: nVMX: Lift sync_vmcs12() out of prepare_vmcs12()
  2019-05-07 15:36 [PATCH 0/7] KVM: nVMX: Optimize VMCS data copying Sean Christopherson
                   ` (2 preceding siblings ...)
  2019-05-07 15:36 ` [PATCH 3/7] KVM: nVMX: Track vmcs12 offsets for shadowed VMCS fields Sean Christopherson
@ 2019-05-07 15:36 ` Sean Christopherson
  2019-05-07 15:36 ` [PATCH 5/7] KVM: nVMX: Use descriptive names for VMCS sync functions and flags Sean Christopherson
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Sean Christopherson @ 2019-05-07 15:36 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář; +Cc: kvm, Jim Mattson, Liran Alon

... to make it more obvious that sync_vmcs12() is invoked on all nested
VM-Exits, e.g. hiding sync_vmcs12() in prepare_vmcs12() makes it appear
that guest state is NOT propagated to vmcs12 for a normal VM-Exit.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/nested.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 4528c8ff3c9c..0eee7894b453 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3497,11 +3497,7 @@ static void prepare_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
 			   u32 exit_reason, u32 exit_intr_info,
 			   unsigned long exit_qualification)
 {
-	/* update guest state fields: */
-	sync_vmcs12(vcpu, vmcs12);
-
 	/* update exit information fields: */
-
 	vmcs12->vm_exit_reason = exit_reason;
 	vmcs12->exit_qualification = exit_qualification;
 	vmcs12->vm_exit_intr_info = exit_intr_info;
@@ -3862,9 +3858,9 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 		vcpu->arch.tsc_offset -= vmcs12->tsc_offset;
 
 	if (likely(!vmx->fail)) {
-		if (exit_reason == -1)
-			sync_vmcs12(vcpu, vmcs12);
-		else
+		sync_vmcs12(vcpu, vmcs12);
+
+		if (exit_reason != -1)
 			prepare_vmcs12(vcpu, vmcs12, exit_reason, exit_intr_info,
 				       exit_qualification);
 
-- 
2.21.0


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 5/7] KVM: nVMX: Use descriptive names for VMCS sync functions and flags
  2019-05-07 15:36 [PATCH 0/7] KVM: nVMX: Optimize VMCS data copying Sean Christopherson
                   ` (3 preceding siblings ...)
  2019-05-07 15:36 ` [PATCH 4/7] KVM: nVMX: Lift sync_vmcs12() out of prepare_vmcs12() Sean Christopherson
@ 2019-05-07 15:36 ` Sean Christopherson
  2019-05-07 15:36 ` [PATCH 6/7] KVM: nVMX: Add helpers to identify shadowed VMCS fields Sean Christopherson
  2019-05-07 15:36 ` [PATCH 7/7] KVM: nVMX: Sync rarely accessed guest fields only when needed Sean Christopherson
  6 siblings, 0 replies; 15+ messages in thread
From: Sean Christopherson @ 2019-05-07 15:36 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář; +Cc: kvm, Jim Mattson, Liran Alon

Nested virtualization involves copying data between many different types
of VMCSes, e.g. vmcs02, vmcs12, shadow VMCS and eVMCS.  Rename a variety
of functions and flags to document both the source and destination of
each sync.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/nested.c | 28 ++++++++++++++--------------
 arch/x86/kvm/vmx/nested.h |  2 +-
 arch/x86/kvm/vmx/vmx.c    |  4 ++--
 arch/x86/kvm/vmx/vmx.h    |  2 +-
 4 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 0eee7894b453..632e7e4324f3 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1612,7 +1612,7 @@ static int copy_vmcs12_to_enlightened(struct vcpu_vmx *vmx)
 	 * evmcs->host_gdtr_base = vmcs12->host_gdtr_base;
 	 * evmcs->host_idtr_base = vmcs12->host_idtr_base;
 	 * evmcs->host_rsp = vmcs12->host_rsp;
-	 * sync_vmcs12() doesn't read these:
+	 * sync_vmcs02_to_vmcs12() doesn't read these:
 	 * evmcs->io_bitmap_a = vmcs12->io_bitmap_a;
 	 * evmcs->io_bitmap_b = vmcs12->io_bitmap_b;
 	 * evmcs->msr_bitmap = vmcs12->msr_bitmap;
@@ -1836,7 +1836,7 @@ static int nested_vmx_handle_enlightened_vmptrld(struct kvm_vcpu *vcpu,
 	return 1;
 }
 
-void nested_sync_from_vmcs12(struct kvm_vcpu *vcpu)
+void nested_sync_vmcs12_to_shadow(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
@@ -1857,7 +1857,7 @@ void nested_sync_from_vmcs12(struct kvm_vcpu *vcpu)
 		copy_vmcs12_to_shadow(vmx);
 	}
 
-	vmx->nested.need_vmcs12_sync = false;
+	vmx->nested.need_vmcs12_to_shadow_sync = false;
 }
 
 static enum hrtimer_restart vmx_preemption_timer_fn(struct hrtimer *timer)
@@ -3039,7 +3039,7 @@ int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, bool from_vmentry)
 	vmcs12->vm_exit_reason = exit_reason | VMX_EXIT_REASONS_FAILED_VMENTRY;
 	vmcs12->exit_qualification = exit_qual;
 	if (enable_shadow_vmcs || vmx->nested.hv_evmcs)
-		vmx->nested.need_vmcs12_sync = true;
+		vmx->nested.need_vmcs12_to_shadow_sync = true;
 	return 1;
 }
 
@@ -3379,7 +3379,7 @@ static u32 vmx_get_preemption_timer_value(struct kvm_vcpu *vcpu)
  * VM-entry controls is also updated, since this is really a guest
  * state bit.)
  */
-static void sync_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
+static void sync_vmcs02_to_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 {
 	vmcs12->guest_cr0 = vmcs12_guest_cr0(vcpu, vmcs12);
 	vmcs12->guest_cr4 = vmcs12_guest_cr4(vcpu, vmcs12);
@@ -3858,14 +3858,14 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 		vcpu->arch.tsc_offset -= vmcs12->tsc_offset;
 
 	if (likely(!vmx->fail)) {
-		sync_vmcs12(vcpu, vmcs12);
+		sync_vmcs02_to_vmcs12(vcpu, vmcs12);
 
 		if (exit_reason != -1)
 			prepare_vmcs12(vcpu, vmcs12, exit_reason, exit_intr_info,
 				       exit_qualification);
 
 		/*
-		 * Must happen outside of sync_vmcs12() as it will
+		 * Must happen outside of sync_vmcs02_to_vmcs12() as it will
 		 * also be used to capture vmcs12 cache as part of
 		 * capturing nVMX state for snapshot (migration).
 		 *
@@ -3921,7 +3921,7 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 	kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu);
 
 	if ((exit_reason != -1) && (enable_shadow_vmcs || vmx->nested.hv_evmcs))
-		vmx->nested.need_vmcs12_sync = true;
+		vmx->nested.need_vmcs12_to_shadow_sync = true;
 
 	/* in case we halted in L2 */
 	vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
@@ -4280,7 +4280,7 @@ static inline void nested_release_vmcs12(struct kvm_vcpu *vcpu)
 		/* copy to memory all shadowed fields in case
 		   they were modified */
 		copy_shadow_to_vmcs12(vmx);
-		vmx->nested.need_vmcs12_sync = false;
+		vmx->nested.need_vmcs12_to_shadow_sync = false;
 		vmx_disable_shadow_vmcs(vmx);
 	}
 	vmx->nested.posted_intr_nv = -1;
@@ -4545,7 +4545,7 @@ static void set_current_vmptr(struct vcpu_vmx *vmx, gpa_t vmptr)
 			      SECONDARY_EXEC_SHADOW_VMCS);
 		vmcs_write64(VMCS_LINK_POINTER,
 			     __pa(vmx->vmcs01.shadow_vmcs));
-		vmx->nested.need_vmcs12_sync = true;
+		vmx->nested.need_vmcs12_to_shadow_sync = true;
 	}
 	vmx->nested.dirty_vmcs12 = true;
 }
@@ -5296,12 +5296,12 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
 	 * When running L2, the authoritative vmcs12 state is in the
 	 * vmcs02. When running L1, the authoritative vmcs12 state is
 	 * in the shadow or enlightened vmcs linked to vmcs01, unless
-	 * need_vmcs12_sync is set, in which case, the authoritative
+	 * need_vmcs12_to_shadow_sync is set, in which case, the authoritative
 	 * vmcs12 state is in the vmcs12 already.
 	 */
 	if (is_guest_mode(vcpu)) {
-		sync_vmcs12(vcpu, vmcs12);
-	} else if (!vmx->nested.need_vmcs12_sync) {
+		sync_vmcs02_to_vmcs12(vcpu, vmcs12);
+	} else if (!vmx->nested.need_vmcs12_to_shadow_sync) {
 		if (vmx->nested.hv_evmcs)
 			copy_enlightened_to_vmcs12(vmx);
 		else if (enable_shadow_vmcs)
@@ -5414,7 +5414,7 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
 		 * Sync eVMCS upon entry as we may not have
 		 * HV_X64_MSR_VP_ASSIST_PAGE set up yet.
 		 */
-		vmx->nested.need_vmcs12_sync = true;
+		vmx->nested.need_vmcs12_to_shadow_sync = true;
 	} else {
 		return -EINVAL;
 	}
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index e847ff1019a2..8688262a15ed 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -17,7 +17,7 @@ int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, bool from_vmentry);
 bool nested_vmx_exit_reflected(struct kvm_vcpu *vcpu, u32 exit_reason);
 void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 		       u32 exit_intr_info, unsigned long exit_qualification);
-void nested_sync_from_vmcs12(struct kvm_vcpu *vcpu);
+void nested_sync_vmcs12_to_shadow(struct kvm_vcpu *vcpu);
 int vmx_set_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data);
 int vmx_get_vmx_msr(struct nested_vmx_msrs *msrs, u32 msr_index, u64 *pdata);
 int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 60306f19105d..91f55ff65345 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6392,8 +6392,8 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
 		vmcs_write32(PLE_WINDOW, vmx->ple_window);
 	}
 
-	if (vmx->nested.need_vmcs12_sync)
-		nested_sync_from_vmcs12(vcpu);
+	if (vmx->nested.need_vmcs12_to_shadow_sync)
+		nested_sync_vmcs12_to_shadow(vcpu);
 
 	if (test_bit(VCPU_REGS_RSP, (unsigned long *)&vcpu->arch.regs_dirty))
 		vmcs_writel(GUEST_RSP, vcpu->arch.regs[VCPU_REGS_RSP]);
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 63d37ccce3dc..16210dde0374 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -113,7 +113,7 @@ struct nested_vmx {
 	 * Indicates if the shadow vmcs or enlightened vmcs must be updated
 	 * with the data held by struct vmcs12.
 	 */
-	bool need_vmcs12_sync;
+	bool need_vmcs12_to_shadow_sync;
 	bool dirty_vmcs12;
 
 	/*
-- 
2.21.0


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 6/7] KVM: nVMX: Add helpers to identify shadowed VMCS fields
  2019-05-07 15:36 [PATCH 0/7] KVM: nVMX: Optimize VMCS data copying Sean Christopherson
                   ` (4 preceding siblings ...)
  2019-05-07 15:36 ` [PATCH 5/7] KVM: nVMX: Use descriptive names for VMCS sync functions and flags Sean Christopherson
@ 2019-05-07 15:36 ` Sean Christopherson
  2019-05-07 15:36 ` [PATCH 7/7] KVM: nVMX: Sync rarely accessed guest fields only when needed Sean Christopherson
  6 siblings, 0 replies; 15+ messages in thread
From: Sean Christopherson @ 2019-05-07 15:36 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář; +Cc: kvm, Jim Mattson, Liran Alon

So that future optimizations related to shadowed fields don't need to
define their own switch statement.

Add a BUILD_BUG_ON() to ensure at least one of the types (RW vs RO) is
defined when including vmcs_shadow_fields.h (guess who keeps mistyping
SHADOW_FIELD_RO as SHADOW_FIELD_R0).

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/nested.c             | 71 +++++++++++++++------------
 arch/x86/kvm/vmx/vmcs_shadow_fields.h |  4 ++
 2 files changed, 44 insertions(+), 31 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 632e7e4324f3..279961a63db2 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4415,6 +4415,29 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
 	return nested_vmx_succeed(vcpu);
 }
 
+static bool is_shadow_field_rw(unsigned long field)
+{
+	switch (field) {
+#define SHADOW_FIELD_RW(x, y) case x:
+#include "vmcs_shadow_fields.h"
+		return true;
+	default:
+		break;
+	}
+	return false;
+}
+
+static bool is_shadow_field_ro(unsigned long field)
+{
+	switch (field) {
+#define SHADOW_FIELD_RO(x, y) case x:
+#include "vmcs_shadow_fields.h"
+		return true;
+	default:
+		break;
+	}
+	return false;
+}
 
 static int handle_vmwrite(struct kvm_vcpu *vcpu)
 {
@@ -4497,41 +4520,27 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)
 	vmcs12_write_any(vmcs12, field, offset, field_value);
 
 	/*
-	 * Do not track vmcs12 dirty-state if in guest-mode
-	 * as we actually dirty shadow vmcs12 instead of vmcs12.
+	 * Do not track vmcs12 dirty-state if in guest-mode as we actually
+	 * dirty shadow vmcs12 instead of vmcs12.  Fields that can be updated
+	 * by L1 without a vmexit are always updated in the vmcs02, i.e' don't
+	 * "dirty" vmcs12, all others go down the prepare_vmcs02() slow path.
 	 */
-	if (!is_guest_mode(vcpu)) {
-		switch (field) {
-#define SHADOW_FIELD_RW(x, y) case x:
-#include "vmcs_shadow_fields.h"
-			/*
-			 * The fields that can be updated by L1 without a vmexit are
-			 * always updated in the vmcs02, the others go down the slow
-			 * path of prepare_vmcs02.
-			 */
-			break;
+	if (!is_guest_mode(vcpu) && !is_shadow_field_rw(field)) {
+		/*
+		 * L1 can read these fields without exiting, ensure the
+		 * shadow VMCS is up-to-date.
+		 */
+		if (enable_shadow_vmcs && is_shadow_field_ro(field)) {
+			preempt_disable();
+			vmcs_load(vmx->vmcs01.shadow_vmcs);
 
-#define SHADOW_FIELD_RO(x, y) case x:
-#include "vmcs_shadow_fields.h"
-			/*
-			 * L1 can read these fields without exiting, ensure the
-			 * shadow VMCS is up-to-date.
-			 */
-			if (enable_shadow_vmcs) {
-				preempt_disable();
-				vmcs_load(vmx->vmcs01.shadow_vmcs);
+			__vmcs_writel(field, field_value);
 
-				__vmcs_writel(field, field_value);
-
-				vmcs_clear(vmx->vmcs01.shadow_vmcs);
-				vmcs_load(vmx->loaded_vmcs->vmcs);
-				preempt_enable();
-			}
-			/* fall through */
-		default:
-			vmx->nested.dirty_vmcs12 = true;
-			break;
+			vmcs_clear(vmx->vmcs01.shadow_vmcs);
+			vmcs_load(vmx->loaded_vmcs->vmcs);
+			preempt_enable();
 		}
+		vmx->nested.dirty_vmcs12 = true;
 	}
 
 	return nested_vmx_succeed(vcpu);
diff --git a/arch/x86/kvm/vmx/vmcs_shadow_fields.h b/arch/x86/kvm/vmx/vmcs_shadow_fields.h
index 2cfa19ca158e..4cea018ba285 100644
--- a/arch/x86/kvm/vmx/vmcs_shadow_fields.h
+++ b/arch/x86/kvm/vmx/vmcs_shadow_fields.h
@@ -1,3 +1,7 @@
+#if !defined(SHADOW_FIELD_RO) && !defined(SHADOW_FIELD_RW)
+BUILD_BUG_ON(1)
+#endif
+
 #ifndef SHADOW_FIELD_RO
 #define SHADOW_FIELD_RO(x, y)
 #endif
-- 
2.21.0


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 7/7] KVM: nVMX: Sync rarely accessed guest fields only when needed
  2019-05-07 15:36 [PATCH 0/7] KVM: nVMX: Optimize VMCS data copying Sean Christopherson
                   ` (5 preceding siblings ...)
  2019-05-07 15:36 ` [PATCH 6/7] KVM: nVMX: Add helpers to identify shadowed VMCS fields Sean Christopherson
@ 2019-05-07 15:36 ` Sean Christopherson
  2019-06-06 15:29   ` Paolo Bonzini
  6 siblings, 1 reply; 15+ messages in thread
From: Sean Christopherson @ 2019-05-07 15:36 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář; +Cc: kvm, Jim Mattson, Liran Alon

Many guest fields are rarely read (or written) by VMMs, i.e. likely
aren't accessed between runs of a nested VMCS.  Delay pulling rarely
accessed guest fields from vmcs02 until they are VMREAD or until vmcs12
is dirtied.  The latter case is necessary because nested VM-Entry will
consume all manner of fields when vmcs12 is dirty, e.g. for consistency
checks.

Note, an alternative to synchronizing all guest fields on VMREAD would
be to read *only* the field being accessed, but switching VMCS pointers
is expensive and odds are good if one guest field is being accessed then
others will soon follow, or that vmcs12 will be dirtied due to a VMWRITE
(see above).  And the full synchronization results in slightly cleaner
code.

Note, although GUEST_PDPTRs are relevant only for a 32-bit PAE guest,
they are accessed quite frequently for said guests, and a separate patch
is in flight to optimize away GUEST_PDTPR synchronziation for non-PAE
guests.

Skipping rarely accessed guest fields reduces the latency of a nested
VM-Exit by ~200 cycles.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/nested.c | 140 ++++++++++++++++++++++++++++++++------
 arch/x86/kvm/vmx/vmx.h    |   6 ++
 2 files changed, 125 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 279961a63db2..0ff85c88e2eb 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3373,21 +3373,56 @@ static u32 vmx_get_preemption_timer_value(struct kvm_vcpu *vcpu)
 	return value >> VMX_MISC_EMULATED_PREEMPTION_TIMER_RATE;
 }
 
-/*
- * Update the guest state fields of vmcs12 to reflect changes that
- * occurred while L2 was running. (The "IA-32e mode guest" bit of the
- * VM-entry controls is also updated, since this is really a guest
- * state bit.)
- */
-static void sync_vmcs02_to_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
+static bool is_vmcs12_ext_field(unsigned long field)
 {
-	vmcs12->guest_cr0 = vmcs12_guest_cr0(vcpu, vmcs12);
-	vmcs12->guest_cr4 = vmcs12_guest_cr4(vcpu, vmcs12);
+	switch (field) {
+	case GUEST_ES_SELECTOR:
+	case GUEST_CS_SELECTOR:
+	case GUEST_SS_SELECTOR:
+	case GUEST_DS_SELECTOR:
+	case GUEST_FS_SELECTOR:
+	case GUEST_GS_SELECTOR:
+	case GUEST_LDTR_SELECTOR:
+	case GUEST_TR_SELECTOR:
+	case GUEST_ES_LIMIT:
+	case GUEST_CS_LIMIT:
+	case GUEST_SS_LIMIT:
+	case GUEST_DS_LIMIT:
+	case GUEST_FS_LIMIT:
+	case GUEST_GS_LIMIT:
+	case GUEST_LDTR_LIMIT:
+	case GUEST_TR_LIMIT:
+	case GUEST_GDTR_LIMIT:
+	case GUEST_IDTR_LIMIT:
+	case GUEST_ES_AR_BYTES:
+	case GUEST_DS_AR_BYTES:
+	case GUEST_FS_AR_BYTES:
+	case GUEST_GS_AR_BYTES:
+	case GUEST_LDTR_AR_BYTES:
+	case GUEST_TR_AR_BYTES:
+	case GUEST_ES_BASE:
+	case GUEST_CS_BASE:
+	case GUEST_SS_BASE:
+	case GUEST_DS_BASE:
+	case GUEST_FS_BASE:
+	case GUEST_GS_BASE:
+	case GUEST_LDTR_BASE:
+	case GUEST_TR_BASE:
+	case GUEST_GDTR_BASE:
+	case GUEST_IDTR_BASE:
+	case GUEST_PENDING_DBG_EXCEPTIONS:
+	case GUEST_BNDCFGS:
+		return true;
+	default:
+		break;
+	}
 
-	vmcs12->guest_rsp = kvm_rsp_read(vcpu);
-	vmcs12->guest_rip = kvm_rip_read(vcpu);
-	vmcs12->guest_rflags = vmcs_readl(GUEST_RFLAGS);
+	return false;
+}
 
+static void __sync_vmcs02_to_vmcs12_ext(struct kvm_vcpu *vcpu,
+					struct vmcs12 *vmcs12)
+{
 	vmcs12->guest_es_selector = vmcs_read16(GUEST_ES_SELECTOR);
 	vmcs12->guest_cs_selector = vmcs_read16(GUEST_CS_SELECTOR);
 	vmcs12->guest_ss_selector = vmcs_read16(GUEST_SS_SELECTOR);
@@ -3407,8 +3442,6 @@ static void sync_vmcs02_to_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 	vmcs12->guest_gdtr_limit = vmcs_read32(GUEST_GDTR_LIMIT);
 	vmcs12->guest_idtr_limit = vmcs_read32(GUEST_IDTR_LIMIT);
 	vmcs12->guest_es_ar_bytes = vmcs_read32(GUEST_ES_AR_BYTES);
-	vmcs12->guest_cs_ar_bytes = vmcs_read32(GUEST_CS_AR_BYTES);
-	vmcs12->guest_ss_ar_bytes = vmcs_read32(GUEST_SS_AR_BYTES);
 	vmcs12->guest_ds_ar_bytes = vmcs_read32(GUEST_DS_AR_BYTES);
 	vmcs12->guest_fs_ar_bytes = vmcs_read32(GUEST_FS_AR_BYTES);
 	vmcs12->guest_gs_ar_bytes = vmcs_read32(GUEST_GS_AR_BYTES);
@@ -3424,11 +3457,65 @@ static void sync_vmcs02_to_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 	vmcs12->guest_tr_base = vmcs_readl(GUEST_TR_BASE);
 	vmcs12->guest_gdtr_base = vmcs_readl(GUEST_GDTR_BASE);
 	vmcs12->guest_idtr_base = vmcs_readl(GUEST_IDTR_BASE);
-
-	vmcs12->guest_interruptibility_info =
-		vmcs_read32(GUEST_INTERRUPTIBILITY_INFO);
 	vmcs12->guest_pending_dbg_exceptions =
 		vmcs_readl(GUEST_PENDING_DBG_EXCEPTIONS);
+	if (kvm_mpx_supported())
+		vmcs12->guest_bndcfgs = vmcs_read64(GUEST_BNDCFGS);
+}
+
+static void sync_vmcs02_to_vmcs12_ext(struct kvm_vcpu *vcpu,
+				      struct vmcs12 *vmcs12)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	int cpu;
+
+	if (!vmx->nested.need_vmcs02_to_vmcs12_ext_sync)
+		return;
+
+
+	WARN_ON_ONCE(vmx->loaded_vmcs != &vmx->vmcs01);
+
+	cpu = get_cpu();
+	vmx->loaded_vmcs = &vmx->nested.vmcs02;
+	vmx_vcpu_load(&vmx->vcpu, cpu);
+
+	__sync_vmcs02_to_vmcs12_ext(vcpu, vmcs12);
+
+	vmx->loaded_vmcs = &vmx->vmcs01;
+	vmx_vcpu_load(&vmx->vcpu, cpu);
+	put_cpu();
+
+	vmx->nested.need_vmcs02_to_vmcs12_ext_sync = false;
+}
+
+/*
+ * Update the guest state fields of vmcs12 to reflect changes that
+ * occurred while L2 was running. (The "IA-32e mode guest" bit of the
+ * VM-entry controls is also updated, since this is really a guest
+ * state bit.)
+ */
+static void sync_vmcs02_to_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	if (vmx->nested.hv_evmcs)
+		__sync_vmcs02_to_vmcs12_ext(vcpu, vmcs12);
+
+	vmx->nested.need_vmcs02_to_vmcs12_ext_sync = !vmx->nested.hv_evmcs;
+
+	vmcs12->guest_cr0 = vmcs12_guest_cr0(vcpu, vmcs12);
+	vmcs12->guest_cr4 = vmcs12_guest_cr4(vcpu, vmcs12);
+
+	vmcs12->guest_rsp = kvm_rsp_read(vcpu);
+	vmcs12->guest_rip = kvm_rip_read(vcpu);
+	vmcs12->guest_rflags = vmcs_readl(GUEST_RFLAGS);
+
+	vmcs12->guest_cs_ar_bytes = vmcs_read32(GUEST_CS_AR_BYTES);
+	vmcs12->guest_ss_ar_bytes = vmcs_read32(GUEST_SS_AR_BYTES);
+
+	vmcs12->guest_interruptibility_info =
+		vmcs_read32(GUEST_INTERRUPTIBILITY_INFO);
+
 	if (vcpu->arch.mp_state == KVM_MP_STATE_HALTED)
 		vmcs12->guest_activity_state = GUEST_ACTIVITY_HLT;
 	else
@@ -3478,8 +3565,6 @@ static void sync_vmcs02_to_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 	vmcs12->guest_sysenter_cs = vmcs_read32(GUEST_SYSENTER_CS);
 	vmcs12->guest_sysenter_esp = vmcs_readl(GUEST_SYSENTER_ESP);
 	vmcs12->guest_sysenter_eip = vmcs_readl(GUEST_SYSENTER_EIP);
-	if (kvm_mpx_supported())
-		vmcs12->guest_bndcfgs = vmcs_read64(GUEST_BNDCFGS);
 }
 
 /*
@@ -4276,6 +4361,8 @@ static inline void nested_release_vmcs12(struct kvm_vcpu *vcpu)
 	if (vmx->nested.current_vmptr == -1ull)
 		return;
 
+	sync_vmcs02_to_vmcs12_ext(vcpu, get_vmcs12(vcpu));
+
 	if (enable_shadow_vmcs) {
 		/* copy to memory all shadowed fields in case
 		   they were modified */
@@ -4392,6 +4479,9 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
 		return nested_vmx_failValid(vcpu,
 			VMXERR_UNSUPPORTED_VMCS_COMPONENT);
 
+	if (!is_guest_mode(vcpu) && is_vmcs12_ext_field(field))
+		sync_vmcs02_to_vmcs12_ext(vcpu, vmcs12);
+
 	/* Read the field, zero-extended to a u64 field_value */
 	field_value = vmcs12_read_any(vmcs12, field, offset);
 
@@ -4489,9 +4579,16 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)
 		return nested_vmx_failValid(vcpu,
 			VMXERR_VMWRITE_READ_ONLY_VMCS_COMPONENT);
 
-	if (!is_guest_mode(vcpu))
+	if (!is_guest_mode(vcpu)) {
 		vmcs12 = get_vmcs12(vcpu);
-	else {
+
+		/*
+		 * Ensure vmcs12 is up-to-date before any VMWRITE that dirties
+		 * vmcs12, else we may crush a field or consume a stale value.
+		 */
+		if (!is_shadow_field_rw(field))
+			sync_vmcs02_to_vmcs12_ext(vcpu, vmcs12);
+	} else {
 		/*
 		 * When vmcs->vmcs_link_pointer is -1ull, any VMWRITE
 		 * to shadowed-field sets the ALU flags for VMfailInvalid.
@@ -5310,6 +5407,7 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
 	 */
 	if (is_guest_mode(vcpu)) {
 		sync_vmcs02_to_vmcs12(vcpu, vmcs12);
+		__sync_vmcs02_to_vmcs12_ext(vcpu, vmcs12);
 	} else if (!vmx->nested.need_vmcs12_to_shadow_sync) {
 		if (vmx->nested.hv_evmcs)
 			copy_enlightened_to_vmcs12(vmx);
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 16210dde0374..8b215c6840b4 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -123,6 +123,12 @@ struct nested_vmx {
 	 */
 	bool vmcs02_initialized;
 
+	/*
+	 * Indicates lazily loaded guest state has not yet been decached from
+	 * vmcs02.
+	 */
+	bool need_vmcs02_to_vmcs12_ext_sync;
+
 	bool change_vmcs01_virtual_apic_mode;
 
 	/*
-- 
2.21.0


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/7] KVM: nVMX: Intercept VMWRITEs to read-only shadow VMCS fields
  2019-05-07 15:36 ` [PATCH 1/7] KVM: nVMX: Intercept VMWRITEs to read-only shadow VMCS fields Sean Christopherson
@ 2019-06-06 13:26   ` Paolo Bonzini
  2019-06-13 17:02   ` Jim Mattson
  1 sibling, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2019-06-06 13:26 UTC (permalink / raw)
  To: Sean Christopherson, Radim Krčmář
  Cc: kvm, Jim Mattson, Liran Alon

On 07/05/19 17:36, Sean Christopherson wrote:
> Note, "writable" in this context means
> + * "writable by the guest", i.e. tagged SHADOW_FIELD_RW.  Note #2, the set of
> + * fields tagged SHADOW_FIELD_RO may or may not align with the "read-only"
> + * VM-exit information fields (which are actually writable if the vCPU is
> + * configured to support "VMWRITE to any supported field in the VMCS").


"There's a few more pages of P.S.'s here"
(https://youtu.be/rKlrTJ7WN18?t=170)

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/7] KVM: nVMX: Intercept VMWRITEs to GUEST_{CS,SS}_AR_BYTES
  2019-05-07 15:36 ` [PATCH 2/7] KVM: nVMX: Intercept VMWRITEs to GUEST_{CS,SS}_AR_BYTES Sean Christopherson
@ 2019-06-06 13:31   ` Paolo Bonzini
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2019-06-06 13:31 UTC (permalink / raw)
  To: Sean Christopherson, Radim Krčmář
  Cc: kvm, Jim Mattson, Liran Alon

On 07/05/19 17:36, Sean Christopherson wrote:
> Since the AR_BYTES emulation is done only for intercepted VMWRITE, if a
> future patch (re)exposed AR_BYTES for both VMWRITE and VMREAD, then KVM
> would end up with incosistent behavior on pre-Haswell hardware, e.g. KVM
> would drop the reserved bits on intercepted VMWRITE, but direct VMWRITE
> to the shadow VMCS would not drop the bits.

Whoever gets that WARN will have probably a hard time finding again all of this, so:

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index cd51ef68434e..8c5614957e04 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -93,7 +93,7 @@ static void init_vmcs_shadow_fields(void)
 
 		WARN_ONCE(field >= GUEST_ES_AR_BYTES &&
 			  field <= GUEST_TR_AR_BYTES,
-			  "Update vmcs12_write_any() to expose AR_BYTES RW");
+			  "Update vmcs12_write_any() to drop reserved bits from AR_BYTES");
 
 		/*
 		 * PML and the preemption timer can be emulated, but the


Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 7/7] KVM: nVMX: Sync rarely accessed guest fields only when needed
  2019-05-07 15:36 ` [PATCH 7/7] KVM: nVMX: Sync rarely accessed guest fields only when needed Sean Christopherson
@ 2019-06-06 15:29   ` Paolo Bonzini
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2019-06-06 15:29 UTC (permalink / raw)
  To: Sean Christopherson, Radim Krčmář
  Cc: kvm, Jim Mattson, Liran Alon

On 07/05/19 17:36, Sean Christopherson wrote:
> Many guest fields are rarely read (or written) by VMMs, i.e. likely
> aren't accessed between runs of a nested VMCS.  Delay pulling rarely
> accessed guest fields from vmcs02 until they are VMREAD or until vmcs12
> is dirtied.  The latter case is necessary because nested VM-Entry will
> consume all manner of fields when vmcs12 is dirty, e.g. for consistency
> checks.
> 
> Note, an alternative to synchronizing all guest fields on VMREAD would
> be to read *only* the field being accessed, but switching VMCS pointers
> is expensive and odds are good if one guest field is being accessed then
> others will soon follow, or that vmcs12 will be dirtied due to a VMWRITE
> (see above).  And the full synchronization results in slightly cleaner
> code.
> 
> Note, although GUEST_PDPTRs are relevant only for a 32-bit PAE guest,
> they are accessed quite frequently for said guests, and a separate patch
> is in flight to optimize away GUEST_PDTPR synchronziation for non-PAE
> guests.
> 
> Skipping rarely accessed guest fields reduces the latency of a nested
> VM-Exit by ~200 cycles.
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>

Just some naming improvements to be made here:

- __sync_vmcs02_to_vmcs12_ext -> sync_vmcs02_to_vmcs12_extra

- sync_vmcs02_to_vmcs12_ext -> copy_vmcs02_to_vmcs12_extra (copy functions
take care of vmptrld/vmclear; sync functions don't!)

- need_vmcs02_to_vmcs12_ext_sync -> need_sync_vmcs02_to_vmcs12_extra

and with this change, this follow-up becomes obvious:

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index fd8150ef6cce..b3249d071202 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3464,6 +3464,8 @@ static void __sync_vmcs02_to_vmcs12_ext(struct kvm_vcpu *vcpu,
 		vmcs_readl(GUEST_PENDING_DBG_EXCEPTIONS);
 	if (kvm_mpx_supported())
 		vmcs12->guest_bndcfgs = vmcs_read64(GUEST_BNDCFGS);
+
+	vmx->nested.need_sync_vmcs02_to_vmcs12_extra = false;
 }
 
 static void sync_vmcs02_to_vmcs12_ext(struct kvm_vcpu *vcpu,
@@ -3487,8 +3489,6 @@ static void sync_vmcs02_to_vmcs12_ext(struct kvm_vcpu *vcpu,
 	vmx->loaded_vmcs = &vmx->vmcs01;
 	vmx_vcpu_load(&vmx->vcpu, cpu);
 	put_cpu();
-
-	vmx->nested.need_vmcs02_to_vmcs12_ext_sync = false;
 }
 
 /*

Paolo

> ---
>  arch/x86/kvm/vmx/nested.c | 140 ++++++++++++++++++++++++++++++++------
>  arch/x86/kvm/vmx/vmx.h    |   6 ++
>  2 files changed, 125 insertions(+), 21 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 279961a63db2..0ff85c88e2eb 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -3373,21 +3373,56 @@ static u32 vmx_get_preemption_timer_value(struct kvm_vcpu *vcpu)
>  	return value >> VMX_MISC_EMULATED_PREEMPTION_TIMER_RATE;
>  }
>  
> -/*
> - * Update the guest state fields of vmcs12 to reflect changes that
> - * occurred while L2 was running. (The "IA-32e mode guest" bit of the
> - * VM-entry controls is also updated, since this is really a guest
> - * state bit.)
> - */
> -static void sync_vmcs02_to_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
> +static bool is_vmcs12_ext_field(unsigned long field)
>  {
> -	vmcs12->guest_cr0 = vmcs12_guest_cr0(vcpu, vmcs12);
> -	vmcs12->guest_cr4 = vmcs12_guest_cr4(vcpu, vmcs12);
> +	switch (field) {
> +	case GUEST_ES_SELECTOR:
> +	case GUEST_CS_SELECTOR:
> +	case GUEST_SS_SELECTOR:
> +	case GUEST_DS_SELECTOR:
> +	case GUEST_FS_SELECTOR:
> +	case GUEST_GS_SELECTOR:
> +	case GUEST_LDTR_SELECTOR:
> +	case GUEST_TR_SELECTOR:
> +	case GUEST_ES_LIMIT:
> +	case GUEST_CS_LIMIT:
> +	case GUEST_SS_LIMIT:
> +	case GUEST_DS_LIMIT:
> +	case GUEST_FS_LIMIT:
> +	case GUEST_GS_LIMIT:
> +	case GUEST_LDTR_LIMIT:
> +	case GUEST_TR_LIMIT:
> +	case GUEST_GDTR_LIMIT:
> +	case GUEST_IDTR_LIMIT:
> +	case GUEST_ES_AR_BYTES:
> +	case GUEST_DS_AR_BYTES:
> +	case GUEST_FS_AR_BYTES:
> +	case GUEST_GS_AR_BYTES:
> +	case GUEST_LDTR_AR_BYTES:
> +	case GUEST_TR_AR_BYTES:
> +	case GUEST_ES_BASE:
> +	case GUEST_CS_BASE:
> +	case GUEST_SS_BASE:
> +	case GUEST_DS_BASE:
> +	case GUEST_FS_BASE:
> +	case GUEST_GS_BASE:
> +	case GUEST_LDTR_BASE:
> +	case GUEST_TR_BASE:
> +	case GUEST_GDTR_BASE:
> +	case GUEST_IDTR_BASE:
> +	case GUEST_PENDING_DBG_EXCEPTIONS:
> +	case GUEST_BNDCFGS:
> +		return true;
> +	default:
> +		break;
> +	}
>  
> -	vmcs12->guest_rsp = kvm_rsp_read(vcpu);
> -	vmcs12->guest_rip = kvm_rip_read(vcpu);
> -	vmcs12->guest_rflags = vmcs_readl(GUEST_RFLAGS);
> +	return false;
> +}
>  
> +static void __sync_vmcs02_to_vmcs12_ext(struct kvm_vcpu *vcpu,
> +					struct vmcs12 *vmcs12)
> +{
>  	vmcs12->guest_es_selector = vmcs_read16(GUEST_ES_SELECTOR);
>  	vmcs12->guest_cs_selector = vmcs_read16(GUEST_CS_SELECTOR);
>  	vmcs12->guest_ss_selector = vmcs_read16(GUEST_SS_SELECTOR);
> @@ -3407,8 +3442,6 @@ static void sync_vmcs02_to_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>  	vmcs12->guest_gdtr_limit = vmcs_read32(GUEST_GDTR_LIMIT);
>  	vmcs12->guest_idtr_limit = vmcs_read32(GUEST_IDTR_LIMIT);
>  	vmcs12->guest_es_ar_bytes = vmcs_read32(GUEST_ES_AR_BYTES);
> -	vmcs12->guest_cs_ar_bytes = vmcs_read32(GUEST_CS_AR_BYTES);
> -	vmcs12->guest_ss_ar_bytes = vmcs_read32(GUEST_SS_AR_BYTES);
>  	vmcs12->guest_ds_ar_bytes = vmcs_read32(GUEST_DS_AR_BYTES);
>  	vmcs12->guest_fs_ar_bytes = vmcs_read32(GUEST_FS_AR_BYTES);
>  	vmcs12->guest_gs_ar_bytes = vmcs_read32(GUEST_GS_AR_BYTES);
> @@ -3424,11 +3457,65 @@ static void sync_vmcs02_to_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>  	vmcs12->guest_tr_base = vmcs_readl(GUEST_TR_BASE);
>  	vmcs12->guest_gdtr_base = vmcs_readl(GUEST_GDTR_BASE);
>  	vmcs12->guest_idtr_base = vmcs_readl(GUEST_IDTR_BASE);
> -
> -	vmcs12->guest_interruptibility_info =
> -		vmcs_read32(GUEST_INTERRUPTIBILITY_INFO);
>  	vmcs12->guest_pending_dbg_exceptions =
>  		vmcs_readl(GUEST_PENDING_DBG_EXCEPTIONS);
> +	if (kvm_mpx_supported())
> +		vmcs12->guest_bndcfgs = vmcs_read64(GUEST_BNDCFGS);
> +}
> +
> +static void sync_vmcs02_to_vmcs12_ext(struct kvm_vcpu *vcpu,
> +				      struct vmcs12 *vmcs12)
> +{
> +	struct vcpu_vmx *vmx = to_vmx(vcpu);
> +	int cpu;
> +
> +	if (!vmx->nested.need_vmcs02_to_vmcs12_ext_sync)
> +		return;
> +
> +
> +	WARN_ON_ONCE(vmx->loaded_vmcs != &vmx->vmcs01);
> +
> +	cpu = get_cpu();
> +	vmx->loaded_vmcs = &vmx->nested.vmcs02;
> +	vmx_vcpu_load(&vmx->vcpu, cpu);
> +
> +	__sync_vmcs02_to_vmcs12_ext(vcpu, vmcs12);
> +
> +	vmx->loaded_vmcs = &vmx->vmcs01;
> +	vmx_vcpu_load(&vmx->vcpu, cpu);
> +	put_cpu();
> +
> +	vmx->nested.need_vmcs02_to_vmcs12_ext_sync = false;
> +}
> +
> +/*
> + * Update the guest state fields of vmcs12 to reflect changes that
> + * occurred while L2 was running. (The "IA-32e mode guest" bit of the
> + * VM-entry controls is also updated, since this is really a guest
> + * state bit.)
> + */
> +static void sync_vmcs02_to_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
> +{
> +	struct vcpu_vmx *vmx = to_vmx(vcpu);
> +
> +	if (vmx->nested.hv_evmcs)
> +		__sync_vmcs02_to_vmcs12_ext(vcpu, vmcs12);
> +
> +	vmx->nested.need_vmcs02_to_vmcs12_ext_sync = !vmx->nested.hv_evmcs;
> +
> +	vmcs12->guest_cr0 = vmcs12_guest_cr0(vcpu, vmcs12);
> +	vmcs12->guest_cr4 = vmcs12_guest_cr4(vcpu, vmcs12);
> +
> +	vmcs12->guest_rsp = kvm_rsp_read(vcpu);
> +	vmcs12->guest_rip = kvm_rip_read(vcpu);
> +	vmcs12->guest_rflags = vmcs_readl(GUEST_RFLAGS);
> +
> +	vmcs12->guest_cs_ar_bytes = vmcs_read32(GUEST_CS_AR_BYTES);
> +	vmcs12->guest_ss_ar_bytes = vmcs_read32(GUEST_SS_AR_BYTES);
> +
> +	vmcs12->guest_interruptibility_info =
> +		vmcs_read32(GUEST_INTERRUPTIBILITY_INFO);
> +
>  	if (vcpu->arch.mp_state == KVM_MP_STATE_HALTED)
>  		vmcs12->guest_activity_state = GUEST_ACTIVITY_HLT;
>  	else
> @@ -3478,8 +3565,6 @@ static void sync_vmcs02_to_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>  	vmcs12->guest_sysenter_cs = vmcs_read32(GUEST_SYSENTER_CS);
>  	vmcs12->guest_sysenter_esp = vmcs_readl(GUEST_SYSENTER_ESP);
>  	vmcs12->guest_sysenter_eip = vmcs_readl(GUEST_SYSENTER_EIP);
> -	if (kvm_mpx_supported())
> -		vmcs12->guest_bndcfgs = vmcs_read64(GUEST_BNDCFGS);
>  }
>  
>  /*
> @@ -4276,6 +4361,8 @@ static inline void nested_release_vmcs12(struct kvm_vcpu *vcpu)
>  	if (vmx->nested.current_vmptr == -1ull)
>  		return;
>  
> +	sync_vmcs02_to_vmcs12_ext(vcpu, get_vmcs12(vcpu));
> +
>  	if (enable_shadow_vmcs) {
>  		/* copy to memory all shadowed fields in case
>  		   they were modified */
> @@ -4392,6 +4479,9 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
>  		return nested_vmx_failValid(vcpu,
>  			VMXERR_UNSUPPORTED_VMCS_COMPONENT);
>  
> +	if (!is_guest_mode(vcpu) && is_vmcs12_ext_field(field))
> +		sync_vmcs02_to_vmcs12_ext(vcpu, vmcs12);
> +
>  	/* Read the field, zero-extended to a u64 field_value */
>  	field_value = vmcs12_read_any(vmcs12, field, offset);
>  
> @@ -4489,9 +4579,16 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)
>  		return nested_vmx_failValid(vcpu,
>  			VMXERR_VMWRITE_READ_ONLY_VMCS_COMPONENT);
>  
> -	if (!is_guest_mode(vcpu))
> +	if (!is_guest_mode(vcpu)) {
>  		vmcs12 = get_vmcs12(vcpu);
> -	else {
> +
> +		/*
> +		 * Ensure vmcs12 is up-to-date before any VMWRITE that dirties
> +		 * vmcs12, else we may crush a field or consume a stale value.
> +		 */
> +		if (!is_shadow_field_rw(field))
> +			sync_vmcs02_to_vmcs12_ext(vcpu, vmcs12);
> +	} else {
>  		/*
>  		 * When vmcs->vmcs_link_pointer is -1ull, any VMWRITE
>  		 * to shadowed-field sets the ALU flags for VMfailInvalid.
> @@ -5310,6 +5407,7 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
>  	 */
>  	if (is_guest_mode(vcpu)) {
>  		sync_vmcs02_to_vmcs12(vcpu, vmcs12);
> +		__sync_vmcs02_to_vmcs12_ext(vcpu, vmcs12);
>  	} else if (!vmx->nested.need_vmcs12_to_shadow_sync) {
>  		if (vmx->nested.hv_evmcs)
>  			copy_enlightened_to_vmcs12(vmx);
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 16210dde0374..8b215c6840b4 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -123,6 +123,12 @@ struct nested_vmx {
>  	 */
>  	bool vmcs02_initialized;
>  
> +	/*
> +	 * Indicates lazily loaded guest state has not yet been decached from
> +	 * vmcs02.
> +	 */
> +	bool need_vmcs02_to_vmcs12_ext_sync;
> +
>  	bool change_vmcs01_virtual_apic_mode;
>  
>  	/*
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/7] KVM: nVMX: Intercept VMWRITEs to read-only shadow VMCS fields
  2019-05-07 15:36 ` [PATCH 1/7] KVM: nVMX: Intercept VMWRITEs to read-only shadow VMCS fields Sean Christopherson
  2019-06-06 13:26   ` Paolo Bonzini
@ 2019-06-13 17:02   ` Jim Mattson
  2019-06-13 17:18     ` Paolo Bonzini
  1 sibling, 1 reply; 15+ messages in thread
From: Jim Mattson @ 2019-06-13 17:02 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Radim Krčmář, kvm list, Liran Alon

On Tue, May 7, 2019 at 8:36 AM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:

> Not intercepting fields tagged read-only also allows for additional
> optimizations, e.g. marking GUEST_{CS,SS}_AR_BYTES as SHADOW_FIELD_RO
> since those fields are rarely written by a VMMs, but read frequently.

Do you have data to support this, or is this just a gut feeling? The
last time I looked at Virtual Box (which was admittedly a long time
ago), it liked to read and write just about every VMCS guest-state
field it could find on every VM-exit.

The decision of which fields to shadow is really something that should
be done dynamically, depending on the behavior of the guest hypervisor
(which may vary depending on the L2 guest it's running!) Making the
decision statically is bound to result in a poor outcome for some
scenarios.

When I measured this several years ago, taking one VM-exit for a
VMREAD or VMWRITE was more expensive than needlessly shadowing it
~35-40 times.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/7] KVM: nVMX: Intercept VMWRITEs to read-only shadow VMCS fields
  2019-06-13 17:02   ` Jim Mattson
@ 2019-06-13 17:18     ` Paolo Bonzini
  2019-06-13 17:36       ` Jim Mattson
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2019-06-13 17:18 UTC (permalink / raw)
  To: Jim Mattson, Sean Christopherson
  Cc: Radim Krčmář, kvm list, Liran Alon

On 13/06/19 19:02, Jim Mattson wrote:
> On Tue, May 7, 2019 at 8:36 AM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> 
>> Not intercepting fields tagged read-only also allows for additional
>> optimizations, e.g. marking GUEST_{CS,SS}_AR_BYTES as SHADOW_FIELD_RO
>> since those fields are rarely written by a VMMs, but read frequently.
> 
> Do you have data to support this, or is this just a gut feeling? The
> last time I looked at Virtual Box (which was admittedly a long time
> ago), it liked to read and write just about every VMCS guest-state
> field it could find on every VM-exit.

I have never looked at VirtualBox, but most other hypervisors do have a
common set of fields (give or take a couple) that they like to read
and/or write on most if not every vmexit.

Also, while this may vary dynamically based on the L2 guest that is
running, this is much less true for unrestricted-guest processors.
Without data on _which_ scenarios are bad for a static set of shadowed
fields, I'm not really happy to add even more complexity.

Paolo

> The decision of which fields to shadow is really something that should
> be done dynamically, depending on the behavior of the guest hypervisor
> (which may vary depending on the L2 guest it's running!) Making the
> decision statically is bound to result in a poor outcome for some
> scenarios.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/7] KVM: nVMX: Intercept VMWRITEs to read-only shadow VMCS fields
  2019-06-13 17:18     ` Paolo Bonzini
@ 2019-06-13 17:36       ` Jim Mattson
  2019-06-13 17:59         ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Jim Mattson @ 2019-06-13 17:36 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Radim Krčmář, kvm list, Liran Alon

On Thu, Jun 13, 2019 at 10:18 AM Paolo Bonzini <pbonzini@redhat.com> wrote:

> Also, while this may vary dynamically based on the L2 guest that is
> running, this is much less true for unrestricted-guest processors.
> Without data on _which_ scenarios are bad for a static set of shadowed
> fields, I'm not really happy to add even more complexity.

Data supporting which scenarios would lead you to entertain more
complexity? Is it even worth collecting data on L3 performance, for
example? :-)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/7] KVM: nVMX: Intercept VMWRITEs to read-only shadow VMCS fields
  2019-06-13 17:36       ` Jim Mattson
@ 2019-06-13 17:59         ` Paolo Bonzini
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2019-06-13 17:59 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Sean Christopherson, Radim Krčmář, kvm list, Liran Alon

On 13/06/19 19:36, Jim Mattson wrote:
>> Also, while this may vary dynamically based on the L2 guest that is
>> running, this is much less true for unrestricted-guest processors.
>> Without data on _which_ scenarios are bad for a static set of shadowed
>> fields, I'm not really happy to add even more complexity.
>
> Data supporting which scenarios would lead you to entertain more
> complexity?

For example it would be interesting if some L1 hypervisor had 2x slower
vmexits on some L2 guests, but otherwise fits the current set of
shadowed fields.

Paolo

> Is it even worth collecting data on L3 performance, for
> example? :-)


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, back to index

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-07 15:36 [PATCH 0/7] KVM: nVMX: Optimize VMCS data copying Sean Christopherson
2019-05-07 15:36 ` [PATCH 1/7] KVM: nVMX: Intercept VMWRITEs to read-only shadow VMCS fields Sean Christopherson
2019-06-06 13:26   ` Paolo Bonzini
2019-06-13 17:02   ` Jim Mattson
2019-06-13 17:18     ` Paolo Bonzini
2019-06-13 17:36       ` Jim Mattson
2019-06-13 17:59         ` Paolo Bonzini
2019-05-07 15:36 ` [PATCH 2/7] KVM: nVMX: Intercept VMWRITEs to GUEST_{CS,SS}_AR_BYTES Sean Christopherson
2019-06-06 13:31   ` Paolo Bonzini
2019-05-07 15:36 ` [PATCH 3/7] KVM: nVMX: Track vmcs12 offsets for shadowed VMCS fields Sean Christopherson
2019-05-07 15:36 ` [PATCH 4/7] KVM: nVMX: Lift sync_vmcs12() out of prepare_vmcs12() Sean Christopherson
2019-05-07 15:36 ` [PATCH 5/7] KVM: nVMX: Use descriptive names for VMCS sync functions and flags Sean Christopherson
2019-05-07 15:36 ` [PATCH 6/7] KVM: nVMX: Add helpers to identify shadowed VMCS fields Sean Christopherson
2019-05-07 15:36 ` [PATCH 7/7] KVM: nVMX: Sync rarely accessed guest fields only when needed Sean Christopherson
2019-06-06 15:29   ` Paolo Bonzini

KVM Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/kvm/0 kvm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 kvm kvm/ https://lore.kernel.org/kvm \
		kvm@vger.kernel.org kvm@archiver.kernel.org
	public-inbox-index kvm


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.kvm


AGPL code for this site: git clone https://public-inbox.org/ public-inbox