* [PATCH v2 0/4] Add support for capturing the highest observable L2 TSC
@ 2019-11-05 19:19 Aaron Lewis
  2019-11-05 19:19 ` [PATCH v2 1/4] kvm: nested: Introduce read_and_check_msr_entry() Aaron Lewis
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Aaron Lewis @ 2019-11-05 19:19 UTC (permalink / raw)
  To: kvm; +Cc: Paolo Bonzini, Jim Mattson, Aaron Lewis

The L1 hypervisor may include the IA32_TIME_STAMP_COUNTER MSR in the
vmcs12 MSR VM-exit MSR-store area as a way of determining the highest
TSC value that might have been observed by L2 prior to VM-exit. The
current implementation does not capture a very tight bound on this
value.  To tighten the bound, add the IA32_TIME_STAMP_COUNTER MSR to the
vmcs02 VM-exit MSR-store area whenever it appears in the vmcs12 VM-exit
MSR-store area.  When L0 processes the vmcs12 VM-exit MSR-store area
during the emulation of an L2->L1 VM-exit, special-case the
IA32_TIME_STAMP_COUNTER MSR, using the value stored in the vmcs02
VM-exit MSR-store area to derive the value to be stored in the vmcs12
VM-exit MSR-store area.
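
For reference, converting the hardware TSC value captured in the vmcs02
VM-exit MSR-store area into L1's view of the TSC is done with the
existing helper kvm_read_l1_tsc(), which, in simplified sketch form,
amounts to applying L1's TSC scaling and offset:

	/*
	 * Scale the raw TSC captured at VM-exit (if TSC scaling is in
	 * use for this vCPU) and apply L1's TSC offset to produce the
	 * value L1 would have observed via RDMSR.
	 */
	u64 kvm_read_l1_tsc(struct kvm_vcpu *vcpu, u64 host_tsc)
	{
		return vcpu->arch.tsc_offset + kvm_scale_tsc(vcpu, host_tsc);
	}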

v1 -> v2:
 - Rename function nested_vmx_get_msr_value() to
   nested_vmx_get_vmexit_msr_value().
 - Remove unneeded tag 'Change-Id' from commit messages.

Aaron Lewis (4):
  kvm: nested: Introduce read_and_check_msr_entry()
  kvm: vmx: Rename NR_AUTOLOAD_MSRS to NR_MSR_ENTRIES
  kvm: vmx: Rename function find_msr() to vmx_find_msr_index()
  KVM: nVMX: Add support for capturing highest observable L2 TSC

 arch/x86/kvm/vmx/nested.c | 127 ++++++++++++++++++++++++++++++++------
 arch/x86/kvm/vmx/vmx.c    |  14 ++---
 arch/x86/kvm/vmx/vmx.h    |   9 ++-
 3 files changed, 122 insertions(+), 28 deletions(-)

--
2.24.0.rc1.363.gb1bccd3e3d-goog


* [PATCH v2 1/4] kvm: nested: Introduce read_and_check_msr_entry()
  2019-11-05 19:19 [PATCH v2 0/4] Add support for capturing the highest observable L2 TSC Aaron Lewis
@ 2019-11-05 19:19 ` Aaron Lewis
  2019-11-05 21:27   ` Liran Alon
  2019-11-05 19:19 ` [PATCH v2 2/4] kvm: vmx: Rename NR_AUTOLOAD_MSRS to NR_MSR_ENTRIES Aaron Lewis
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Aaron Lewis @ 2019-11-05 19:19 UTC (permalink / raw)
  To: kvm; +Cc: Paolo Bonzini, Jim Mattson, Aaron Lewis

Add the function read_and_check_msr_entry(), which for now just pulls
some code out of nested_vmx_store_msr().  This is in preparation for a
change later in this series, where the code in
read_and_check_msr_entry() will be reused.

Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
 arch/x86/kvm/vmx/nested.c | 35 ++++++++++++++++++++++-------------
 1 file changed, 22 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index e76eb4f07f6c..7b058d7b9fcc 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -929,6 +929,26 @@ static u32 nested_vmx_load_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
 	return i + 1;
 }

+static bool read_and_check_msr_entry(struct kvm_vcpu *vcpu, u64 gpa, int i,
+				     struct vmx_msr_entry *e)
+{
+	if (kvm_vcpu_read_guest(vcpu,
+				gpa + i * sizeof(*e),
+				e, 2 * sizeof(u32))) {
+		pr_debug_ratelimited(
+			"%s cannot read MSR entry (%u, 0x%08llx)\n",
+			__func__, i, gpa + i * sizeof(*e));
+		return false;
+	}
+	if (nested_vmx_store_msr_check(vcpu, e)) {
+		pr_debug_ratelimited(
+			"%s check failed (%u, 0x%x, 0x%x)\n",
+			__func__, i, e->index, e->reserved);
+		return false;
+	}
+	return true;
+}
+
 static int nested_vmx_store_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
 {
 	u64 data;
@@ -940,20 +960,9 @@ static int nested_vmx_store_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
 		if (unlikely(i >= max_msr_list_size))
 			return -EINVAL;

-		if (kvm_vcpu_read_guest(vcpu,
-					gpa + i * sizeof(e),
-					&e, 2 * sizeof(u32))) {
-			pr_debug_ratelimited(
-				"%s cannot read MSR entry (%u, 0x%08llx)\n",
-				__func__, i, gpa + i * sizeof(e));
+		if (!read_and_check_msr_entry(vcpu, gpa, i, &e))
 			return -EINVAL;
-		}
-		if (nested_vmx_store_msr_check(vcpu, &e)) {
-			pr_debug_ratelimited(
-				"%s check failed (%u, 0x%x, 0x%x)\n",
-				__func__, i, e.index, e.reserved);
-			return -EINVAL;
-		}
+
 		if (kvm_get_msr(vcpu, e.index, &data)) {
 			pr_debug_ratelimited(
 				"%s cannot read MSR (%u, 0x%x)\n",
--
2.24.0.rc1.363.gb1bccd3e3d-goog


* [PATCH v2 2/4] kvm: vmx: Rename NR_AUTOLOAD_MSRS to NR_MSR_ENTRIES
  2019-11-05 19:19 [PATCH v2 0/4] Add support for capturing the highest observable L2 TSC Aaron Lewis
  2019-11-05 19:19 ` [PATCH v2 1/4] kvm: nested: Introduce read_and_check_msr_entry() Aaron Lewis
@ 2019-11-05 19:19 ` Aaron Lewis
  2019-11-05 21:28   ` Liran Alon
  2019-11-05 19:19 ` [PATCH v2 3/4] kvm: vmx: Rename function find_msr() to vmx_find_msr_index() Aaron Lewis
  2019-11-05 19:19 ` [PATCH v2 4/4] KVM: nVMX: Add support for capturing highest observable L2 TSC Aaron Lewis
  3 siblings, 1 reply; 15+ messages in thread
From: Aaron Lewis @ 2019-11-05 19:19 UTC (permalink / raw)
  To: kvm; +Cc: Paolo Bonzini, Jim Mattson, Aaron Lewis

Rename NR_AUTOLOAD_MSRS to NR_MSR_ENTRIES.  This is needed because of
the MSR-autostore area that will be added later in this series; after
that, the name AUTOLOAD will no longer make sense.

Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 4 ++--
 arch/x86/kvm/vmx/vmx.h | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index e7970a2e8eae..c0160ca9ddba 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -940,8 +940,8 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
 	if (!entry_only)
 		j = find_msr(&m->host, msr);

-	if ((i < 0 && m->guest.nr == NR_AUTOLOAD_MSRS) ||
-		(j < 0 &&  m->host.nr == NR_AUTOLOAD_MSRS)) {
+	if ((i < 0 && m->guest.nr == NR_MSR_ENTRIES) ||
+		(j < 0 &&  m->host.nr == NR_MSR_ENTRIES)) {
 		printk_once(KERN_WARNING "Not enough msr switch entries. "
 				"Can't add msr %x\n", msr);
 		return;
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index bee16687dc0b..0c6835bd6945 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -22,11 +22,11 @@ extern u32 get_umwait_control_msr(void);

 #define X2APIC_MSR(r) (APIC_BASE_MSR + ((r) >> 4))

-#define NR_AUTOLOAD_MSRS 8
+#define NR_MSR_ENTRIES 8

 struct vmx_msrs {
 	unsigned int		nr;
-	struct vmx_msr_entry	val[NR_AUTOLOAD_MSRS];
+	struct vmx_msr_entry	val[NR_MSR_ENTRIES];
 };

 struct shared_msr_entry {
--
2.24.0.rc1.363.gb1bccd3e3d-goog


* [PATCH v2 3/4] kvm: vmx: Rename function find_msr() to vmx_find_msr_index()
  2019-11-05 19:19 [PATCH v2 0/4] Add support for capturing the highest observable L2 TSC Aaron Lewis
  2019-11-05 19:19 ` [PATCH v2 1/4] kvm: nested: Introduce read_and_check_msr_entry() Aaron Lewis
  2019-11-05 19:19 ` [PATCH v2 2/4] kvm: vmx: Rename NR_AUTOLOAD_MSRS to NR_MSR_ENTRIES Aaron Lewis
@ 2019-11-05 19:19 ` Aaron Lewis
  2019-11-05 21:31   ` Liran Alon
  2019-11-05 19:19 ` [PATCH v2 4/4] KVM: nVMX: Add support for capturing highest observable L2 TSC Aaron Lewis
  3 siblings, 1 reply; 15+ messages in thread
From: Aaron Lewis @ 2019-11-05 19:19 UTC (permalink / raw)
  To: kvm; +Cc: Paolo Bonzini, Jim Mattson, Aaron Lewis

Rename function find_msr() to vmx_find_msr_index() to share
implementations between vmx.c and nested.c in an upcoming change.

Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 10 +++++-----
 arch/x86/kvm/vmx/vmx.h |  1 +
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c0160ca9ddba..39c701730297 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -835,7 +835,7 @@ static void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx,
 	vm_exit_controls_clearbit(vmx, exit);
 }

-static int find_msr(struct vmx_msrs *m, unsigned int msr)
+int vmx_find_msr_index(struct vmx_msrs *m, u32 msr)
 {
 	unsigned int i;

@@ -869,7 +869,7 @@ static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr)
 		}
 		break;
 	}
-	i = find_msr(&m->guest, msr);
+	i = vmx_find_msr_index(&m->guest, msr);
 	if (i < 0)
 		goto skip_guest;
 	--m->guest.nr;
@@ -877,7 +877,7 @@ static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr)
 	vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->guest.nr);

 skip_guest:
-	i = find_msr(&m->host, msr);
+	i = vmx_find_msr_index(&m->host, msr);
 	if (i < 0)
 		return;

@@ -936,9 +936,9 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
 		wrmsrl(MSR_IA32_PEBS_ENABLE, 0);
 	}

-	i = find_msr(&m->guest, msr);
+	i = vmx_find_msr_index(&m->guest, msr);
 	if (!entry_only)
-		j = find_msr(&m->host, msr);
+		j = vmx_find_msr_index(&m->host, msr);

 	if ((i < 0 && m->guest.nr == NR_MSR_ENTRIES) ||
 		(j < 0 &&  m->host.nr == NR_MSR_ENTRIES)) {
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 0c6835bd6945..34b5fef603d8 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -334,6 +334,7 @@ void vmx_set_virtual_apic_mode(struct kvm_vcpu *vcpu);
 struct shared_msr_entry *find_msr_entry(struct vcpu_vmx *vmx, u32 msr);
 void pt_update_intercept_for_msr(struct vcpu_vmx *vmx);
 void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp);
+int vmx_find_msr_index(struct vmx_msrs *m, u32 msr);

 #define POSTED_INTR_ON  0
 #define POSTED_INTR_SN  1
--
2.24.0.rc1.363.gb1bccd3e3d-goog


* [PATCH v2 4/4] KVM: nVMX: Add support for capturing highest observable L2 TSC
  2019-11-05 19:19 [PATCH v2 0/4] Add support for capturing the highest observable L2 TSC Aaron Lewis
                   ` (2 preceding siblings ...)
  2019-11-05 19:19 ` [PATCH v2 3/4] kvm: vmx: Rename function find_msr() to vmx_find_msr_index() Aaron Lewis
@ 2019-11-05 19:19 ` Aaron Lewis
  2019-11-05 21:58   ` Liran Alon
  2019-11-05 22:48   ` Sean Christopherson
  3 siblings, 2 replies; 15+ messages in thread
From: Aaron Lewis @ 2019-11-05 19:19 UTC (permalink / raw)
  To: kvm; +Cc: Paolo Bonzini, Jim Mattson, Aaron Lewis

The L1 hypervisor may include the IA32_TIME_STAMP_COUNTER MSR in the
vmcs12 MSR VM-exit MSR-store area as a way of determining the highest
TSC value that might have been observed by L2 prior to VM-exit. The
current implementation does not capture a very tight bound on this
value.  To tighten the bound, add the IA32_TIME_STAMP_COUNTER MSR to the
vmcs02 VM-exit MSR-store area whenever it appears in the vmcs12 VM-exit
MSR-store area.  When L0 processes the vmcs12 VM-exit MSR-store area
during the emulation of an L2->L1 VM-exit, special-case the
IA32_TIME_STAMP_COUNTER MSR, using the value stored in the vmcs02
VM-exit MSR-store area to derive the value to be stored in the vmcs12
VM-exit MSR-store area.
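
For illustration, the L1-side usage being optimized is an L1 hypervisor
that lists IA32_TIME_STAMP_COUNTER in its VM-exit MSR-store area.  A
minimal sketch of that vmcs12 setup, borrowing KVM's vmcs accessors as
stand-ins for whatever the L1 hypervisor actually uses:

	/* One-entry VM-exit MSR-store list holding only the TSC. */
	static struct vmx_msr_entry exit_store[1] = {
		{ .index = MSR_IA32_TSC },
	};

	vmcs_write64(VM_EXIT_MSR_STORE_ADDR, __pa(exit_store));
	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 1);

	/*
	 * After the next VM-exit from L2, exit_store[0].value holds an
	 * upper bound on the TSC values L2 may have observed.
	 */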

Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
 arch/x86/kvm/vmx/nested.c | 92 ++++++++++++++++++++++++++++++++++++---
 arch/x86/kvm/vmx/vmx.h    |  4 ++
 2 files changed, 90 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 7b058d7b9fcc..cb2a92341eab 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -929,6 +929,37 @@ static u32 nested_vmx_load_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
 	return i + 1;
 }

+static bool nested_vmx_get_vmexit_msr_value(struct kvm_vcpu *vcpu,
+					    u32 msr_index,
+					    u64 *data)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	/*
+	 * If the L0 hypervisor stored a more accurate value for the TSC that
+	 * does not include the time taken for emulation of the L2->L1
+	 * VM-exit in L0, use the more accurate value.
+	 */
+	if (msr_index == MSR_IA32_TSC) {
+		int index = vmx_find_msr_index(&vmx->msr_autostore.guest,
+					       MSR_IA32_TSC);
+
+		if (index >= 0) {
+			u64 val = vmx->msr_autostore.guest.val[index].value;
+
+			*data = kvm_read_l1_tsc(vcpu, val);
+			return true;
+		}
+	}
+
+	if (kvm_get_msr(vcpu, msr_index, data)) {
+		pr_debug_ratelimited("%s cannot read MSR (0x%x)\n", __func__,
+			msr_index);
+		return false;
+	}
+	return true;
+}
+
 static bool read_and_check_msr_entry(struct kvm_vcpu *vcpu, u64 gpa, int i,
 				     struct vmx_msr_entry *e)
 {
@@ -963,12 +994,9 @@ static int nested_vmx_store_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
 		if (!read_and_check_msr_entry(vcpu, gpa, i, &e))
 			return -EINVAL;

-		if (kvm_get_msr(vcpu, e.index, &data)) {
-			pr_debug_ratelimited(
-				"%s cannot read MSR (%u, 0x%x)\n",
-				__func__, i, e.index);
+		if (!nested_vmx_get_vmexit_msr_value(vcpu, e.index, &data))
 			return -EINVAL;
-		}
+
 		if (kvm_vcpu_write_guest(vcpu,
 					 gpa + i * sizeof(e) +
 					     offsetof(struct vmx_msr_entry, value),
@@ -982,6 +1010,51 @@ static int nested_vmx_store_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
 	return 0;
 }

+static bool nested_msr_store_list_has_msr(struct kvm_vcpu *vcpu, u32 msr_index)
+{
+	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
+	u32 count = vmcs12->vm_exit_msr_store_count;
+	u64 gpa = vmcs12->vm_exit_msr_store_addr;
+	struct vmx_msr_entry e;
+	u32 i;
+
+	for (i = 0; i < count; i++) {
+		if (!read_and_check_msr_entry(vcpu, gpa, i, &e))
+			return false;
+
+		if (e.index == msr_index)
+			return true;
+	}
+	return false;
+}
+
+static void prepare_vmx_msr_autostore_list(struct kvm_vcpu *vcpu,
+					   u32 msr_index)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	struct vmx_msrs *autostore = &vmx->msr_autostore.guest;
+	int i = vmx_find_msr_index(autostore, msr_index);
+	bool in_autostore_list = i >= 0;
+	bool in_vmcs12_store_list;
+	int last;
+
+	in_vmcs12_store_list = nested_msr_store_list_has_msr(vcpu, msr_index);
+
+	if (in_vmcs12_store_list && !in_autostore_list) {
+		if (autostore->nr == NR_MSR_ENTRIES) {
+			pr_warn_ratelimited(
+				"Not enough msr entries in msr_autostore.  Can't add msr %x\n",
+				msr_index);
+			return;
+		}
+		last = autostore->nr++;
+		autostore->val[last].index = msr_index;
+	} else if (!in_vmcs12_store_list && in_autostore_list) {
+		last = --autostore->nr;
+		autostore->val[i] = autostore->val[last];
+	}
+}
+
 static bool nested_cr3_valid(struct kvm_vcpu *vcpu, unsigned long val)
 {
 	unsigned long invalid_mask;
@@ -2027,7 +2100,7 @@ static void prepare_vmcs02_constant_state(struct vcpu_vmx *vmx)
 	 * addresses are constant (for vmcs02), the counts can change based
 	 * on L2's behavior, e.g. switching to/from long mode.
 	 */
-	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 0);
+	vmcs_write64(VM_EXIT_MSR_STORE_ADDR, __pa(vmx->msr_autostore.guest.val));
 	vmcs_write64(VM_EXIT_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.host.val));
 	vmcs_write64(VM_ENTRY_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.guest.val));

@@ -2294,6 +2367,13 @@ static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
 		vmcs_write64(EOI_EXIT_BITMAP3, vmcs12->eoi_exit_bitmap3);
 	}

+	/*
+	 * Make sure the msr_autostore list is up to date before we set the
+	 * count in the vmcs02.
+	 */
+	prepare_vmx_msr_autostore_list(&vmx->vcpu, MSR_IA32_TSC);
+
+	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, vmx->msr_autostore.guest.nr);
 	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
 	vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr);

diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 34b5fef603d8..0ab1562287af 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -230,6 +230,10 @@ struct vcpu_vmx {
 		struct vmx_msrs host;
 	} msr_autoload;

+	struct msr_autostore {
+		struct vmx_msrs guest;
+	} msr_autostore;
+
 	struct {
 		int vm86_active;
 		ulong save_rflags;
--
2.24.0.rc1.363.gb1bccd3e3d-goog


* Re: [PATCH v2 1/4] kvm: nested: Introduce read_and_check_msr_entry()
  2019-11-05 19:19 ` [PATCH v2 1/4] kvm: nested: Introduce read_and_check_msr_entry() Aaron Lewis
@ 2019-11-05 21:27   ` Liran Alon
  0 siblings, 0 replies; 15+ messages in thread
From: Liran Alon @ 2019-11-05 21:27 UTC (permalink / raw)
  To: Aaron Lewis; +Cc: kvm, Paolo Bonzini, Jim Mattson



> On 5 Nov 2019, at 21:19, Aaron Lewis <aaronlewis@google.com> wrote:
> 
> Add the function read_and_check_msr_entry(), which for now just pulls
> some code out of nested_vmx_store_msr().  This is in preparation for a
> change later in this series, where the code in
> read_and_check_msr_entry() will be reused.

Please don’t refer to “this series” in a commit message.
Once this patch series is merged, “this series” will be meaningless.

Prefer to just explain the change that will be introduced in future commits that requires this change.
E.g. “This new utility function will be used by upcoming patches that will introduce code which searches vmcs12->vm_exit_msr_store for a specific entry.”

> 
> Reviewed-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Aaron Lewis <aaronlewis@google.com>

Reviewed-by: Liran Alon <liran.alon@oracle.com>

> ---
> arch/x86/kvm/vmx/nested.c | 35 ++++++++++++++++++++++-------------
> 1 file changed, 22 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index e76eb4f07f6c..7b058d7b9fcc 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -929,6 +929,26 @@ static u32 nested_vmx_load_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
> 	return i + 1;
> }
> 
> +static bool read_and_check_msr_entry(struct kvm_vcpu *vcpu, u64 gpa, int i,
> +				     struct vmx_msr_entry *e)
> +{
> +	if (kvm_vcpu_read_guest(vcpu,
> +				gpa + i * sizeof(*e),
> +				e, 2 * sizeof(u32))) {
> +		pr_debug_ratelimited(
> +			"%s cannot read MSR entry (%u, 0x%08llx)\n",
> +			__func__, i, gpa + i * sizeof(*e));
> +		return false;
> +	}
> +	if (nested_vmx_store_msr_check(vcpu, e)) {
> +		pr_debug_ratelimited(
> +			"%s check failed (%u, 0x%x, 0x%x)\n",
> +			__func__, i, e->index, e->reserved);
> +		return false;
> +	}
> +	return true;
> +}
> +
> static int nested_vmx_store_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
> {
> 	u64 data;
> @@ -940,20 +960,9 @@ static int nested_vmx_store_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
> 		if (unlikely(i >= max_msr_list_size))
> 			return -EINVAL;
> 
> -		if (kvm_vcpu_read_guest(vcpu,
> -					gpa + i * sizeof(e),
> -					&e, 2 * sizeof(u32))) {
> -			pr_debug_ratelimited(
> -				"%s cannot read MSR entry (%u, 0x%08llx)\n",
> -				__func__, i, gpa + i * sizeof(e));
> +		if (!read_and_check_msr_entry(vcpu, gpa, i, &e))
> 			return -EINVAL;
> -		}
> -		if (nested_vmx_store_msr_check(vcpu, &e)) {
> -			pr_debug_ratelimited(
> -				"%s check failed (%u, 0x%x, 0x%x)\n",
> -				__func__, i, e.index, e.reserved);
> -			return -EINVAL;
> -		}
> +
> 		if (kvm_get_msr(vcpu, e.index, &data)) {
> 			pr_debug_ratelimited(
> 				"%s cannot read MSR (%u, 0x%x)\n",
> --
> 2.24.0.rc1.363.gb1bccd3e3d-goog
> 


* Re: [PATCH v2 2/4] kvm: vmx: Rename NR_AUTOLOAD_MSRS to NR_MSR_ENTRIES
  2019-11-05 19:19 ` [PATCH v2 2/4] kvm: vmx: Rename NR_AUTOLOAD_MSRS to NR_MSR_ENTRIES Aaron Lewis
@ 2019-11-05 21:28   ` Liran Alon
  0 siblings, 0 replies; 15+ messages in thread
From: Liran Alon @ 2019-11-05 21:28 UTC (permalink / raw)
  To: Aaron Lewis; +Cc: kvm, Paolo Bonzini, Jim Mattson



> On 5 Nov 2019, at 21:19, Aaron Lewis <aaronlewis@google.com> wrote:
> 
> Rename NR_AUTOLOAD_MSRS to NR_MSR_ENTRIES.  This is needed because of
> the MSR-autostore area that will be added later in this series; after
> that, the name AUTOLOAD will no longer make sense.
> 
> Reviewed-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Aaron Lewis <aaronlewis@google.com>

NR_MSR_ENTRIES is a bit too generic in my opinion.
I would prefer to rename it to NR_AUTO_LOADSTORE_MSRS.
The change itself is fine.

-Liran

> ---
> arch/x86/kvm/vmx/vmx.c | 4 ++--
> arch/x86/kvm/vmx/vmx.h | 4 ++--
> 2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index e7970a2e8eae..c0160ca9ddba 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -940,8 +940,8 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
> 	if (!entry_only)
> 		j = find_msr(&m->host, msr);
> 
> -	if ((i < 0 && m->guest.nr == NR_AUTOLOAD_MSRS) ||
> -		(j < 0 &&  m->host.nr == NR_AUTOLOAD_MSRS)) {
> +	if ((i < 0 && m->guest.nr == NR_MSR_ENTRIES) ||
> +		(j < 0 &&  m->host.nr == NR_MSR_ENTRIES)) {
> 		printk_once(KERN_WARNING "Not enough msr switch entries. "
> 				"Can't add msr %x\n", msr);
> 		return;
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index bee16687dc0b..0c6835bd6945 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -22,11 +22,11 @@ extern u32 get_umwait_control_msr(void);
> 
> #define X2APIC_MSR(r) (APIC_BASE_MSR + ((r) >> 4))
> 
> -#define NR_AUTOLOAD_MSRS 8
> +#define NR_MSR_ENTRIES 8
> 
> struct vmx_msrs {
> 	unsigned int		nr;
> -	struct vmx_msr_entry	val[NR_AUTOLOAD_MSRS];
> +	struct vmx_msr_entry	val[NR_MSR_ENTRIES];
> };
> 
> struct shared_msr_entry {
> --
> 2.24.0.rc1.363.gb1bccd3e3d-goog
> 


* Re: [PATCH v2 3/4] kvm: vmx: Rename function find_msr() to vmx_find_msr_index()
  2019-11-05 19:19 ` [PATCH v2 3/4] kvm: vmx: Rename function find_msr() to vmx_find_msr_index() Aaron Lewis
@ 2019-11-05 21:31   ` Liran Alon
  2019-11-07  0:11     ` Jim Mattson
  0 siblings, 1 reply; 15+ messages in thread
From: Liran Alon @ 2019-11-05 21:31 UTC (permalink / raw)
  To: Aaron Lewis; +Cc: kvm, Paolo Bonzini, Jim Mattson



> On 5 Nov 2019, at 21:19, Aaron Lewis <aaronlewis@google.com> wrote:
> 
> Rename function find_msr() to vmx_find_msr_index() to share
> implementations between vmx.c and nested.c in an upcoming change.
> 
> Reviewed-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Aaron Lewis <aaronlewis@google.com>
> ---
> arch/x86/kvm/vmx/vmx.c | 10 +++++-----
> arch/x86/kvm/vmx/vmx.h |  1 +
> 2 files changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index c0160ca9ddba..39c701730297 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -835,7 +835,7 @@ static void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx,
> 	vm_exit_controls_clearbit(vmx, exit);
> }
> 
> -static int find_msr(struct vmx_msrs *m, unsigned int msr)
> +int vmx_find_msr_index(struct vmx_msrs *m, u32 msr)

The change from static to non-static should happen in the next patch instead of this rename patch.
Otherwise, if the next patch is reverted, compiling vmx.c will result in a warning.
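
I.e., keep the function static in this patch:

	static int vmx_find_msr_index(struct vmx_msrs *m, u32 msr)

and drop the “static” (together with the vmx.h declaration) only in the
patch that introduces the user in nested.c.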

The rest of the patch looks fine.

-Liran

> {
> 	unsigned int i;
> 
> @@ -869,7 +869,7 @@ static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr)
> 		}
> 		break;
> 	}
> -	i = find_msr(&m->guest, msr);
> +	i = vmx_find_msr_index(&m->guest, msr);
> 	if (i < 0)
> 		goto skip_guest;
> 	--m->guest.nr;
> @@ -877,7 +877,7 @@ static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr)
> 	vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->guest.nr);
> 
> skip_guest:
> -	i = find_msr(&m->host, msr);
> +	i = vmx_find_msr_index(&m->host, msr);
> 	if (i < 0)
> 		return;
> 
> @@ -936,9 +936,9 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
> 		wrmsrl(MSR_IA32_PEBS_ENABLE, 0);
> 	}
> 
> -	i = find_msr(&m->guest, msr);
> +	i = vmx_find_msr_index(&m->guest, msr);
> 	if (!entry_only)
> -		j = find_msr(&m->host, msr);
> +		j = vmx_find_msr_index(&m->host, msr);
> 
> 	if ((i < 0 && m->guest.nr == NR_MSR_ENTRIES) ||
> 		(j < 0 &&  m->host.nr == NR_MSR_ENTRIES)) {
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 0c6835bd6945..34b5fef603d8 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -334,6 +334,7 @@ void vmx_set_virtual_apic_mode(struct kvm_vcpu *vcpu);
> struct shared_msr_entry *find_msr_entry(struct vcpu_vmx *vmx, u32 msr);
> void pt_update_intercept_for_msr(struct vcpu_vmx *vmx);
> void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp);
> +int vmx_find_msr_index(struct vmx_msrs *m, u32 msr);
> 
> #define POSTED_INTR_ON  0
> #define POSTED_INTR_SN  1
> --
> 2.24.0.rc1.363.gb1bccd3e3d-goog
> 


* Re: [PATCH v2 4/4] KVM: nVMX: Add support for capturing highest observable L2 TSC
  2019-11-05 19:19 ` [PATCH v2 4/4] KVM: nVMX: Add support for capturing highest observable L2 TSC Aaron Lewis
@ 2019-11-05 21:58   ` Liran Alon
  2019-11-05 22:27     ` Jim Mattson
  2019-11-05 22:48   ` Sean Christopherson
  1 sibling, 1 reply; 15+ messages in thread
From: Liran Alon @ 2019-11-05 21:58 UTC (permalink / raw)
  To: Aaron Lewis; +Cc: kvm, Paolo Bonzini, Jim Mattson



> On 5 Nov 2019, at 21:19, Aaron Lewis <aaronlewis@google.com> wrote:
> 
> The L1 hypervisor may include the IA32_TIME_STAMP_COUNTER MSR in the
> vmcs12 MSR VM-exit MSR-store area as a way of determining the highest
> TSC value that might have been observed by L2 prior to VM-exit. The
> current implementation does not capture a very tight bound on this
> value.  To tighten the bound, add the IA32_TIME_STAMP_COUNTER MSR to the
> vmcs02 VM-exit MSR-store area whenever it appears in the vmcs12 VM-exit
> MSR-store area.  When L0 processes the vmcs12 VM-exit MSR-store area
> during the emulation of an L2->L1 VM-exit, special-case the
> IA32_TIME_STAMP_COUNTER MSR, using the value stored in the vmcs02
> VM-exit MSR-store area to derive the value to be stored in the vmcs12
> VM-exit MSR-store area.
> 
> Reviewed-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Aaron Lewis <aaronlewis@google.com>

The patch looks correct to me and I had only some minor style comments below.
Reviewed-by: Liran Alon <liran.alon@oracle.com>

I think you may also want to consider separating this patch into two:
a first patch that adds all the framework code without yet using it specifically for MSR_IA32_TSC,
and a second patch that uses the framework for the MSR_IA32_TSC case.

Just out of curiosity, may I ask which L1 hypervisor uses this technique, such that you encountered this issue?

-Liran

> ---
> arch/x86/kvm/vmx/nested.c | 92 ++++++++++++++++++++++++++++++++++++---
> arch/x86/kvm/vmx/vmx.h    |  4 ++
> 2 files changed, 90 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 7b058d7b9fcc..cb2a92341eab 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -929,6 +929,37 @@ static u32 nested_vmx_load_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
> 	return i + 1;
> }
> 
> +static bool nested_vmx_get_vmexit_msr_value(struct kvm_vcpu *vcpu,
> +					    u32 msr_index,
> +					    u64 *data)
> +{
> +	struct vcpu_vmx *vmx = to_vmx(vcpu);
> +
> +	/*
> +	 * If the L0 hypervisor stored a more accurate value for the TSC that
> +	 * does not include the time taken for emulation of the L2->L1
> +	 * VM-exit in L0, use the more accurate value.
> +	 */
> +	if (msr_index == MSR_IA32_TSC) {
> +		int index = vmx_find_msr_index(&vmx->msr_autostore.guest,
> +					       MSR_IA32_TSC);
> +
> +		if (index >= 0) {
> +			u64 val = vmx->msr_autostore.guest.val[index].value;
> +

Nit: I would remove the new-line here.

> +			*data = kvm_read_l1_tsc(vcpu, val);
> +			return true;
> +		}
> +	}
> +
> +	if (kvm_get_msr(vcpu, msr_index, data)) {
> +		pr_debug_ratelimited("%s cannot read MSR (0x%x)\n", __func__,
> +			msr_index);
> +		return false;
> +	}
> +	return true;
> +}
> +
> static bool read_and_check_msr_entry(struct kvm_vcpu *vcpu, u64 gpa, int i,
> 				     struct vmx_msr_entry *e)
> {
> @@ -963,12 +994,9 @@ static int nested_vmx_store_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
> 		if (!read_and_check_msr_entry(vcpu, gpa, i, &e))
> 			return -EINVAL;
> 
> -		if (kvm_get_msr(vcpu, e.index, &data)) {
> -			pr_debug_ratelimited(
> -				"%s cannot read MSR (%u, 0x%x)\n",
> -				__func__, i, e.index);
> +		if (!nested_vmx_get_vmexit_msr_value(vcpu, e.index, &data))
> 			return -EINVAL;
> -		}
> +
> 		if (kvm_vcpu_write_guest(vcpu,
> 					 gpa + i * sizeof(e) +
> 					     offsetof(struct vmx_msr_entry, value),
> @@ -982,6 +1010,51 @@ static int nested_vmx_store_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
> 	return 0;
> }
> 
> +static bool nested_msr_store_list_has_msr(struct kvm_vcpu *vcpu, u32 msr_index)
> +{
> +	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> +	u32 count = vmcs12->vm_exit_msr_store_count;
> +	u64 gpa = vmcs12->vm_exit_msr_store_addr;
> +	struct vmx_msr_entry e;
> +	u32 i;
> +
> +	for (i = 0; i < count; i++) {
> +		if (!read_and_check_msr_entry(vcpu, gpa, i, &e))
> +			return false;
> +
> +		if (e.index == msr_index)
> +			return true;
> +	}
> +	return false;
> +}
> +
> +static void prepare_vmx_msr_autostore_list(struct kvm_vcpu *vcpu,
> +					   u32 msr_index)
> +{
> +	struct vcpu_vmx *vmx = to_vmx(vcpu);
> +	struct vmx_msrs *autostore = &vmx->msr_autostore.guest;
> +	int i = vmx_find_msr_index(autostore, msr_index);

Nit: I would rename “i” to “msr_autostore_index”.

> +	bool in_autostore_list = i >= 0;
> +	bool in_vmcs12_store_list;
> +	int last;
> +
> +	in_vmcs12_store_list = nested_msr_store_list_has_msr(vcpu, msr_index);

Nit: Why are “i” and “in_autostore_list” initialised at declaration but “in_vmcs12_store_list” initialised here?
I would prefer to first declare the variables and then have the method’s logic after the declarations.

> +
> +	if (in_vmcs12_store_list && !in_autostore_list) {
> +		if (autostore->nr == NR_MSR_ENTRIES) {
> +			pr_warn_ratelimited(
> +				"Not enough msr entries in msr_autostore.  Can't add msr %x\n",
> +				msr_index);
> +			return;

I think it’s worth adding a comment here explaining that we don’t fail the emulated VMEntry in this case
because on emulated VMExit we will just get a less accurate value from nested_vmx_get_vmexit_msr_value(), which uses kvm_get_msr()
instead of reading the value from the vmcs02 VMExit MSR-store, but we are still able to emulate VMEntry properly.
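
E.g. something along the lines of:

	/*
	 * Don't fail the emulated VMEntry.  Without the msr_autostore
	 * entry, nested_vmx_get_vmexit_msr_value() simply falls back to
	 * kvm_get_msr() on the emulated VMExit and returns a less
	 * accurate value; VMEntry itself can still be emulated properly.
	 */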

> +		}
> +		last = autostore->nr++;
> +		autostore->val[last].index = msr_index;
> +	} else if (!in_vmcs12_store_list && in_autostore_list) {
> +		last = --autostore->nr;
> +		autostore->val[i] = autostore->val[last];
> +	}
> +}
> +
> static bool nested_cr3_valid(struct kvm_vcpu *vcpu, unsigned long val)
> {
> 	unsigned long invalid_mask;
> @@ -2027,7 +2100,7 @@ static void prepare_vmcs02_constant_state(struct vcpu_vmx *vmx)
> 	 * addresses are constant (for vmcs02), the counts can change based
> 	 * on L2's behavior, e.g. switching to/from long mode.
> 	 */
> -	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 0);
> +	vmcs_write64(VM_EXIT_MSR_STORE_ADDR, __pa(vmx->msr_autostore.guest.val));
> 	vmcs_write64(VM_EXIT_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.host.val));
> 	vmcs_write64(VM_ENTRY_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.guest.val));
> 
> @@ -2294,6 +2367,13 @@ static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
> 		vmcs_write64(EOI_EXIT_BITMAP3, vmcs12->eoi_exit_bitmap3);
> 	}
> 
> +	/*
> +	 * Make sure the msr_autostore list is up to date before we set the
> +	 * count in the vmcs02.
> +	 */
> +	prepare_vmx_msr_autostore_list(&vmx->vcpu, MSR_IA32_TSC);
> +
> +	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, vmx->msr_autostore.guest.nr);
> 	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
> 	vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr);
> 
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 34b5fef603d8..0ab1562287af 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -230,6 +230,10 @@ struct vcpu_vmx {
> 		struct vmx_msrs host;
> 	} msr_autoload;
> 
> +	struct msr_autostore {
> +		struct vmx_msrs guest;
> +	} msr_autostore;
> +
> 	struct {
> 		int vm86_active;
> 		ulong save_rflags;
> --
> 2.24.0.rc1.363.gb1bccd3e3d-goog
> 


* Re: [PATCH v2 4/4] KVM: nVMX: Add support for capturing highest observable L2 TSC
  2019-11-05 21:58   ` Liran Alon
@ 2019-11-05 22:27     ` Jim Mattson
  0 siblings, 0 replies; 15+ messages in thread
From: Jim Mattson @ 2019-11-05 22:27 UTC (permalink / raw)
  To: Liran Alon; +Cc: Aaron Lewis, kvm list, Paolo Bonzini

On Tue, Nov 5, 2019 at 1:58 PM Liran Alon <liran.alon@oracle.com> wrote:
>
>
>
> > On 5 Nov 2019, at 21:19, Aaron Lewis <aaronlewis@google.com> wrote:
> >
> > The L1 hypervisor may include the IA32_TIME_STAMP_COUNTER MSR in the
> > vmcs12 MSR VM-exit MSR-store area as a way of determining the highest
> > TSC value that might have been observed by L2 prior to VM-exit. The
> > current implementation does not capture a very tight bound on this
> > value.  To tighten the bound, add the IA32_TIME_STAMP_COUNTER MSR to the
> > vmcs02 VM-exit MSR-store area whenever it appears in the vmcs12 VM-exit
> > MSR-store area.  When L0 processes the vmcs12 VM-exit MSR-store area
> > during the emulation of an L2->L1 VM-exit, special-case the
> > IA32_TIME_STAMP_COUNTER MSR, using the value stored in the vmcs02
> > VM-exit MSR-store area to derive the value to be stored in the vmcs12
> > VM-exit MSR-store area.
> >
> > Reviewed-by: Jim Mattson <jmattson@google.com>
> > Signed-off-by: Aaron Lewis <aaronlewis@google.com>
>
> The patch looks correct to me and I had only some minor style comments below.
> Reviewed-by: Liran Alon <liran.alon@oracle.com>
>
> I think you may also consider to separate this patch into two:
> First patch add all framework code without still using it specifically for MSR_IA32_TSC
> and a second patch to use the framework for MSR_IA32_TSC case.
>
> Just out of curiosity, may I ask which L1 hypervisor use this technique that you encountered this issue?

It's a proprietary type 2 hypervisor that runs on Linux.

> -Liran

* Re: [PATCH v2 4/4] KVM: nVMX: Add support for capturing highest observable L2 TSC
  2019-11-05 19:19 ` [PATCH v2 4/4] KVM: nVMX: Add support for capturing highest observable L2 TSC Aaron Lewis
  2019-11-05 21:58   ` Liran Alon
@ 2019-11-05 22:48   ` Sean Christopherson
  2019-11-06 18:54     ` Jim Mattson
  2019-11-08 18:31     ` Jim Mattson
  1 sibling, 2 replies; 15+ messages in thread
From: Sean Christopherson @ 2019-11-05 22:48 UTC (permalink / raw)
  To: Aaron Lewis; +Cc: kvm, Paolo Bonzini, Jim Mattson

On Tue, Nov 05, 2019 at 11:19:10AM -0800, Aaron Lewis wrote:
> The L1 hypervisor may include the IA32_TIME_STAMP_COUNTER MSR in the
> vmcs12 MSR VM-exit MSR-store area as a way of determining the highest
> TSC value that might have been observed by L2 prior to VM-exit. The
> current implementation does not capture a very tight bound on this
> value.  To tighten the bound, add the IA32_TIME_STAMP_COUNTER MSR to the
> vmcs02 VM-exit MSR-store area whenever it appears in the vmcs12 VM-exit
> MSR-store area.  When L0 processes the vmcs12 VM-exit MSR-store area
> during the emulation of an L2->L1 VM-exit, special-case the
> IA32_TIME_STAMP_COUNTER MSR, using the value stored in the vmcs02
> VM-exit MSR-store area to derive the value to be stored in the vmcs12
> VM-exit MSR-store area.

Given that this is a one-off case for a nested guest, is it really worth
adding the infrastructure to allow storing arbitrary MSRs on exit?  The
MSR list isn't any faster than plain ol' RDMSR, so the only use case is
likely limited to something like this, e.g. prior to this nested case, KVM
has existed for well over a decade without needing to store an MSR on
VM-Exit.

Making this a truly one-off case would eliminate most of the refactoring
and would avoid the bikeshedding in patch 2/4 over how to rename
NR_AUTOLOAD_MSRS (I hate the term "AUTO" for whatever reason).


E.g.:

prepare_vmcs02_constant_state()

	vmcs_write64(VM_EXIT_MSR_STORE_ADDR, __pa(vmx->nested.l2_tsc));

prepare_vmcs02_rare():

	if (nested_msr_store_list_has_msr(vcpu, MSR_IA32_TSC))
		vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 1);
	else
		vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 0);



> Reviewed-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Aaron Lewis <aaronlewis@google.com>
> ---
>  arch/x86/kvm/vmx/nested.c | 92 ++++++++++++++++++++++++++++++++++++---
>  arch/x86/kvm/vmx/vmx.h    |  4 ++
>  2 files changed, 90 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 7b058d7b9fcc..cb2a92341eab 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -929,6 +929,37 @@ static u32 nested_vmx_load_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
>  	return i + 1;
>  }
> 
> +static bool nested_vmx_get_vmexit_msr_value(struct kvm_vcpu *vcpu,
> +					    u32 msr_index,
> +					    u64 *data)
> +{
> +	struct vcpu_vmx *vmx = to_vmx(vcpu);
> +
> +	/*
> +	 * If the L0 hypervisor stored a more accurate value for the TSC that
> +	 * does not include the time taken for emulation of the L2->L1
> +	 * VM-exit in L0, use the more accurate value.
> +	 */
> +	if (msr_index == MSR_IA32_TSC) {
> +		int index = vmx_find_msr_index(&vmx->msr_autostore.guest,
> +					       MSR_IA32_TSC);
> +
> +		if (index >= 0) {
> +			u64 val = vmx->msr_autostore.guest.val[index].value;
> +
> +			*data = kvm_read_l1_tsc(vcpu, val);
> +			return true;
> +		}
> +	}
> +
> +	if (kvm_get_msr(vcpu, msr_index, data)) {
> +		pr_debug_ratelimited("%s cannot read MSR (0x%x)\n", __func__,
> +			msr_index);
> +		return false;
> +	}
> +	return true;
> +}
> +
>  static bool read_and_check_msr_entry(struct kvm_vcpu *vcpu, u64 gpa, int i,
>  				     struct vmx_msr_entry *e)
>  {
> @@ -963,12 +994,9 @@ static int nested_vmx_store_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
>  		if (!read_and_check_msr_entry(vcpu, gpa, i, &e))
>  			return -EINVAL;
> 
> -		if (kvm_get_msr(vcpu, e.index, &data)) {
> -			pr_debug_ratelimited(
> -				"%s cannot read MSR (%u, 0x%x)\n",
> -				__func__, i, e.index);
> +		if (!nested_vmx_get_vmexit_msr_value(vcpu, e.index, &data))
>  			return -EINVAL;
> -		}
> +
>  		if (kvm_vcpu_write_guest(vcpu,
>  					 gpa + i * sizeof(e) +
>  					     offsetof(struct vmx_msr_entry, value),
> @@ -982,6 +1010,51 @@ static int nested_vmx_store_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
>  	return 0;
>  }
> 
> +static bool nested_msr_store_list_has_msr(struct kvm_vcpu *vcpu, u32 msr_index)
> +{
> +	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> +	u32 count = vmcs12->vm_exit_msr_store_count;
> +	u64 gpa = vmcs12->vm_exit_msr_store_addr;
> +	struct vmx_msr_entry e;
> +	u32 i;
> +
> +	for (i = 0; i < count; i++) {
> +		if (!read_and_check_msr_entry(vcpu, gpa, i, &e))
> +			return false;
> +
> +		if (e.index == msr_index)
> +			return true;
> +	}
> +	return false;
> +}
> +
> +static void prepare_vmx_msr_autostore_list(struct kvm_vcpu *vcpu,
> +					   u32 msr_index)
> +{
> +	struct vcpu_vmx *vmx = to_vmx(vcpu);
> +	struct vmx_msrs *autostore = &vmx->msr_autostore.guest;
> +	int i = vmx_find_msr_index(autostore, msr_index);
> +	bool in_autostore_list = i >= 0;
> +	bool in_vmcs12_store_list;
> +	int last;
> +
> +	in_vmcs12_store_list = nested_msr_store_list_has_msr(vcpu, msr_index);
> +
> +	if (in_vmcs12_store_list && !in_autostore_list) {
> +		if (autostore->nr == NR_MSR_ENTRIES) {
> +			pr_warn_ratelimited(
> +				"Not enough msr entries in msr_autostore.  Can't add msr %x\n",
> +				msr_index);
> +			return;
> +		}
> +		last = autostore->nr++;
> +		autostore->val[last].index = msr_index;
> +	} else if (!in_vmcs12_store_list && in_autostore_list) {
> +		last = --autostore->nr;
> +		autostore->val[i] = autostore->val[last];
> +	}
> +}
> +
>  static bool nested_cr3_valid(struct kvm_vcpu *vcpu, unsigned long val)
>  {
>  	unsigned long invalid_mask;
> @@ -2027,7 +2100,7 @@ static void prepare_vmcs02_constant_state(struct vcpu_vmx *vmx)
>  	 * addresses are constant (for vmcs02), the counts can change based
>  	 * on L2's behavior, e.g. switching to/from long mode.
>  	 */
> -	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 0);
> +	vmcs_write64(VM_EXIT_MSR_STORE_ADDR, __pa(vmx->msr_autostore.guest.val));
>  	vmcs_write64(VM_EXIT_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.host.val));
>  	vmcs_write64(VM_ENTRY_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.guest.val));
> 
> @@ -2294,6 +2367,13 @@ static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
>  		vmcs_write64(EOI_EXIT_BITMAP3, vmcs12->eoi_exit_bitmap3);
>  	}
> 
> +	/*
> +	 * Make sure the msr_autostore list is up to date before we set the
> +	 * count in the vmcs02.
> +	 */
> +	prepare_vmx_msr_autostore_list(&vmx->vcpu, MSR_IA32_TSC);
> +
> +	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, vmx->msr_autostore.guest.nr);
>  	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
>  	vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr);
> 
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 34b5fef603d8..0ab1562287af 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -230,6 +230,10 @@ struct vcpu_vmx {
>  		struct vmx_msrs host;
>  	} msr_autoload;
> 
> +	struct msr_autostore {
> +		struct vmx_msrs guest;
> +	} msr_autostore;
> +
>  	struct {
>  		int vm86_active;
>  		ulong save_rflags;
> --
> 2.24.0.rc1.363.gb1bccd3e3d-goog
> 

* Re: [PATCH v2 4/4] KVM: nVMX: Add support for capturing highest observable L2 TSC
  2019-11-05 22:48   ` Sean Christopherson
@ 2019-11-06 18:54     ` Jim Mattson
  2019-11-08 18:31     ` Jim Mattson
  1 sibling, 0 replies; 15+ messages in thread
From: Jim Mattson @ 2019-11-06 18:54 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Aaron Lewis, kvm list, Paolo Bonzini

On Tue, Nov 5, 2019 at 2:49 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Tue, Nov 05, 2019 at 11:19:10AM -0800, Aaron Lewis wrote:
> > The L1 hypervisor may include the IA32_TIME_STAMP_COUNTER MSR in the
> > vmcs12 MSR VM-exit MSR-store area as a way of determining the highest
> > TSC value that might have been observed by L2 prior to VM-exit. The
> > current implementation does not capture a very tight bound on this
> > value.  To tighten the bound, add the IA32_TIME_STAMP_COUNTER MSR to the
> > vmcs02 VM-exit MSR-store area whenever it appears in the vmcs12 VM-exit
> > MSR-store area.  When L0 processes the vmcs12 VM-exit MSR-store area
> > during the emulation of an L2->L1 VM-exit, special-case the
> > IA32_TIME_STAMP_COUNTER MSR, using the value stored in the vmcs02
> > VM-exit MSR-store area to derive the value to be stored in the vmcs12
> > VM-exit MSR-store area.
>
> Given that this is a one-off case for a nested guest, is it really worth
> adding the infrastructure to allow storing arbitrary MSRs on exit?  The
> MSR list isn't any faster than plain ol' RDMSR, so the only use case is
> likely limited to something like this, e.g. prior to this nested case, KVM
> has existed for well over a decade without needing to store an MSR on
> VM-Exit.
>
> Making this a truly one-off case would eliminate most of the refactoring
> and would avoid the bikeshedding in patch 2/4 over how to rename
> NR_AUTOLOAD_MSRS (I hate the term "AUTO" for whatever reason).

I thought bikeshedding was de rigueur for kvm submissions. :-) And I'm
with you on "AUTO."

>
> E.g.:
>
> prepare_vmcs02_constant_state()
>
>         vmcs_write64(VM_EXIT_MSR_STORE_ADDR, __pa(vmx->nested.l2_tsc));
>
> prepare_vmcs02_rare():
>
>         if (nested_msr_store_list_has_msr(vcpu, MSR_IA32_TSC))
>                 vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 1);
>         else
>                 vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 0);
>
>

That seems like a bit of a hack, to be honest. (And if we're going
that way, why not continue down the slippery slope and abandon
nested_msr_store_list_has_msr for nested_msr_store_list_has_tsc?)

> > Reviewed-by: Jim Mattson <jmattson@google.com>
> > Signed-off-by: Aaron Lewis <aaronlewis@google.com>
> > ---
> >  arch/x86/kvm/vmx/nested.c | 92 ++++++++++++++++++++++++++++++++++++---
> >  arch/x86/kvm/vmx/vmx.h    |  4 ++
> >  2 files changed, 90 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index 7b058d7b9fcc..cb2a92341eab 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -929,6 +929,37 @@ static u32 nested_vmx_load_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
> >       return i + 1;
> >  }
> >
> > +static bool nested_vmx_get_vmexit_msr_value(struct kvm_vcpu *vcpu,
> > +                                         u32 msr_index,
> > +                                         u64 *data)
> > +{
> > +     struct vcpu_vmx *vmx = to_vmx(vcpu);
> > +
> > +     /*
> > +      * If the L0 hypervisor stored a more accurate value for the TSC that
> > +      * does not include the time taken for emulation of the L2->L1
> > +      * VM-exit in L0, use the more accurate value.
> > +      */
> > +     if (msr_index == MSR_IA32_TSC) {
> > +             int index = vmx_find_msr_index(&vmx->msr_autostore.guest,
> > +                                            MSR_IA32_TSC);
> > +
> > +             if (index >= 0) {
> > +                     u64 val = vmx->msr_autostore.guest.val[index].value;
> > +
> > +                     *data = kvm_read_l1_tsc(vcpu, val);
> > +                     return true;
> > +             }
> > +     }
> > +
> > +     if (kvm_get_msr(vcpu, msr_index, data)) {
> > +             pr_debug_ratelimited("%s cannot read MSR (0x%x)\n", __func__,
> > +                     msr_index);
> > +             return false;
> > +     }
> > +     return true;
> > +}
> > +
> >  static bool read_and_check_msr_entry(struct kvm_vcpu *vcpu, u64 gpa, int i,
> >                                    struct vmx_msr_entry *e)
> >  {
> > @@ -963,12 +994,9 @@ static int nested_vmx_store_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
> >               if (!read_and_check_msr_entry(vcpu, gpa, i, &e))
> >                       return -EINVAL;
> >
> > -             if (kvm_get_msr(vcpu, e.index, &data)) {
> > -                     pr_debug_ratelimited(
> > -                             "%s cannot read MSR (%u, 0x%x)\n",
> > -                             __func__, i, e.index);
> > +             if (!nested_vmx_get_vmexit_msr_value(vcpu, e.index, &data))
> >                       return -EINVAL;
> > -             }
> > +
> >               if (kvm_vcpu_write_guest(vcpu,
> >                                        gpa + i * sizeof(e) +
> >                                            offsetof(struct vmx_msr_entry, value),
> > @@ -982,6 +1010,51 @@ static int nested_vmx_store_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
> >       return 0;
> >  }
> >
> > +static bool nested_msr_store_list_has_msr(struct kvm_vcpu *vcpu, u32 msr_index)
> > +{
> > +     struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> > +     u32 count = vmcs12->vm_exit_msr_store_count;
> > +     u64 gpa = vmcs12->vm_exit_msr_store_addr;
> > +     struct vmx_msr_entry e;
> > +     u32 i;
> > +
> > +     for (i = 0; i < count; i++) {
> > +             if (!read_and_check_msr_entry(vcpu, gpa, i, &e))
> > +                     return false;
> > +
> > +             if (e.index == msr_index)
> > +                     return true;
> > +     }
> > +     return false;
> > +}
> > +
> > +static void prepare_vmx_msr_autostore_list(struct kvm_vcpu *vcpu,
> > +                                        u32 msr_index)
> > +{
> > +     struct vcpu_vmx *vmx = to_vmx(vcpu);
> > +     struct vmx_msrs *autostore = &vmx->msr_autostore.guest;
> > +     int i = vmx_find_msr_index(autostore, msr_index);
> > +     bool in_autostore_list = i >= 0;
> > +     bool in_vmcs12_store_list;
> > +     int last;
> > +
> > +     in_vmcs12_store_list = nested_msr_store_list_has_msr(vcpu, msr_index);
> > +
> > +     if (in_vmcs12_store_list && !in_autostore_list) {
> > +             if (autostore->nr == NR_MSR_ENTRIES) {
> > +                     pr_warn_ratelimited(
> > +                             "Not enough msr entries in msr_autostore.  Can't add msr %x\n",
> > +                             msr_index);
> > +                     return;
> > +             }
> > +             last = autostore->nr++;
> > +             autostore->val[last].index = msr_index;
> > +     } else if (!in_vmcs12_store_list && in_autostore_list) {
> > +             last = --autostore->nr;
> > +             autostore->val[i] = autostore->val[last];
> > +     }
> > +}
> > +
> >  static bool nested_cr3_valid(struct kvm_vcpu *vcpu, unsigned long val)
> >  {
> >       unsigned long invalid_mask;
> > @@ -2027,7 +2100,7 @@ static void prepare_vmcs02_constant_state(struct vcpu_vmx *vmx)
> >        * addresses are constant (for vmcs02), the counts can change based
> >        * on L2's behavior, e.g. switching to/from long mode.
> >        */
> > -     vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 0);
> > +     vmcs_write64(VM_EXIT_MSR_STORE_ADDR, __pa(vmx->msr_autostore.guest.val));
> >       vmcs_write64(VM_EXIT_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.host.val));
> >       vmcs_write64(VM_ENTRY_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.guest.val));
> >
> > @@ -2294,6 +2367,13 @@ static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
> >               vmcs_write64(EOI_EXIT_BITMAP3, vmcs12->eoi_exit_bitmap3);
> >       }
> >
> > +     /*
> > +      * Make sure the msr_autostore list is up to date before we set the
> > +      * count in the vmcs02.
> > +      */
> > +     prepare_vmx_msr_autostore_list(&vmx->vcpu, MSR_IA32_TSC);
> > +
> > +     vmcs_write32(VM_EXIT_MSR_STORE_COUNT, vmx->msr_autostore.guest.nr);
> >       vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
> >       vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr);
> >
> > diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> > index 34b5fef603d8..0ab1562287af 100644
> > --- a/arch/x86/kvm/vmx/vmx.h
> > +++ b/arch/x86/kvm/vmx/vmx.h
> > @@ -230,6 +230,10 @@ struct vcpu_vmx {
> >               struct vmx_msrs host;
> >       } msr_autoload;
> >
> > +     struct msr_autostore {
> > +             struct vmx_msrs guest;
> > +     } msr_autostore;
> > +
> >       struct {
> >               int vm86_active;
> >               ulong save_rflags;
> > --
> > 2.24.0.rc1.363.gb1bccd3e3d-goog
> >

* Re: [PATCH v2 3/4] kvm: vmx: Rename function find_msr() to vmx_find_msr_index()
  2019-11-05 21:31   ` Liran Alon
@ 2019-11-07  0:11     ` Jim Mattson
  2019-11-07  0:35       ` Liran Alon
  0 siblings, 1 reply; 15+ messages in thread
From: Jim Mattson @ 2019-11-07  0:11 UTC (permalink / raw)
  To: Liran Alon; +Cc: Aaron Lewis, kvm list, Paolo Bonzini

On Tue, Nov 5, 2019 at 1:31 PM Liran Alon <liran.alon@oracle.com> wrote:
>
>
>
> > On 5 Nov 2019, at 21:19, Aaron Lewis <aaronlewis@google.com> wrote:
> >
> > Rename function find_msr() to vmx_find_msr_index() to share
> > implementations between vmx.c and nested.c in an upcoming change.
> >
> > Reviewed-by: Jim Mattson <jmattson@google.com>
> > Signed-off-by: Aaron Lewis <aaronlewis@google.com>
> > ---
> > arch/x86/kvm/vmx/vmx.c | 10 +++++-----
> > arch/x86/kvm/vmx/vmx.h |  1 +
> > 2 files changed, 6 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index c0160ca9ddba..39c701730297 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -835,7 +835,7 @@ static void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx,
> >       vm_exit_controls_clearbit(vmx, exit);
> > }
> >
> > -static int find_msr(struct vmx_msrs *m, unsigned int msr)
> > +int vmx_find_msr_index(struct vmx_msrs *m, u32 msr)
>
> The change from static to non-static should happen in the next patch instead of this rename patch.
> Otherwise, if the next patch is reverted, compiling vmx.c will result in a warning.

What warning are you anticipating?

> The rest of the patch looks fine.
>
> -Liran
>
> > {
> >       unsigned int i;
> >
> > @@ -869,7 +869,7 @@ static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr)
> >               }
> >               break;
> >       }
> > -     i = find_msr(&m->guest, msr);
> > +     i = vmx_find_msr_index(&m->guest, msr);
> >       if (i < 0)
> >               goto skip_guest;
> >       --m->guest.nr;
> > @@ -877,7 +877,7 @@ static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr)
> >       vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->guest.nr);
> >
> > skip_guest:
> > -     i = find_msr(&m->host, msr);
> > +     i = vmx_find_msr_index(&m->host, msr);
> >       if (i < 0)
> >               return;
> >
> > @@ -936,9 +936,9 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
> >               wrmsrl(MSR_IA32_PEBS_ENABLE, 0);
> >       }
> >
> > -     i = find_msr(&m->guest, msr);
> > +     i = vmx_find_msr_index(&m->guest, msr);
> >       if (!entry_only)
> > -             j = find_msr(&m->host, msr);
> > +             j = vmx_find_msr_index(&m->host, msr);
> >
> >       if ((i < 0 && m->guest.nr == NR_MSR_ENTRIES) ||
> >               (j < 0 &&  m->host.nr == NR_MSR_ENTRIES)) {
> > diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> > index 0c6835bd6945..34b5fef603d8 100644
> > --- a/arch/x86/kvm/vmx/vmx.h
> > +++ b/arch/x86/kvm/vmx/vmx.h
> > @@ -334,6 +334,7 @@ void vmx_set_virtual_apic_mode(struct kvm_vcpu *vcpu);
> > struct shared_msr_entry *find_msr_entry(struct vcpu_vmx *vmx, u32 msr);
> > void pt_update_intercept_for_msr(struct vcpu_vmx *vmx);
> > void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp);
> > +int vmx_find_msr_index(struct vmx_msrs *m, u32 msr);
> >
> > #define POSTED_INTR_ON  0
> > #define POSTED_INTR_SN  1
> > --
> > 2.24.0.rc1.363.gb1bccd3e3d-goog
> >
>

* Re: [PATCH v2 3/4] kvm: vmx: Rename function find_msr() to vmx_find_msr_index()
  2019-11-07  0:11     ` Jim Mattson
@ 2019-11-07  0:35       ` Liran Alon
  0 siblings, 0 replies; 15+ messages in thread
From: Liran Alon @ 2019-11-07  0:35 UTC (permalink / raw)
  To: Jim Mattson; +Cc: Aaron Lewis, kvm list, Paolo Bonzini



> On 7 Nov 2019, at 2:11, Jim Mattson <jmattson@google.com> wrote:
> 
> On Tue, Nov 5, 2019 at 1:31 PM Liran Alon <liran.alon@oracle.com> wrote:
>> 
>> 
>> 
>>> On 5 Nov 2019, at 21:19, Aaron Lewis <aaronlewis@google.com> wrote:
>>> 
>>> Rename function find_msr() to vmx_find_msr_index() to share
>>> implementations between vmx.c and nested.c in an upcoming change.
>>> 
>>> Reviewed-by: Jim Mattson <jmattson@google.com>
>>> Signed-off-by: Aaron Lewis <aaronlewis@google.com>
>>> ---
>>> arch/x86/kvm/vmx/vmx.c | 10 +++++-----
>>> arch/x86/kvm/vmx/vmx.h |  1 +
>>> 2 files changed, 6 insertions(+), 5 deletions(-)
>>> 
>>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>>> index c0160ca9ddba..39c701730297 100644
>>> --- a/arch/x86/kvm/vmx/vmx.c
>>> +++ b/arch/x86/kvm/vmx/vmx.c
>>> @@ -835,7 +835,7 @@ static void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx,
>>>      vm_exit_controls_clearbit(vmx, exit);
>>> }
>>> 
>>> -static int find_msr(struct vmx_msrs *m, unsigned int msr)
>>> +int vmx_find_msr_index(struct vmx_msrs *m, u32 msr)
>> 
>> The change from static to non-static should happen in the next patch instead of this rename patch.
>> Otherwise, if the next patch is reverted, compiling vmx.c will result in a warning.
> 
> What warning are you anticipating?

Sorry, right, this doesn’t produce a warning.
However, the comment still stands that the function should change from static to non-static only once it’s needed outside of the source file,
which happens in the next patch. Just as a good practice.

> 
>> The rest of the patch looks fine.
>> 
>> -Liran
>> 
>>> {
>>>      unsigned int i;
>>> 
>>> @@ -869,7 +869,7 @@ static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr)
>>>              }
>>>              break;
>>>      }
>>> -     i = find_msr(&m->guest, msr);
>>> +     i = vmx_find_msr_index(&m->guest, msr);
>>>      if (i < 0)
>>>              goto skip_guest;
>>>      --m->guest.nr;
>>> @@ -877,7 +877,7 @@ static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr)
>>>      vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->guest.nr);
>>> 
>>> skip_guest:
>>> -     i = find_msr(&m->host, msr);
>>> +     i = vmx_find_msr_index(&m->host, msr);
>>>      if (i < 0)
>>>              return;
>>> 
>>> @@ -936,9 +936,9 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
>>>              wrmsrl(MSR_IA32_PEBS_ENABLE, 0);
>>>      }
>>> 
>>> -     i = find_msr(&m->guest, msr);
>>> +     i = vmx_find_msr_index(&m->guest, msr);
>>>      if (!entry_only)
>>> -             j = find_msr(&m->host, msr);
>>> +             j = vmx_find_msr_index(&m->host, msr);
>>> 
>>>      if ((i < 0 && m->guest.nr == NR_MSR_ENTRIES) ||
>>>              (j < 0 &&  m->host.nr == NR_MSR_ENTRIES)) {
>>> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
>>> index 0c6835bd6945..34b5fef603d8 100644
>>> --- a/arch/x86/kvm/vmx/vmx.h
>>> +++ b/arch/x86/kvm/vmx/vmx.h
>>> @@ -334,6 +334,7 @@ void vmx_set_virtual_apic_mode(struct kvm_vcpu *vcpu);
>>> struct shared_msr_entry *find_msr_entry(struct vcpu_vmx *vmx, u32 msr);
>>> void pt_update_intercept_for_msr(struct vcpu_vmx *vmx);
>>> void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp);
>>> +int vmx_find_msr_index(struct vmx_msrs *m, u32 msr);
>>> 
>>> #define POSTED_INTR_ON  0
>>> #define POSTED_INTR_SN  1
>>> --
>>> 2.24.0.rc1.363.gb1bccd3e3d-goog
>>> 
>> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 4/4] KVM: nVMX: Add support for capturing highest observable L2 TSC
  2019-11-05 22:48   ` Sean Christopherson
  2019-11-06 18:54     ` Jim Mattson
@ 2019-11-08 18:31     ` Jim Mattson
  1 sibling, 0 replies; 15+ messages in thread
From: Jim Mattson @ 2019-11-08 18:31 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Aaron Lewis, kvm list, Paolo Bonzini

On Tue, Nov 5, 2019 at 2:49 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Tue, Nov 05, 2019 at 11:19:10AM -0800, Aaron Lewis wrote:
> > The L1 hypervisor may include the IA32_TIME_STAMP_COUNTER MSR in the
> > vmcs12 MSR VM-exit MSR-store area as a way of determining the highest
> > TSC value that might have been observed by L2 prior to VM-exit. The
> > current implementation does not capture a very tight bound on this
> > value.  To tighten the bound, add the IA32_TIME_STAMP_COUNTER MSR to the
> > vmcs02 VM-exit MSR-store area whenever it appears in the vmcs12 VM-exit
> > MSR-store area.  When L0 processes the vmcs12 VM-exit MSR-store area
> > during the emulation of an L2->L1 VM-exit, special-case the
> > IA32_TIME_STAMP_COUNTER MSR, using the value stored in the vmcs02
> > VM-exit MSR-store area to derive the value to be stored in the vmcs12
> > VM-exit MSR-store area.
>
> Given that this is a one-off case for a nested guest, is it really worth
> adding the infrastructure to allow storing arbitrary MSRs on exit?  The
> MSR list isn't any faster than plain ol' RDMSR, so the only use case is
> likely limited to something like this, e.g. prior to this nested case, KVM
> has existed for well over a decade without needing to store an MSR on
> VM-Exit.

Actually, I can envision using the VM-exit MSR-store list for PMCs in
the future, if we ever implement true PMU multiplexing between the
host and the guest. But maybe that's a pipe dream.
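If that ever materializes, the plumbing this series adds looks reusable as-is; a minimal, purely hypothetical sketch (assuming MSR_IA32_PMC0 is the counter of interest, and reusing the structures and helpers from the quoted patch below):

	struct vmx_msrs *autostore = &vmx->msr_autostore.guest;

	/*
	 * Hypothetical host-driven capture: record guest PMC0 at VM-exit by
	 * appending it to the same autostore list the series uses for the TSC.
	 */
	if (vmx_find_msr_index(autostore, MSR_IA32_PMC0) < 0 &&
	    autostore->nr < NR_MSR_ENTRIES)
		autostore->val[autostore->nr++].index = MSR_IA32_PMC0;
	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, autostore->nr);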

> Making this a truly one-off case would eliminate most of the refactoring
> and would avoid the bikeshedding in patch 2/4 over how to rename
> NR_AUTOLOAD_MSRS (I hate the term "AUTO" for whatever reason).
>
>
> E.g.:
>
> prepare_vmcs02_constant_state()
>
>         vmcs_write64(VM_EXIT_MSR_STORE_ADDR, __pa(vmx->nested.l2_tsc));
>
> prepare_vmcs02_rare():
>
>         if (nested_msr_store_list_has_msr(vcpu, MSR_IA32_TSC))
>                 vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 1);
>         else
>                 vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 0);
>
>
>
> > Reviewed-by: Jim Mattson <jmattson@google.com>
> > Signed-off-by: Aaron Lewis <aaronlewis@google.com>
> > ---
> >  arch/x86/kvm/vmx/nested.c | 92 ++++++++++++++++++++++++++++++++++++---
> >  arch/x86/kvm/vmx/vmx.h    |  4 ++
> >  2 files changed, 90 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index 7b058d7b9fcc..cb2a92341eab 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -929,6 +929,37 @@ static u32 nested_vmx_load_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
> >       return i + 1;
> >  }
> >
> > +static bool nested_vmx_get_vmexit_msr_value(struct kvm_vcpu *vcpu,
> > +                                         u32 msr_index,
> > +                                         u64 *data)
> > +{
> > +     struct vcpu_vmx *vmx = to_vmx(vcpu);
> > +
> > +     /*
> > +      * If the L0 hypervisor stored a more accurate value for the TSC that
> > +      * does not include the time taken for emulation of the L2->L1
> > +      * VM-exit in L0, use the more accurate value.
> > +      */
> > +     if (msr_index == MSR_IA32_TSC) {
> > +             int index = vmx_find_msr_index(&vmx->msr_autostore.guest,
> > +                                            MSR_IA32_TSC);
> > +
> > +             if (index >= 0) {
> > +                     u64 val = vmx->msr_autostore.guest.val[index].value;
> > +
> > +                     *data = kvm_read_l1_tsc(vcpu, val);
> > +                     return true;
> > +             }
> > +     }
> > +
> > +     if (kvm_get_msr(vcpu, msr_index, data)) {
> > +             pr_debug_ratelimited("%s cannot read MSR (0x%x)\n", __func__,
> > +                     msr_index);
> > +             return false;
> > +     }
> > +     return true;
> > +}
> > +
> >  static bool read_and_check_msr_entry(struct kvm_vcpu *vcpu, u64 gpa, int i,
> >                                    struct vmx_msr_entry *e)
> >  {
> > @@ -963,12 +994,9 @@ static int nested_vmx_store_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
> >               if (!read_and_check_msr_entry(vcpu, gpa, i, &e))
> >                       return -EINVAL;
> >
> > -             if (kvm_get_msr(vcpu, e.index, &data)) {
> > -                     pr_debug_ratelimited(
> > -                             "%s cannot read MSR (%u, 0x%x)\n",
> > -                             __func__, i, e.index);
> > +             if (!nested_vmx_get_vmexit_msr_value(vcpu, e.index, &data))
> >                       return -EINVAL;
> > -             }
> > +
> >               if (kvm_vcpu_write_guest(vcpu,
> >                                        gpa + i * sizeof(e) +
> >                                            offsetof(struct vmx_msr_entry, value),
> > @@ -982,6 +1010,51 @@ static int nested_vmx_store_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
> >       return 0;
> >  }
> >
> > +static bool nested_msr_store_list_has_msr(struct kvm_vcpu *vcpu, u32 msr_index)
> > +{
> > +     struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> > +     u32 count = vmcs12->vm_exit_msr_store_count;
> > +     u64 gpa = vmcs12->vm_exit_msr_store_addr;
> > +     struct vmx_msr_entry e;
> > +     u32 i;
> > +
> > +     for (i = 0; i < count; i++) {
> > +             if (!read_and_check_msr_entry(vcpu, gpa, i, &e))
> > +                     return false;
> > +
> > +             if (e.index == msr_index)
> > +                     return true;
> > +     }
> > +     return false;
> > +}
> > +
> > +static void prepare_vmx_msr_autostore_list(struct kvm_vcpu *vcpu,
> > +                                        u32 msr_index)
> > +{
> > +     struct vcpu_vmx *vmx = to_vmx(vcpu);
> > +     struct vmx_msrs *autostore = &vmx->msr_autostore.guest;
> > +     int i = vmx_find_msr_index(autostore, msr_index);
> > +     bool in_autostore_list = i >= 0;
> > +     bool in_vmcs12_store_list;
> > +     int last;
> > +
> > +     in_vmcs12_store_list = nested_msr_store_list_has_msr(vcpu, msr_index);
> > +
> > +     if (in_vmcs12_store_list && !in_autostore_list) {
> > +             if (autostore->nr == NR_MSR_ENTRIES) {
> > +                     pr_warn_ratelimited(
> > +                             "Not enough msr entries in msr_autostore.  Can't add msr %x\n",
> > +                             msr_index);
> > +                     return;
> > +             }
> > +             last = autostore->nr++;
> > +             autostore->val[last].index = msr_index;
> > +     } else if (!in_vmcs12_store_list && in_autostore_list) {
> > +             last = --autostore->nr;
> > +             autostore->val[i] = autostore->val[last];
> > +     }
> > +}
> > +
> >  static bool nested_cr3_valid(struct kvm_vcpu *vcpu, unsigned long val)
> >  {
> >       unsigned long invalid_mask;
> > @@ -2027,7 +2100,7 @@ static void prepare_vmcs02_constant_state(struct vcpu_vmx *vmx)
> >        * addresses are constant (for vmcs02), the counts can change based
> >        * on L2's behavior, e.g. switching to/from long mode.
> >        */
> > -     vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 0);
> > +     vmcs_write64(VM_EXIT_MSR_STORE_ADDR, __pa(vmx->msr_autostore.guest.val));
> >       vmcs_write64(VM_EXIT_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.host.val));
> >       vmcs_write64(VM_ENTRY_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.guest.val));
> >
> > @@ -2294,6 +2367,13 @@ static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
> >               vmcs_write64(EOI_EXIT_BITMAP3, vmcs12->eoi_exit_bitmap3);
> >       }
> >
> > +     /*
> > +      * Make sure the msr_autostore list is up to date before we set the
> > +      * count in the vmcs02.
> > +      */
> > +     prepare_vmx_msr_autostore_list(&vmx->vcpu, MSR_IA32_TSC);
> > +
> > +     vmcs_write32(VM_EXIT_MSR_STORE_COUNT, vmx->msr_autostore.guest.nr);
> >       vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
> >       vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr);
> >
> > diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> > index 34b5fef603d8..0ab1562287af 100644
> > --- a/arch/x86/kvm/vmx/vmx.h
> > +++ b/arch/x86/kvm/vmx/vmx.h
> > @@ -230,6 +230,10 @@ struct vcpu_vmx {
> >               struct vmx_msrs host;
> >       } msr_autoload;
> >
> > +     struct msr_autostore {
> > +             struct vmx_msrs guest;
> > +     } msr_autostore;
> > +
> >       struct {
> >               int vm86_active;
> >               ulong save_rflags;
> > --
> > 2.24.0.rc1.363.gb1bccd3e3d-goog
> >
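For concreteness, a minimal sketch of the one-off alternative suggested above (hypothetical l2_tsc field in struct nested_vmx; the series as posted keeps the generic msr_autostore list instead):

	/*
	 * Hypothetical single-entry VM-exit MSR-store area.  The SDM requires
	 * the store address to be 16-byte aligned, hence the explicit
	 * alignment on the field.
	 */
	struct vmx_msr_entry l2_tsc __aligned(16);	/* in struct nested_vmx */

	/* prepare_vmcs02_constant_state(): point vmcs02 at the single slot. */
	vmx->nested.l2_tsc.index = MSR_IA32_TSC;
	vmcs_write64(VM_EXIT_MSR_STORE_ADDR, __pa(&vmx->nested.l2_tsc));

	/*
	 * prepare_vmcs02_rare(): arm the slot only when the vmcs12 VM-exit
	 * MSR-store list actually contains IA32_TIME_STAMP_COUNTER.
	 */
	vmcs_write32(VM_EXIT_MSR_STORE_COUNT,
		     nested_msr_store_list_has_msr(vcpu, MSR_IA32_TSC) ? 1 : 0);

Either way, the value the hardware saves there is a raw host TSC, so the emulated L2->L1 VM-exit still needs the kvm_read_l1_tsc() translation shown in the patch above (roughly, L1's TSC offset applied on top of the scaled hardware value).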

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2019-11-08 18:31 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-05 19:19 [PATCH v2 0/4] Add support for capturing the highest observable L2 TSC Aaron Lewis
2019-11-05 19:19 ` [PATCH v2 1/4] kvm: nested: Introduce read_and_check_msr_entry() Aaron Lewis
2019-11-05 21:27   ` Liran Alon
2019-11-05 19:19 ` [PATCH v2 2/4] kvm: vmx: Rename NR_AUTOLOAD_MSRS to NR_MSR_ENTRIES Aaron Lewis
2019-11-05 21:28   ` Liran Alon
2019-11-05 19:19 ` [PATCH v2 3/4] kvm: vmx: Rename function find_msr() to vmx_find_msr_index() Aaron Lewis
2019-11-05 21:31   ` Liran Alon
2019-11-07  0:11     ` Jim Mattson
2019-11-07  0:35       ` Liran Alon
2019-11-05 19:19 ` [PATCH v2 4/4] KVM: nVMX: Add support for capturing highest observable L2 TSC Aaron Lewis
2019-11-05 21:58   ` Liran Alon
2019-11-05 22:27     ` Jim Mattson
2019-11-05 22:48   ` Sean Christopherson
2019-11-06 18:54     ` Jim Mattson
2019-11-08 18:31     ` Jim Mattson
