* [PATCH 0/3] KVM: x86: Hyper-V invariant TSC control feature
@ 2022-07-13 15:05 Vitaly Kuznetsov
  2022-07-13 15:05 ` [PATCH 1/3] KVM: x86: Hyper-V invariant TSC control Vitaly Kuznetsov
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-13 15:05 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Wanpeng Li, Jim Mattson, Maxim Levitsky, linux-kernel

Normally, genuine Hyper-V doesn't expose architectural invariant TSC
(CPUID.80000007H:EDX[8]) to its guests by default. A special PV MSR
(HV_X64_MSR_TSC_INVARIANT_CONTROL, 0x40000118) and corresponding CPUID
feature bit (CPUID.0x40000003.EAX[15]) were introduced. When bit 0 of the
PV MSR is set, the invariant TSC bit starts to show up in CPUID. When the
feature is exposed to Hyper-V guests, reenlightenment becomes unnecessary.

Note: strictly speaking, KVM doesn't have to implement the feature, as
exposing the raw invariant TSC bit (CPUID.80000007H:EDX[8]) also seems to
work for modern Windows versions. The feature is, however, tiny and
straightforward, and it gives additional flexibility, so why not.
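The opt-in mechanism the cover letter describes can be condensed into a
minimal, self-contained C sketch. The struct and helper names here are
illustrative, not KVM's actual types; only the two bit positions are taken
from the text above:

```c
#include <stdbool.h>
#include <stdint.h>

#define HV_ACCESS_TSC_INVARIANT (1u << 15)  /* CPUID.0x40000003.EAX[15] */
#define CPUID_INVTSC            (1u << 8)   /* CPUID.80000007H:EDX[8] */

/* Illustrative partition-wide state: the PV MSR plus cached HV CPUID. */
struct hv_model {
	uint64_t tsc_invariant_control;  /* HV_X64_MSR_TSC_INVARIANT_CONTROL */
	uint32_t features_eax;           /* guest's CPUID.0x40000003.EAX */
};

/* Hide INVTSC until the guest opts in by setting bit 0 of the PV MSR. */
static bool invtsc_filtered(const struct hv_model *hv)
{
	if (hv->features_eax & HV_ACCESS_TSC_INVARIANT)
		return !(hv->tsc_invariant_control & 1);
	return false;  /* feature not exposed: CPUID is left untouched */
}

static uint32_t cpuid_80000007_edx(const struct hv_model *hv, uint32_t host_edx)
{
	return invtsc_filtered(hv) ? host_edx & ~CPUID_INVTSC : host_edx;
}
```

With the access bit exposed but the MSR still zero, INVTSC is filtered out
of CPUID; once the guest writes 1 to the MSR, the raw bit shows through.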

Vitaly Kuznetsov (3):
  KVM: x86: Hyper-V invariant TSC control
  KVM: selftests: Fix wrmsr_safe()
  KVM: selftests: Test Hyper-V invariant TSC control

 arch/x86/include/asm/kvm_host.h               |  1 +
 arch/x86/kvm/cpuid.c                          |  7 ++
 arch/x86/kvm/hyperv.c                         | 19 +++++
 arch/x86/kvm/hyperv.h                         | 15 ++++
 arch/x86/kvm/x86.c                            |  4 +-
 .../selftests/kvm/include/x86_64/processor.h  |  2 +-
 .../selftests/kvm/x86_64/hyperv_features.c    | 73 ++++++++++++++++++-
 7 files changed, 115 insertions(+), 6 deletions(-)

-- 
2.35.3



* [PATCH 1/3] KVM: x86: Hyper-V invariant TSC control
  2022-07-13 15:05 [PATCH 0/3] KVM: x86: Hyper-V invariant TSC control feature Vitaly Kuznetsov
@ 2022-07-13 15:05 ` Vitaly Kuznetsov
  2022-07-14  9:25   ` Maxim Levitsky
  2022-07-13 15:05 ` [PATCH 2/3] KVM: selftests: Fix wrmsr_safe() Vitaly Kuznetsov
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-13 15:05 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Wanpeng Li, Jim Mattson, Maxim Levitsky, linux-kernel

Normally, genuine Hyper-V doesn't expose architectural invariant TSC
(CPUID.80000007H:EDX[8]) to its guests by default. A special PV MSR
(HV_X64_MSR_TSC_INVARIANT_CONTROL, 0x40000118) and corresponding CPUID
feature bit (CPUID.0x40000003.EAX[15]) were introduced. When bit 0 of the
PV MSR is set, the invariant TSC bit starts to show up in CPUID. When the
feature is exposed to Hyper-V guests, reenlightenment becomes unnecessary.

Add the feature to KVM. Keep CPUID output intact when the feature
wasn't exposed to L1, and implement the required logic for hiding
invariant TSC when the feature was exposed but the invariant TSC control
MSR wasn't written to. Copy genuine Hyper-V behavior and forbid
disabling the feature once it has been enabled.

For reference, for Linux guests, support for the feature was added in
commit dce7cd62754b ("x86/hyperv: Allow guests to enable InvariantTSC").
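The write-side rules described above (only bit 0 is defined, and a guest
cannot clear the bit once set, though the host can) can be sketched as a
small stand-alone model. `hv_set_invtsc_control()` is an illustrative name,
and the 0/1 return value mimics KVM's "return 1 to raise #GP" convention:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the HV_X64_MSR_TSC_INVARIANT_CONTROL write path: returns 1
 * (i.e. #GP) for reserved bits or a guest-initiated disable, 0 on success.
 */
static int hv_set_invtsc_control(uint64_t *hv_invtsc, uint64_t data, bool host)
{
	if (data & ~1ULL)                 /* only bit 0 is supported */
		return 1;
	if (!host && *hv_invtsc && !data) /* guest can't disable once enabled */
		return 1;
	*hv_invtsc = data;
	return 0;
}
```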

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/cpuid.c            |  7 +++++++
 arch/x86/kvm/hyperv.c           | 19 +++++++++++++++++++
 arch/x86/kvm/hyperv.h           | 15 +++++++++++++++
 arch/x86/kvm/x86.c              |  4 +++-
 5 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index de5a149d0971..88553f0b524c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1022,6 +1022,7 @@ struct kvm_hv {
 	u64 hv_reenlightenment_control;
 	u64 hv_tsc_emulation_control;
 	u64 hv_tsc_emulation_status;
+	u64 hv_invtsc;
 
 	/* How many vCPUs have VP index != vCPU index */
 	atomic_t num_mismatched_vp_indexes;
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index d47222ab8e6e..788df2eb1ec4 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -1404,6 +1404,13 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
 			    (data & TSX_CTRL_CPUID_CLEAR))
 				*ebx &= ~(F(RTM) | F(HLE));
 		}
+		/*
+		 * Filter out invariant TSC (CPUID.80000007H:EDX[8]) for Hyper-V
+		 * guests if needed.
+		 */
+		if (function == 0x80000007 && kvm_hv_invtsc_filtered(vcpu))
+			*edx &= ~(1 << 8);
+
 	} else {
 		*eax = *ebx = *ecx = *edx = 0;
 		/*
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index e2e95a6fccfd..0d8e6526a839 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -991,6 +991,7 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
 	case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
 	case HV_X64_MSR_TSC_EMULATION_CONTROL:
 	case HV_X64_MSR_TSC_EMULATION_STATUS:
+	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
 	case HV_X64_MSR_SYNDBG_OPTIONS:
 	case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
 		r = true;
@@ -1275,6 +1276,9 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv *hv_vcpu, u32 msr)
 	case HV_X64_MSR_TSC_EMULATION_STATUS:
 		return hv_vcpu->cpuid_cache.features_eax &
 			HV_ACCESS_REENLIGHTENMENT;
+	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
+		return hv_vcpu->cpuid_cache.features_eax &
+			HV_ACCESS_TSC_INVARIANT;
 	case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
 	case HV_X64_MSR_CRASH_CTL:
 		return hv_vcpu->cpuid_cache.features_edx &
@@ -1402,6 +1406,17 @@ static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
 		if (!host)
 			return 1;
 		break;
+	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
+		/* Only bit 0 is supported */
+		if (data & ~BIT_ULL(0))
+			return 1;
+
+		/* The feature can't be disabled from the guest */
+		if (!host && hv->hv_invtsc && !data)
+			return 1;
+
+		hv->hv_invtsc = data;
+		break;
 	case HV_X64_MSR_SYNDBG_OPTIONS:
 	case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
 		return syndbg_set_msr(vcpu, msr, data, host);
@@ -1577,6 +1592,9 @@ static int kvm_hv_get_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
 	case HV_X64_MSR_TSC_EMULATION_STATUS:
 		data = hv->hv_tsc_emulation_status;
 		break;
+	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
+		data = hv->hv_invtsc;
+		break;
 	case HV_X64_MSR_SYNDBG_OPTIONS:
 	case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
 		return syndbg_get_msr(vcpu, msr, pdata, host);
@@ -2497,6 +2515,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
 			ent->eax |= HV_MSR_REFERENCE_TSC_AVAILABLE;
 			ent->eax |= HV_ACCESS_FREQUENCY_MSRS;
 			ent->eax |= HV_ACCESS_REENLIGHTENMENT;
+			ent->eax |= HV_ACCESS_TSC_INVARIANT;
 
 			ent->ebx |= HV_POST_MESSAGES;
 			ent->ebx |= HV_SIGNAL_EVENTS;
diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
index da2737f2a956..1a6316ab55eb 100644
--- a/arch/x86/kvm/hyperv.h
+++ b/arch/x86/kvm/hyperv.h
@@ -133,6 +133,21 @@ static inline bool kvm_hv_has_stimer_pending(struct kvm_vcpu *vcpu)
 			     HV_SYNIC_STIMER_COUNT);
 }
 
+/*
+ * With HV_ACCESS_TSC_INVARIANT feature, invariant TSC (CPUID.80000007H:EDX[8])
+ * is only observed after HV_X64_MSR_TSC_INVARIANT_CONTROL was written to.
+ */
+static inline bool kvm_hv_invtsc_filtered(struct kvm_vcpu *vcpu)
+{
+	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
+	struct kvm_hv *hv = to_kvm_hv(vcpu->kvm);
+
+	if (hv_vcpu && hv_vcpu->cpuid_cache.features_eax & HV_ACCESS_TSC_INVARIANT)
+		return !hv->hv_invtsc;
+
+	return false;
+}
+
 void kvm_hv_process_stimers(struct kvm_vcpu *vcpu);
 
 void kvm_hv_setup_tsc_page(struct kvm *kvm,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 567d13405445..322e0a544823 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1466,7 +1466,7 @@ static const u32 emulated_msrs_all[] = {
 	HV_X64_MSR_STIMER0_CONFIG,
 	HV_X64_MSR_VP_ASSIST_PAGE,
 	HV_X64_MSR_REENLIGHTENMENT_CONTROL, HV_X64_MSR_TSC_EMULATION_CONTROL,
-	HV_X64_MSR_TSC_EMULATION_STATUS,
+	HV_X64_MSR_TSC_EMULATION_STATUS, HV_X64_MSR_TSC_INVARIANT_CONTROL,
 	HV_X64_MSR_SYNDBG_OPTIONS,
 	HV_X64_MSR_SYNDBG_CONTROL, HV_X64_MSR_SYNDBG_STATUS,
 	HV_X64_MSR_SYNDBG_SEND_BUFFER, HV_X64_MSR_SYNDBG_RECV_BUFFER,
@@ -3769,6 +3769,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
 	case HV_X64_MSR_TSC_EMULATION_CONTROL:
 	case HV_X64_MSR_TSC_EMULATION_STATUS:
+	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
 		return kvm_hv_set_msr_common(vcpu, msr, data,
 					     msr_info->host_initiated);
 	case MSR_IA32_BBL_CR_CTL3:
@@ -4139,6 +4140,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
 	case HV_X64_MSR_TSC_EMULATION_CONTROL:
 	case HV_X64_MSR_TSC_EMULATION_STATUS:
+	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
 		return kvm_hv_get_msr_common(vcpu,
 					     msr_info->index, &msr_info->data,
 					     msr_info->host_initiated);
-- 
2.35.3



* [PATCH 2/3] KVM: selftests: Fix wrmsr_safe()
  2022-07-13 15:05 [PATCH 0/3] KVM: x86: Hyper-V invariant TSC control feature Vitaly Kuznetsov
  2022-07-13 15:05 ` [PATCH 1/3] KVM: x86: Hyper-V invariant TSC control Vitaly Kuznetsov
@ 2022-07-13 15:05 ` Vitaly Kuznetsov
  2022-07-13 15:41   ` Sean Christopherson
  2022-07-13 15:05 ` [PATCH 3/3] KVM: selftests: Test Hyper-V invariant TSC control Vitaly Kuznetsov
  2022-07-14  9:24 ` [PATCH 0/3] KVM: x86: Hyper-V invariant TSC control feature Maxim Levitsky
  3 siblings, 1 reply; 13+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-13 15:05 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Wanpeng Li, Jim Mattson, Maxim Levitsky, linux-kernel

It seems to be a misconception that "A" places a u64 operand into
EAX:EDX, at least with GCC 11.

While writing a new test, I've noticed that wrmsr_safe() tries putting
garbage into the upper bits of the MSR, e.g.:

  kvm_exit:             reason MSR_WRITE rip 0x402919 info 0 0
  kvm_msr:              msr_write 40000118 = 0x60000000001 (#GP)
...
when it was supposed to write '1'. Apparently, "A" works the same as
"a" rather than as EAX:EDX. Here's the relevant disassembled part:

With "A":

	48 8b 43 08          	mov    0x8(%rbx),%rax
	49 b9 ba da ca ba 0a 	movabs $0xabacadaba,%r9
	00 00 00
	4c 8d 15 07 00 00 00 	lea    0x7(%rip),%r10        # 402f44 <guest_msr+0x34>
	4c 8d 1d 06 00 00 00 	lea    0x6(%rip),%r11        # 402f4a <guest_msr+0x3a>
	0f 30                	wrmsr

With "a"/"d":

	48 8b 43 08          	mov    0x8(%rbx),%rax
	48 89 c2             	mov    %rax,%rdx
	48 c1 ea 20          	shr    $0x20,%rdx
	49 b9 ba da ca ba 0a 	movabs $0xabacadaba,%r9
	00 00 00
	4c 8d 15 07 00 00 00 	lea    0x7(%rip),%r10        # 402fc3 <guest_msr+0xb3>
	4c 8d 1d 06 00 00 00 	lea    0x6(%rip),%r11        # 402fc9 <guest_msr+0xb9>
	0f 30                	wrmsr

I was only able to find one online reference stating that "A" gives
"eax and edx combined into a 64-bit integer"; other places don't mention
it at all.
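The fix below splits the 64-bit value by hand instead of relying on "A".
The same split, as a hedged stand-alone sketch (`msr_split()` is an
illustrative helper, not part of the selftests):

```c
#include <stdint.h>

/* Low half goes to EAX, high half to EDX -- what WRMSR actually consumes. */
static void msr_split(uint64_t val, uint32_t *eax, uint32_t *edx)
{
	*eax = (uint32_t)val;  /* EAX <- val[31:0] */
	*edx = val >> 32;      /* EDX <- val[63:32] */
}
```

For the buggy trace above, 0x60000000001 splits into EAX = 0x1 and
EDX = 0x600, so the intended value '1' lands in EAX as expected once the
operands are bound to "a" and "d" explicitly.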

Fixes: 3b23054cd3f5 ("KVM: selftests: Add x86-64 support for exception fixup")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 tools/testing/selftests/kvm/include/x86_64/processor.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 79dcf6be1b47..3d412c578e78 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -612,7 +612,7 @@ static inline uint8_t rdmsr_safe(uint32_t msr, uint64_t *val)
 
 static inline uint8_t wrmsr_safe(uint32_t msr, uint64_t val)
 {
-	return kvm_asm_safe("wrmsr", "A"(val), "c"(msr));
+	return kvm_asm_safe("wrmsr", "a"((u32)val), "d"(val >> 32), "c"(msr));
 }
 
 uint64_t vm_get_page_table_entry(struct kvm_vm *vm, struct kvm_vcpu *vcpu,
-- 
2.35.3



* [PATCH 3/3] KVM: selftests: Test Hyper-V invariant TSC control
  2022-07-13 15:05 [PATCH 0/3] KVM: x86: Hyper-V invariant TSC control feature Vitaly Kuznetsov
  2022-07-13 15:05 ` [PATCH 1/3] KVM: x86: Hyper-V invariant TSC control Vitaly Kuznetsov
  2022-07-13 15:05 ` [PATCH 2/3] KVM: selftests: Fix wrmsr_safe() Vitaly Kuznetsov
@ 2022-07-13 15:05 ` Vitaly Kuznetsov
  2022-07-14  9:26   ` Maxim Levitsky
  2022-07-14  9:24 ` [PATCH 0/3] KVM: x86: Hyper-V invariant TSC control feature Maxim Levitsky
  3 siblings, 1 reply; 13+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-13 15:05 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Wanpeng Li, Jim Mattson, Maxim Levitsky, linux-kernel

Add a test for the newly introduced Hyper-V invariant TSC control feature:
- HV_X64_MSR_TSC_INVARIANT_CONTROL is not available without the
  HV_ACCESS_TSC_INVARIANT CPUID bit set, and is available with it.
- BIT(0) of HV_X64_MSR_TSC_INVARIANT_CONTROL controls the filtering of
  the architectural invariant TSC (CPUID.80000007H:EDX[8]) bit.
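The visibility rule the test checks in guest_msr() -- the raw bit is
visible either in the legacy case (feature not exposed at all) or after
bit 0 has been written as 1 -- can be condensed into one predicate. The
function name and parameters below are illustrative, mirroring the
msr_data fields used by the test:

```c
#include <stdbool.h>
#include <stdint.h>

/* When should the guest observe CPUID.80000007H:EDX[8]? */
static bool expect_invtsc(bool msr_available, bool write, uint64_t write_val)
{
	/*
	 * Legacy (feature absent, MSR read is expected to #GP), or the
	 * feature was enabled by writing 1 to bit 0 of the PV MSR.
	 */
	return (!msr_available && !write) || (write && write_val == 1);
}
```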

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 .../selftests/kvm/x86_64/hyperv_features.c    | 73 ++++++++++++++++++-
 1 file changed, 69 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_features.c b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
index c05acd78548f..9599eecdedff 100644
--- a/tools/testing/selftests/kvm/x86_64/hyperv_features.c
+++ b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
@@ -15,6 +15,9 @@
 
 #define LINUX_OS_ID ((u64)0x8100 << 48)
 
+/* CPUID.80000007H:EDX */
+#define X86_FEATURE_INVTSC (1 << 8)
+
 static inline uint8_t hypercall(u64 control, vm_vaddr_t input_address,
 				vm_vaddr_t output_address, uint64_t *hv_status)
 {
@@ -60,6 +63,24 @@ static void guest_msr(struct msr_data *msr)
 		GUEST_ASSERT_2(!vector, msr->idx, vector);
 	else
 		GUEST_ASSERT_2(vector == GP_VECTOR, msr->idx, vector);
+
+	/* Invariant TSC bit appears when TSC invariant control MSR is written to */
+	if (msr->idx == HV_X64_MSR_TSC_INVARIANT_CONTROL) {
+		u32 eax = 0x80000007, ebx = 0, ecx = 0, edx = 0;
+
+		cpuid(&eax, &ebx, &ecx, &edx);
+
+		/*
+		 * TSC invariant bit is present without the feature (legacy) or
+		 * when the feature is present and enabled.
+		 */
+		if ((!msr->available && !msr->write) || (msr->write && msr->write_val == 1))
+			GUEST_ASSERT(edx & X86_FEATURE_INVTSC);
+		else
+			GUEST_ASSERT(!(edx & X86_FEATURE_INVTSC));
+	}
+
+
 	GUEST_DONE();
 }
 
@@ -105,6 +126,15 @@ static void hv_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
 	vcpu_set_cpuid(vcpu, cpuid);
 }
 
+static bool guest_has_invtsc(void)
+{
+	struct kvm_cpuid_entry2 *cpuid;
+
+	cpuid = kvm_get_supported_cpuid_entry(0x80000007);
+
+	return cpuid->edx & X86_FEATURE_INVTSC;
+}
+
 static void guest_test_msrs_access(void)
 {
 	struct kvm_vcpu *vcpu;
@@ -124,6 +154,7 @@ static void guest_test_msrs_access(void)
 	struct kvm_cpuid2 *best;
 	vm_vaddr_t msr_gva;
 	struct msr_data *msr;
+	bool has_invtsc = guest_has_invtsc();
 
 	while (true) {
 		vm = vm_create_with_one_vcpu(&vcpu, guest_msr);
@@ -136,8 +167,7 @@ static void guest_test_msrs_access(void)
 		vcpu_enable_cap(vcpu, KVM_CAP_HYPERV_ENFORCE_CPUID, 1);
 
 		vcpu_set_hv_cpuid(vcpu);
-
-		best = kvm_get_supported_hv_cpuid();
+		best = vcpu_get_cpuid(vcpu);
 
 		vm_init_descriptor_tables(vm);
 		vcpu_init_descriptor_tables(vcpu);
@@ -431,6 +461,42 @@ static void guest_test_msrs_access(void)
 			break;
 
 		case 44:
+			/* MSR is not available when CPUID feature bit is unset */
+			if (!has_invtsc)
+				continue;
+			msr->idx = HV_X64_MSR_TSC_INVARIANT_CONTROL;
+			msr->write = 0;
+			msr->available = 0;
+			break;
+		case 45:
+			/* MSR is available when CPUID feature bit is set */
+			if (!has_invtsc)
+				continue;
+			feat.eax |= HV_ACCESS_TSC_INVARIANT;
+			msr->idx = HV_X64_MSR_TSC_INVARIANT_CONTROL;
+			msr->write = 0;
+			msr->available = 1;
+			break;
+		case 46:
+			/* Writing bits other than 0 is forbidden */
+			if (!has_invtsc)
+				continue;
+			msr->idx = HV_X64_MSR_TSC_INVARIANT_CONTROL;
+			msr->write = 1;
+			msr->write_val = 0xdeadbeef;
+			msr->available = 0;
+			break;
+		case 47:
+			/* Setting bit 0 enables the feature */
+			if (!has_invtsc)
+				continue;
+			msr->idx = HV_X64_MSR_TSC_INVARIANT_CONTROL;
+			msr->write = 1;
+			msr->write_val = 1;
+			msr->available = 1;
+			break;
+
+		default:
 			kvm_vm_free(vm);
 			return;
 		}
@@ -502,8 +568,7 @@ static void guest_test_hcalls_access(void)
 		vcpu_enable_cap(vcpu, KVM_CAP_HYPERV_ENFORCE_CPUID, 1);
 
 		vcpu_set_hv_cpuid(vcpu);
-
-		best = kvm_get_supported_hv_cpuid();
+		best = vcpu_get_cpuid(vcpu);
 
 		run = vcpu->run;
 
-- 
2.35.3



* Re: [PATCH 2/3] KVM: selftests: Fix wrmsr_safe()
  2022-07-13 15:05 ` [PATCH 2/3] KVM: selftests: Fix wrmsr_safe() Vitaly Kuznetsov
@ 2022-07-13 15:41   ` Sean Christopherson
  2022-07-14  0:52     ` Sean Christopherson
  0 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2022-07-13 15:41 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-kernel

On Wed, Jul 13, 2022, Vitaly Kuznetsov wrote:
> It seems to be a misconception that "A" places an u64 operand to
> EAX:EDX, at least with GCC11.

It's not a misconception, it's just that the "A" trick only works for 32-bit
binaries.  For 64-bit, the 64-bit integer fits into "rax" without needing to spill
into "rdx".

I swear I had fixed this, but apparently I had only done that locally and never
pushed/posted the changes :-/

> While writing a new test, I've noticed that wrmsr_safe() tries putting
> garbage to the upper bits of the MSR, e.g.:
> 
>   kvm_exit:             reason MSR_WRITE rip 0x402919 info 0 0
>   kvm_msr:              msr_write 40000118 = 0x60000000001 (#GP)
> ...
> when it was supposed to write '1'. Apparently, "A" works the same as
> "a" and not as EAX/EDX. Here's the relevant disassembled part:
> 
> With "A":
> 
> 	48 8b 43 08          	mov    0x8(%rbx),%rax
> 	49 b9 ba da ca ba 0a 	movabs $0xabacadaba,%r9
> 	00 00 00
> 	4c 8d 15 07 00 00 00 	lea    0x7(%rip),%r10        # 402f44 <guest_msr+0x34>
> 	4c 8d 1d 06 00 00 00 	lea    0x6(%rip),%r11        # 402f4a <guest_msr+0x3a>
> 	0f 30                	wrmsr
> 
> With "a"/"d":
> 
> 	48 8b 43 08          	mov    0x8(%rbx),%rax
> 	48 89 c2             	mov    %rax,%rdx
> 	48 c1 ea 20          	shr    $0x20,%rdx
> 	49 b9 ba da ca ba 0a 	movabs $0xabacadaba,%r9
> 	00 00 00
> 	4c 8d 15 07 00 00 00 	lea    0x7(%rip),%r10        # 402fc3 <guest_msr+0xb3>
> 	4c 8d 1d 06 00 00 00 	lea    0x6(%rip),%r11        # 402fc9 <guest_msr+0xb9>
> 	0f 30                	wrmsr
> 
> I was only able to find one online reference that "A" gives "eax and
> edx combined into a 64-bit integer", other places don't mention it at
> all.
> 
> Fixes: 3b23054cd3f5 ("KVM: selftests: Add x86-64 support for exception fixup")
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  tools/testing/selftests/kvm/include/x86_64/processor.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
> index 79dcf6be1b47..3d412c578e78 100644
> --- a/tools/testing/selftests/kvm/include/x86_64/processor.h
> +++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
> @@ -612,7 +612,7 @@ static inline uint8_t rdmsr_safe(uint32_t msr, uint64_t *val)
>  
>  static inline uint8_t wrmsr_safe(uint32_t msr, uint64_t val)
>  {
> -	return kvm_asm_safe("wrmsr", "A"(val), "c"(msr));
> +	return kvm_asm_safe("wrmsr", "a"((u32)val), "d"(val >> 32), "c"(msr));
>  }
>  
>  uint64_t vm_get_page_table_entry(struct kvm_vm *vm, struct kvm_vcpu *vcpu,
> -- 
> 2.35.3
> 


* Re: [PATCH 2/3] KVM: selftests: Fix wrmsr_safe()
  2022-07-13 15:41   ` Sean Christopherson
@ 2022-07-14  0:52     ` Sean Christopherson
  0 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2022-07-14  0:52 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-kernel

On Wed, Jul 13, 2022, Sean Christopherson wrote:
> On Wed, Jul 13, 2022, Vitaly Kuznetsov wrote:
> > It seems to be a misconception that "A" places an u64 operand to
> > EAX:EDX, at least with GCC11.
> 
> It's not a misconception, it's just that the "A" trick only works for 32-bit
> binaries.  For 64-bit, the 64-bit integer fits into "rax" without needing to spill
> into "rdx".
> 
> I swear I had fixed this, but apparently I had only done that locally and never
> pushed/posted the changes :-/

Ugh, I have a feeling I fixed RDMSR and then forgot about WRMSR.

> > While writing a new test, I've noticed that wrmsr_safe() tries putting
> > garbage to the upper bits of the MSR, e.g.:
> > 
> >   kvm_exit:             reason MSR_WRITE rip 0x402919 info 0 0
> >   kvm_msr:              msr_write 40000118 = 0x60000000001 (#GP)
> > ...
> > when it was supposed to write '1'. Apparently, "A" works the same as
> > "a" and not as EAX/EDX. Here's the relevant disassembled part:
> > 
> > With "A":
> > 
> > 	48 8b 43 08          	mov    0x8(%rbx),%rax
> > 	49 b9 ba da ca ba 0a 	movabs $0xabacadaba,%r9
> > 	00 00 00
> > 	4c 8d 15 07 00 00 00 	lea    0x7(%rip),%r10        # 402f44 <guest_msr+0x34>
> > 	4c 8d 1d 06 00 00 00 	lea    0x6(%rip),%r11        # 402f4a <guest_msr+0x3a>
> > 	0f 30                	wrmsr
> > 
> > With "a"/"d":
> > 
> > 	48 8b 43 08          	mov    0x8(%rbx),%rax
> > 	48 89 c2             	mov    %rax,%rdx
> > 	48 c1 ea 20          	shr    $0x20,%rdx

Huh.  This is wrong.  RAX is loaded with the full 64-bit value.  It doesn't matter
for WRMSR because WRMSR only consumes EAX, but it's wrong.  I can't for the life
of me figure out why casting to a u32 doesn't force the compiler to truncate the
value.  Truncation in other places most definitely works, and the compiler loads
only EAX and EDX when using a hardcoded value, e.g. -1ull, so the input isn't
messed up.  There's no 32-bit loads of EAX, so no implicit truncation of RAX[63:32].

gcc-{7,9,11} and clang-13 generate the same code, so either it's a really
longstanding bug, or maybe some funky undocumented behavior?

If I use

	return kvm_asm_safe("wrmsr", "a"(val & -1u), "d"(val >> 32), "c"(msr));

then the result is as expected:

  48 8b 53 08             mov    0x8(%rbx),%rdx
  89 d0                   mov    %edx,%eax
  48 c1 ea 20             shr    $0x20,%rdx

I'll post a v2 of just this patch on your behalf, I also reworded the changelog
to include the gcc documentation that talks about the behavior of "A".
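At the C level the two spellings truncate identically -- the difference
observed above is only in which register bits the compiler bothers to
clear, and WRMSR ignores RAX[63:32] anyway. A quick stand-alone check
(helper names are illustrative):

```c
#include <stdint.h>

/* Both forms yield val[31:0]; -1u promotes to 0x00000000ffffffff here. */
static uint32_t low_cast(uint64_t v) { return (uint32_t)v; }
static uint32_t low_mask(uint64_t v) { return v & -1u; }
```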


* Re: [PATCH 0/3] KVM: x86: Hyper-V invariant TSC control feature
  2022-07-13 15:05 [PATCH 0/3] KVM: x86: Hyper-V invariant TSC control feature Vitaly Kuznetsov
                   ` (2 preceding siblings ...)
  2022-07-13 15:05 ` [PATCH 3/3] KVM: selftests: Test Hyper-V invariant TSC control Vitaly Kuznetsov
@ 2022-07-14  9:24 ` Maxim Levitsky
  2022-07-14 15:02   ` Vitaly Kuznetsov
  3 siblings, 1 reply; 13+ messages in thread
From: Maxim Levitsky @ 2022-07-14  9:24 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Wanpeng Li, Jim Mattson, linux-kernel

On Wed, 2022-07-13 at 17:05 +0200, Vitaly Kuznetsov wrote:
> Normally, genuine Hyper-V doesn't expose architectural invariant TSC
> (CPUID.80000007H:EDX[8]) to its guests by default. A special PV MSR
> (HV_X64_MSR_TSC_INVARIANT_CONTROL, 0x40000118) and corresponding CPUID
> feature bit (CPUID.0x40000003.EAX[15]) were introduced. When bit 0 of the
> PV MSR is set, invariant TSC bit starts to show up in CPUID. When the 
> feature is exposed to Hyper-V guests, reenlightenment becomes unneeded.

If I understood the feature correctly from the code, it allows Hyper-V, or
in this case KVM acting as Hyper-V, to avoid unconditionally exposing the
invtsc bit in CPUID, and instead let the guest know that it can opt in to
it, by giving the guest another CPUID bit to indicate this ability and an
MSR which the guest uses to opt in.

Are there known use cases for this? Are there guests which won't opt in?

> 
> Note: strictly speaking, KVM doesn't have to have the feature as exposing
> raw invariant TSC bit (CPUID.80000007H:EDX[8]) also seems to work for
> modern Windows versions. The feature is, however, tiny and straitforward
> and gives additional flexibility so why not.

This means that KVM can also just unconditionally expose the invtsc bit
to the guest, and the guest will still use it.


Nitpick: it might be worth documenting this a bit better somewhere, as I
tried to do in this mail.


Best regards,
	Maxim Levitsky

> 
> Vitaly Kuznetsov (3):
>   KVM: x86: Hyper-V invariant TSC control
>   KVM: selftests: Fix wrmsr_safe()
>   KVM: selftests: Test Hyper-V invariant TSC control
> 
>  arch/x86/include/asm/kvm_host.h               |  1 +
>  arch/x86/kvm/cpuid.c                          |  7 ++
>  arch/x86/kvm/hyperv.c                         | 19 +++++
>  arch/x86/kvm/hyperv.h                         | 15 ++++
>  arch/x86/kvm/x86.c                            |  4 +-
>  .../selftests/kvm/include/x86_64/processor.h  |  2 +-
>  .../selftests/kvm/x86_64/hyperv_features.c    | 73 ++++++++++++++++++-
>  7 files changed, 115 insertions(+), 6 deletions(-)
> 




* Re: [PATCH 1/3] KVM: x86: Hyper-V invariant TSC control
  2022-07-13 15:05 ` [PATCH 1/3] KVM: x86: Hyper-V invariant TSC control Vitaly Kuznetsov
@ 2022-07-14  9:25   ` Maxim Levitsky
  2022-07-14 15:05     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 13+ messages in thread
From: Maxim Levitsky @ 2022-07-14  9:25 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Wanpeng Li, Jim Mattson, linux-kernel

On Wed, 2022-07-13 at 17:05 +0200, Vitaly Kuznetsov wrote:
> Normally, genuine Hyper-V doesn't expose architectural invariant TSC
> (CPUID.80000007H:EDX[8]) to its guests by default. A special PV MSR
> (HV_X64_MSR_TSC_INVARIANT_CONTROL, 0x40000118) and corresponding CPUID
> feature bit (CPUID.0x40000003.EAX[15]) were introduced. When bit 0 of the
> PV MSR is set, invariant TSC bit starts to show up in CPUID. When the
> feature is exposed to Hyper-V guests, reenlightenment becomes unneeded.
> 
> Add the feature to KVM. Keep CPUID output intact when the feature
> wasn't exposed to L1 and implement the required logic for hiding
> invariant TSC when the feature was exposed and invariant TSC control
> MSR wasn't written to. Copy genuine Hyper-V behavior and forbid to
> disable the feature once it was enabled.
> 
> For the reference, for linux guests, support for the feature was added
> in commit dce7cd62754b ("x86/hyperv: Allow guests to enable InvariantTSC").
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/cpuid.c            |  7 +++++++
>  arch/x86/kvm/hyperv.c           | 19 +++++++++++++++++++
>  arch/x86/kvm/hyperv.h           | 15 +++++++++++++++
>  arch/x86/kvm/x86.c              |  4 +++-
>  5 files changed, 45 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index de5a149d0971..88553f0b524c 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1022,6 +1022,7 @@ struct kvm_hv {
>         u64 hv_reenlightenment_control;
>         u64 hv_tsc_emulation_control;
>         u64 hv_tsc_emulation_status;
> +       u64 hv_invtsc;
>  
>         /* How many vCPUs have VP index != vCPU index */
>         atomic_t num_mismatched_vp_indexes;
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index d47222ab8e6e..788df2eb1ec4 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -1404,6 +1404,13 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>                             (data & TSX_CTRL_CPUID_CLEAR))
>                                 *ebx &= ~(F(RTM) | F(HLE));
>                 }

Tiny nitpick: Maybe add a bit longer comment about this thing, like that guest needs to opt-in
to see invtsc when it has the HV feature exposed to it, 
I don't have a strong preference about this though.

> +               /*
> +                * Filter out invariant TSC (CPUID.80000007H:EDX[8]) for Hyper-V
> +                * guests if needed.
> +                */
> +               if (function == 0x80000007 && kvm_hv_invtsc_filtered(vcpu))
> +                       *edx &= ~(1 << 8);

> +
>         } else {
>                 *eax = *ebx = *ecx = *edx = 0;
>                 /*
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index e2e95a6fccfd..0d8e6526a839 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -991,6 +991,7 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
>         case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
>         case HV_X64_MSR_TSC_EMULATION_CONTROL:
>         case HV_X64_MSR_TSC_EMULATION_STATUS:
> +       case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>         case HV_X64_MSR_SYNDBG_OPTIONS:
>         case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
>                 r = true;
> @@ -1275,6 +1276,9 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv *hv_vcpu, u32 msr)
>         case HV_X64_MSR_TSC_EMULATION_STATUS:
>                 return hv_vcpu->cpuid_cache.features_eax &
>                         HV_ACCESS_REENLIGHTENMENT;
> +       case HV_X64_MSR_TSC_INVARIANT_CONTROL:
> +               return hv_vcpu->cpuid_cache.features_eax &
> +                       HV_ACCESS_TSC_INVARIANT;
>         case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
>         case HV_X64_MSR_CRASH_CTL:
>                 return hv_vcpu->cpuid_cache.features_edx &
> @@ -1402,6 +1406,17 @@ static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
>                 if (!host)
>                         return 1;
>                 break;
> +       case HV_X64_MSR_TSC_INVARIANT_CONTROL:
> +               /* Only bit 0 is supported */
> +               if (data & ~BIT_ULL(0))
> +                       return 1;
> +
> +               /* The feature can't be disabled from the guest */
> +               if (!host && hv->hv_invtsc && !data)
> +                       return 1;

The unit test in patch 3 claims, that this msr should #GP when 'invtsc'
aka bit 8 of edx of leaf 0x80000007 is not enabled by the hypervisor in the guest cpuid.

Yet, looking at the code I think that this msr read/write access only depends on
the 'new' cpuid bit, aka the HV_ACCESS_TSC_INVARIANT, thus this msr will 'work'
but do nothing if 'invtsc' is not exposed (it will then not turn it on).



> +
> +               hv->hv_invtsc = data;
> +               break;
>         case HV_X64_MSR_SYNDBG_OPTIONS:
>         case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
>                 return syndbg_set_msr(vcpu, msr, data, host);
> @@ -1577,6 +1592,9 @@ static int kvm_hv_get_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
>         case HV_X64_MSR_TSC_EMULATION_STATUS:
>                 data = hv->hv_tsc_emulation_status;
>                 break;
> +       case HV_X64_MSR_TSC_INVARIANT_CONTROL:
> +               data = hv->hv_invtsc;
> +               break;
>         case HV_X64_MSR_SYNDBG_OPTIONS:
>         case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
>                 return syndbg_get_msr(vcpu, msr, pdata, host);
> @@ -2497,6 +2515,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>                         ent->eax |= HV_MSR_REFERENCE_TSC_AVAILABLE;
>                         ent->eax |= HV_ACCESS_FREQUENCY_MSRS;
>                         ent->eax |= HV_ACCESS_REENLIGHTENMENT;
> +                       ent->eax |= HV_ACCESS_TSC_INVARIANT;
>  
>                         ent->ebx |= HV_POST_MESSAGES;
>                         ent->ebx |= HV_SIGNAL_EVENTS;
> diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
> index da2737f2a956..1a6316ab55eb 100644
> --- a/arch/x86/kvm/hyperv.h
> +++ b/arch/x86/kvm/hyperv.h
> @@ -133,6 +133,21 @@ static inline bool kvm_hv_has_stimer_pending(struct kvm_vcpu *vcpu)
>                              HV_SYNIC_STIMER_COUNT);
>  }
>  
> +/*
> + * With HV_ACCESS_TSC_INVARIANT feature, invariant TSC (CPUID.80000007H:EDX[8])
> + * is only observed after HV_X64_MSR_TSC_INVARIANT_CONTROL was written to.
> + */
> +static inline bool kvm_hv_invtsc_filtered(struct kvm_vcpu *vcpu)
> +{
> +       struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
> +       struct kvm_hv *hv = to_kvm_hv(vcpu->kvm);
> +
> +       if (hv_vcpu && hv_vcpu->cpuid_cache.features_eax & HV_ACCESS_TSC_INVARIANT)
> +               return !hv->hv_invtsc;
> +
> +       return false;
> +}
> +
>  void kvm_hv_process_stimers(struct kvm_vcpu *vcpu);
>  
>  void kvm_hv_setup_tsc_page(struct kvm *kvm,
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 567d13405445..322e0a544823 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1466,7 +1466,7 @@ static const u32 emulated_msrs_all[] = {
>         HV_X64_MSR_STIMER0_CONFIG,
>         HV_X64_MSR_VP_ASSIST_PAGE,
>         HV_X64_MSR_REENLIGHTENMENT_CONTROL, HV_X64_MSR_TSC_EMULATION_CONTROL,
> -       HV_X64_MSR_TSC_EMULATION_STATUS,
> +       HV_X64_MSR_TSC_EMULATION_STATUS, HV_X64_MSR_TSC_INVARIANT_CONTROL,
>         HV_X64_MSR_SYNDBG_OPTIONS,
>         HV_X64_MSR_SYNDBG_CONTROL, HV_X64_MSR_SYNDBG_STATUS,
>         HV_X64_MSR_SYNDBG_SEND_BUFFER, HV_X64_MSR_SYNDBG_RECV_BUFFER,
> @@ -3769,6 +3769,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>         case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
>         case HV_X64_MSR_TSC_EMULATION_CONTROL:
>         case HV_X64_MSR_TSC_EMULATION_STATUS:
> +       case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>                 return kvm_hv_set_msr_common(vcpu, msr, data,
>                                              msr_info->host_initiated);
>         case MSR_IA32_BBL_CR_CTL3:
> @@ -4139,6 +4140,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>         case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
>         case HV_X64_MSR_TSC_EMULATION_CONTROL:
>         case HV_X64_MSR_TSC_EMULATION_STATUS:
> +       case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>                 return kvm_hv_get_msr_common(vcpu,
>                                              msr_info->index, &msr_info->data,
>                                              msr_info->host_initiated);


Beware that this new msr will also need to be migrated by qemu
when the feature is added to qemu -
I had my own share of fun with AMD's TSC ratio msr when I implemented it
(had to fix it twice in qemu :( ...)

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] KVM: selftests: Test Hyper-V invariant TSC control
  2022-07-13 15:05 ` [PATCH 3/3] KVM: selftests: Test Hyper-V invariant TSC control Vitaly Kuznetsov
@ 2022-07-14  9:26   ` Maxim Levitsky
  2022-07-14 14:57     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 13+ messages in thread
From: Maxim Levitsky @ 2022-07-14  9:26 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Wanpeng Li, Jim Mattson, linux-kernel

On Wed, 2022-07-13 at 17:05 +0200, Vitaly Kuznetsov wrote:
> Add a test for the newly introduced Hyper-V invariant TSC control feature:
> - HV_X64_MSR_TSC_INVARIANT_CONTROL is not available without
>  HV_ACCESS_TSC_INVARIANT CPUID bit set and available with it.
> - BIT(0) of HV_X64_MSR_TSC_INVARIANT_CONTROL controls the filtering of
> architectural invariant TSC (CPUID.80000007H:EDX[8]) bit.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  .../selftests/kvm/x86_64/hyperv_features.c    | 73 ++++++++++++++++++-
>  1 file changed, 69 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_features.c b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
> index c05acd78548f..9599eecdedff 100644
> --- a/tools/testing/selftests/kvm/x86_64/hyperv_features.c
> +++ b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
> @@ -15,6 +15,9 @@
>  
>  #define LINUX_OS_ID ((u64)0x8100 << 48)
>  
> +/* CPUID.80000007H:EDX */
> +#define X86_FEATURE_INVTSC (1 << 8)
> +
>  static inline uint8_t hypercall(u64 control, vm_vaddr_t input_address,
>                                 vm_vaddr_t output_address, uint64_t *hv_status)
>  {
> @@ -60,6 +63,24 @@ static void guest_msr(struct msr_data *msr)
>                 GUEST_ASSERT_2(!vector, msr->idx, vector);
>         else
>                 GUEST_ASSERT_2(vector == GP_VECTOR, msr->idx, vector);
> +
> +       /* Invariant TSC bit appears when TSC invariant control MSR is written to */
> +       if (msr->idx == HV_X64_MSR_TSC_INVARIANT_CONTROL) {
> +               u32 eax = 0x80000007, ebx = 0, ecx = 0, edx = 0;
> +
> +               cpuid(&eax, &ebx, &ecx, &edx);
> +
> +               /*
> +                * TSC invariant bit is present without the feature (legacy) or
> +                * when the feature is present and enabled.
> +                */
> +               if ((!msr->available && !msr->write) || (msr->write && msr->write_val == 1))
> +                       GUEST_ASSERT(edx & X86_FEATURE_INVTSC);
> +               else
> +                       GUEST_ASSERT(!(edx & X86_FEATURE_INVTSC));
> +       }
> +
> +
>         GUEST_DONE();
>  }
>  
> @@ -105,6 +126,15 @@ static void hv_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>         vcpu_set_cpuid(vcpu, cpuid);
>  }
>  
> +static bool guest_has_invtsc(void)
> +{
> +       struct kvm_cpuid_entry2 *cpuid;
> +
> +       cpuid = kvm_get_supported_cpuid_entry(0x80000007);
> +
> +       return cpuid->edx & X86_FEATURE_INVTSC;
> +}
> +
>  static void guest_test_msrs_access(void)
>  {
>         struct kvm_vcpu *vcpu;
> @@ -124,6 +154,7 @@ static void guest_test_msrs_access(void)
>         struct kvm_cpuid2 *best;
>         vm_vaddr_t msr_gva;
>         struct msr_data *msr;
> +       bool has_invtsc = guest_has_invtsc();
>  
>         while (true) {
>                 vm = vm_create_with_one_vcpu(&vcpu, guest_msr);
> @@ -136,8 +167,7 @@ static void guest_test_msrs_access(void)
>                 vcpu_enable_cap(vcpu, KVM_CAP_HYPERV_ENFORCE_CPUID, 1);
>  
>                 vcpu_set_hv_cpuid(vcpu);
> -
> -               best = kvm_get_supported_hv_cpuid();
> +               best = vcpu_get_cpuid(vcpu);
>  
>                 vm_init_descriptor_tables(vm);
>                 vcpu_init_descriptor_tables(vcpu);
> @@ -431,6 +461,42 @@ static void guest_test_msrs_access(void)
>                         break;
>  
>                 case 44:
> +                       /* MSR is not available when CPUID feature bit is unset */
> +                       if (!has_invtsc)
> +                               continue;
> +                       msr->idx = HV_X64_MSR_TSC_INVARIANT_CONTROL;
> +                       msr->write = 0;
> +                       msr->available = 0;
> +                       break;
> +               case 45:
> +                       /* MSR is available when CPUID feature bit is set */
> +                       if (!has_invtsc)
> +                               continue;
> +                       feat.eax |= HV_ACCESS_TSC_INVARIANT;
> +                       msr->idx = HV_X64_MSR_TSC_INVARIANT_CONTROL;
> +                       msr->write = 0;
> +                       msr->available = 1;
> +                       break;
> +               case 46:
> +                       /* Writing bits other than 0 is forbidden */
> +                       if (!has_invtsc)
> +                               continue;
> +                       msr->idx = HV_X64_MSR_TSC_INVARIANT_CONTROL;
> +                       msr->write = 1;
> +                       msr->write_val = 0xdeadbeef;
> +                       msr->available = 0;
> +                       break;
> +               case 47:
> +                       /* Setting bit 0 enables the feature */
> +                       if (!has_invtsc)
> +                               continue;
> +                       msr->idx = HV_X64_MSR_TSC_INVARIANT_CONTROL;
> +                       msr->write = 1;
> +                       msr->write_val = 1;
> +                       msr->available = 1;
> +                       break;
> +
> +               default:
>                         kvm_vm_free(vm);
>                         return;
>                 }
> @@ -502,8 +568,7 @@ static void guest_test_hcalls_access(void)
>                 vcpu_enable_cap(vcpu, KVM_CAP_HYPERV_ENFORCE_CPUID, 1);
>  
>                 vcpu_set_hv_cpuid(vcpu);
> -
> -               best = kvm_get_supported_hv_cpuid();
> +               best = vcpu_get_cpuid(vcpu);
>  
>                 run = vcpu->run;
>  

Tiny unrelated nitpick: 'msr->available' is misleading; it is more like
'msr->should_not_gp' or something - might be worth refactoring in the future.


Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky



* Re: [PATCH 3/3] KVM: selftests: Test Hyper-V invariant TSC control
  2022-07-14  9:26   ` Maxim Levitsky
@ 2022-07-14 14:57     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 13+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14 14:57 UTC (permalink / raw)
  To: Maxim Levitsky, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Wanpeng Li, Jim Mattson, linux-kernel

Maxim Levitsky <mlevitsk@redhat.com> writes:

> On Wed, 2022-07-13 at 17:05 +0200, Vitaly Kuznetsov wrote:
>> Add a test for the newly introduced Hyper-V invariant TSC control feature:
>> - HV_X64_MSR_TSC_INVARIANT_CONTROL is not available without
>>  HV_ACCESS_TSC_INVARIANT CPUID bit set and available with it.
>> - BIT(0) of HV_X64_MSR_TSC_INVARIANT_CONTROL controls the filtering of
>> architectural invariant TSC (CPUID.80000007H:EDX[8]) bit.
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  .../selftests/kvm/x86_64/hyperv_features.c    | 73 ++++++++++++++++++-
>>  1 file changed, 69 insertions(+), 4 deletions(-)
>> 
>> diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_features.c b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
>> index c05acd78548f..9599eecdedff 100644
>> --- a/tools/testing/selftests/kvm/x86_64/hyperv_features.c
>> +++ b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
>> @@ -15,6 +15,9 @@
>>  
>>  #define LINUX_OS_ID ((u64)0x8100 << 48)
>>  
>> +/* CPUID.80000007H:EDX */
>> +#define X86_FEATURE_INVTSC (1 << 8)
>> +
>>  static inline uint8_t hypercall(u64 control, vm_vaddr_t input_address,
>>                                 vm_vaddr_t output_address, uint64_t *hv_status)
>>  {
>> @@ -60,6 +63,24 @@ static void guest_msr(struct msr_data *msr)
>>                 GUEST_ASSERT_2(!vector, msr->idx, vector);
>>         else
>>                 GUEST_ASSERT_2(vector == GP_VECTOR, msr->idx, vector);
>> +
>> +       /* Invariant TSC bit appears when TSC invariant control MSR is written to */
>> +       if (msr->idx == HV_X64_MSR_TSC_INVARIANT_CONTROL) {
>> +               u32 eax = 0x80000007, ebx = 0, ecx = 0, edx = 0;
>> +
>> +               cpuid(&eax, &ebx, &ecx, &edx);
>> +
>> +               /*
>> +                * TSC invariant bit is present without the feature (legacy) or
>> +                * when the feature is present and enabled.
>> +                */
>> +               if ((!msr->available && !msr->write) || (msr->write && msr->write_val == 1))
>> +                       GUEST_ASSERT(edx & X86_FEATURE_INVTSC);
>> +               else
>> +                       GUEST_ASSERT(!(edx & X86_FEATURE_INVTSC));
>> +       }
>> +
>> +
>>         GUEST_DONE();
>>  }
>>  
>> @@ -105,6 +126,15 @@ static void hv_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>>         vcpu_set_cpuid(vcpu, cpuid);
>>  }
>>  
>> +static bool guest_has_invtsc(void)
>> +{
>> +       struct kvm_cpuid_entry2 *cpuid;
>> +
>> +       cpuid = kvm_get_supported_cpuid_entry(0x80000007);
>> +
>> +       return cpuid->edx & X86_FEATURE_INVTSC;
>> +}
>> +
>>  static void guest_test_msrs_access(void)
>>  {
>>         struct kvm_vcpu *vcpu;
>> @@ -124,6 +154,7 @@ static void guest_test_msrs_access(void)
>>         struct kvm_cpuid2 *best;
>>         vm_vaddr_t msr_gva;
>>         struct msr_data *msr;
>> +       bool has_invtsc = guest_has_invtsc();
>>  
>>         while (true) {
>>                 vm = vm_create_with_one_vcpu(&vcpu, guest_msr);
>> @@ -136,8 +167,7 @@ static void guest_test_msrs_access(void)
>>                 vcpu_enable_cap(vcpu, KVM_CAP_HYPERV_ENFORCE_CPUID, 1);
>>  
>>                 vcpu_set_hv_cpuid(vcpu);
>> -
>> -               best = kvm_get_supported_hv_cpuid();
>> +               best = vcpu_get_cpuid(vcpu);
>>  
>>                 vm_init_descriptor_tables(vm);
>>                 vcpu_init_descriptor_tables(vcpu);
>> @@ -431,6 +461,42 @@ static void guest_test_msrs_access(void)
>>                         break;
>>  
>>                 case 44:
>> +                       /* MSR is not available when CPUID feature bit is unset */
>> +                       if (!has_invtsc)
>> +                               continue;
>> +                       msr->idx = HV_X64_MSR_TSC_INVARIANT_CONTROL;
>> +                       msr->write = 0;
>> +                       msr->available = 0;
>> +                       break;
>> +               case 45:
> >> +                       /* MSR is available when CPUID feature bit is set */
>> +                       if (!has_invtsc)
>> +                               continue;
>> +                       feat.eax |= HV_ACCESS_TSC_INVARIANT;
>> +                       msr->idx = HV_X64_MSR_TSC_INVARIANT_CONTROL;
>> +                       msr->write = 0;
>> +                       msr->available = 1;
>> +                       break;
>> +               case 46:
>> +                       /* Writing bits other than 0 is forbidden */
>> +                       if (!has_invtsc)
>> +                               continue;
>> +                       msr->idx = HV_X64_MSR_TSC_INVARIANT_CONTROL;
>> +                       msr->write = 1;
>> +                       msr->write_val = 0xdeadbeef;
>> +                       msr->available = 0;
>> +                       break;
>> +               case 47:
>> +                       /* Setting bit 0 enables the feature */
>> +                       if (!has_invtsc)
>> +                               continue;
>> +                       msr->idx = HV_X64_MSR_TSC_INVARIANT_CONTROL;
>> +                       msr->write = 1;
>> +                       msr->write_val = 1;
>> +                       msr->available = 1;
>> +                       break;
>> +
>> +               default:
>>                         kvm_vm_free(vm);
>>                         return;
>>                 }
>> @@ -502,8 +568,7 @@ static void guest_test_hcalls_access(void)
>>                 vcpu_enable_cap(vcpu, KVM_CAP_HYPERV_ENFORCE_CPUID, 1);
>>  
>>                 vcpu_set_hv_cpuid(vcpu);
>> -
>> -               best = kvm_get_supported_hv_cpuid();
>> +               best = vcpu_get_cpuid(vcpu);
>>  
>>                 run = vcpu->run;
>>  
>
> Tiny unrelated nitpick: 'msr->available' is misleading; it is more like
> 'msr->should_not_gp' or something - might be worth refactoring in the future.
>

Indeed, sounds much better. I'll add a renaming patch when doing v2.

>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Thanks!

>
> Best regards,
> 	Maxim Levitsky
>

-- 
Vitaly



* Re: [PATCH 0/3] KVM: x86: Hyper-V invariant TSC control feature
  2022-07-14  9:24 ` [PATCH 0/3] KVM: x86: Hyper-V invariant TSC control feature Maxim Levitsky
@ 2022-07-14 15:02   ` Vitaly Kuznetsov
  2022-07-18 15:07     ` Michael Kelley (LINUX)
  0 siblings, 1 reply; 13+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14 15:02 UTC (permalink / raw)
  To: Maxim Levitsky, kvm
  Cc: Wanpeng Li, Jim Mattson, linux-kernel, Paolo Bonzini,
	Sean Christopherson

Maxim Levitsky <mlevitsk@redhat.com> writes:

> On Wed, 2022-07-13 at 17:05 +0200, Vitaly Kuznetsov wrote:
>> Normally, genuine Hyper-V doesn't expose architectural invariant TSC
>> (CPUID.80000007H:EDX[8]) to its guests by default. A special PV MSR
>> (HV_X64_MSR_TSC_INVARIANT_CONTROL, 0x40000118) and corresponding CPUID
>> feature bit (CPUID.0x40000003.EAX[15]) were introduced. When bit 0 of the
>> PV MSR is set, invariant TSC bit starts to show up in CPUID. When the 
>> feature is exposed to Hyper-V guests, reenlightenment becomes unneeded.
>
> If I understood the feature correctly from the code, it allows the HyperV, or in this
> case KVM acting as HyperV, to avoid unconditionally exposing the invltsc bit
> in CPUID, but rather let the guest know that it can opt-in into this,
> by giving the guest another CPUID bit to indicate this ability
> and a MSR which the guest uses to opt-in.
>
> Are there known use cases of this, are there guests which won't opt-in?
>

Linux prior to dce7cd62754b and some older Windows guests I guess.

>> 
>> Note: strictly speaking, KVM doesn't have to have the feature as exposing
>> raw invariant TSC bit (CPUID.80000007H:EDX[8]) also seems to work for
>> modern Windows versions. The feature is, however, tiny and straightforward
>> and gives additional flexibility so why not.
>
> This means that KVM can also just unconditionally expose the invtsc bit
> to the guest, and the guest still uses it.

Yes, this feature doesn't bring much by itself (at least with modern
Windows versions). I've implemented it while debugging what ended up
being 
https://lore.kernel.org/kvm/20220712135009.952805-1-vkuznets@redhat.com/
(so the issue wasn't enlightenment related after all), but I think it
may come in handy some day, so why keep it in my private stash?

>
>
> Nitpick: It might be worth it to document it a bit better somewhere,
> as I tried to do in this mail.

TLFS sounds like the right place for it but ... it's not there... oh well.

>
>
> Best regards,
> 	Maxim Levitsky
>
>> 
>> Vitaly Kuznetsov (3):
>>   KVM: x86: Hyper-V invariant TSC control
>>   KVM: selftests: Fix wrmsr_safe()
>>   KVM: selftests: Test Hyper-V invariant TSC control
>> 
>>  arch/x86/include/asm/kvm_host.h               |  1 +
>>  arch/x86/kvm/cpuid.c                          |  7 ++
>>  arch/x86/kvm/hyperv.c                         | 19 +++++
>>  arch/x86/kvm/hyperv.h                         | 15 ++++
>>  arch/x86/kvm/x86.c                            |  4 +-
>>  .../selftests/kvm/include/x86_64/processor.h  |  2 +-
>>  .../selftests/kvm/x86_64/hyperv_features.c    | 73 ++++++++++++++++++-
>>  7 files changed, 115 insertions(+), 6 deletions(-)
>> 
>
>

-- 
Vitaly



* Re: [PATCH 1/3] KVM: x86: Hyper-V invariant TSC control
  2022-07-14  9:25   ` Maxim Levitsky
@ 2022-07-14 15:05     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 13+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14 15:05 UTC (permalink / raw)
  To: Maxim Levitsky, kvm
  Cc: Wanpeng Li, Jim Mattson, linux-kernel, Paolo Bonzini,
	Sean Christopherson

Maxim Levitsky <mlevitsk@redhat.com> writes:

> On Wed, 2022-07-13 at 17:05 +0200, Vitaly Kuznetsov wrote:
>> Normally, genuine Hyper-V doesn't expose architectural invariant TSC
>> (CPUID.80000007H:EDX[8]) to its guests by default. A special PV MSR
>> (HV_X64_MSR_TSC_INVARIANT_CONTROL, 0x40000118) and corresponding CPUID
>> feature bit (CPUID.0x40000003.EAX[15]) were introduced. When bit 0 of the
>> PV MSR is set, invariant TSC bit starts to show up in CPUID. When the
>> feature is exposed to Hyper-V guests, reenlightenment becomes unneeded.
>> 
>> Add the feature to KVM. Keep CPUID output intact when the feature
>> wasn't exposed to L1 and implement the required logic for hiding
>> invariant TSC when the feature was exposed and invariant TSC control
>> MSR wasn't written to. Copy genuine Hyper-V behavior and forbid to
>> disable the feature once it was enabled.
>> 
>> For the reference, for linux guests, support for the feature was added
>> in commit dce7cd62754b ("x86/hyperv: Allow guests to enable InvariantTSC").
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  arch/x86/include/asm/kvm_host.h |  1 +
>>  arch/x86/kvm/cpuid.c            |  7 +++++++
>>  arch/x86/kvm/hyperv.c           | 19 +++++++++++++++++++
>>  arch/x86/kvm/hyperv.h           | 15 +++++++++++++++
>>  arch/x86/kvm/x86.c              |  4 +++-
>>  5 files changed, 45 insertions(+), 1 deletion(-)
>> 
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index de5a149d0971..88553f0b524c 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -1022,6 +1022,7 @@ struct kvm_hv {
>>         u64 hv_reenlightenment_control;
>>         u64 hv_tsc_emulation_control;
>>         u64 hv_tsc_emulation_status;
>> +       u64 hv_invtsc;
>>  
>>         /* How many vCPUs have VP index != vCPU index */
>>         atomic_t num_mismatched_vp_indexes;
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index d47222ab8e6e..788df2eb1ec4 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -1404,6 +1404,13 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>>                             (data & TSX_CTRL_CPUID_CLEAR))
>>                                 *ebx &= ~(F(RTM) | F(HLE));
>>                 }
>
> Tiny nitpick: Maybe add a bit longer comment about this thing, like that guest needs to opt-in
> to see invtsc when it has the HV feature exposed to it, 
> I don't have a strong preference about this though.
>
>> +               /*
>> +                * Filter out invariant TSC (CPUID.80000007H:EDX[8]) for Hyper-V
>> +                * guests if needed.
>> +                */
>> +               if (function == 0x80000007 && kvm_hv_invtsc_filtered(vcpu))
>> +                       *edx &= ~(1 << 8);
>
>> +
>>         } else {
>>                 *eax = *ebx = *ecx = *edx = 0;
>>                 /*
>> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
>> index e2e95a6fccfd..0d8e6526a839 100644
>> --- a/arch/x86/kvm/hyperv.c
>> +++ b/arch/x86/kvm/hyperv.c
>> @@ -991,6 +991,7 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
>>         case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
>>         case HV_X64_MSR_TSC_EMULATION_CONTROL:
>>         case HV_X64_MSR_TSC_EMULATION_STATUS:
>> +       case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>>         case HV_X64_MSR_SYNDBG_OPTIONS:
>>         case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
>>                 r = true;
>> @@ -1275,6 +1276,9 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv *hv_vcpu, u32 msr)
>>         case HV_X64_MSR_TSC_EMULATION_STATUS:
>>                 return hv_vcpu->cpuid_cache.features_eax &
>>                         HV_ACCESS_REENLIGHTENMENT;
>> +       case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>> +               return hv_vcpu->cpuid_cache.features_eax &
>> +                       HV_ACCESS_TSC_INVARIANT;
>>         case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
>>         case HV_X64_MSR_CRASH_CTL:
>>                 return hv_vcpu->cpuid_cache.features_edx &
>> @@ -1402,6 +1406,17 @@ static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
>>                 if (!host)
>>                         return 1;
>>                 break;
>> +       case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>> +               /* Only bit 0 is supported */
>> +               if (data & ~BIT_ULL(0))
>> +                       return 1;
>> +
>> +               /* The feature can't be disabled from the guest */
>> +               if (!host && hv->hv_invtsc && !data)
>> +                       return 1;
>
> The unit test in patch 3 claims, that this msr should #GP when 'invtsc'
> aka bit 8 of edx of leaf 0x80000007 is not enabled by the hypervisor
> in the guest cpuid.
>

No, we don't GP when architectural InvTsc is not there....

> Yet, looking at the code I think that this msr read/write access only depends on
> the 'new' cpuid bit, aka the HV_ACCESS_TSC_INVARIANT, thus this msr will 'work'
> but do nothing if 'invtsc' is not exposed (it will then not turn it on).

... as this PV feature just enables and disables filtering. If
architectural InvTsc is not there, then filtering changes
nothing. The VMM is not supposed to enable this without InvTsc itself, as
it's kind of pointless.

>
>
>
>> +
>> +               hv->hv_invtsc = data;
>> +               break;
>>         case HV_X64_MSR_SYNDBG_OPTIONS:
>>         case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
>>                 return syndbg_set_msr(vcpu, msr, data, host);
>> @@ -1577,6 +1592,9 @@ static int kvm_hv_get_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
>>         case HV_X64_MSR_TSC_EMULATION_STATUS:
>>                 data = hv->hv_tsc_emulation_status;
>>                 break;
>> +       case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>> +               data = hv->hv_invtsc;
>> +               break;
>>         case HV_X64_MSR_SYNDBG_OPTIONS:
>>         case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
>>                 return syndbg_get_msr(vcpu, msr, pdata, host);
>> @@ -2497,6 +2515,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>>                         ent->eax |= HV_MSR_REFERENCE_TSC_AVAILABLE;
>>                         ent->eax |= HV_ACCESS_FREQUENCY_MSRS;
>>                         ent->eax |= HV_ACCESS_REENLIGHTENMENT;
>> +                       ent->eax |= HV_ACCESS_TSC_INVARIANT;
>>  
>>                         ent->ebx |= HV_POST_MESSAGES;
>>                         ent->ebx |= HV_SIGNAL_EVENTS;
>> diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
>> index da2737f2a956..1a6316ab55eb 100644
>> --- a/arch/x86/kvm/hyperv.h
>> +++ b/arch/x86/kvm/hyperv.h
>> @@ -133,6 +133,21 @@ static inline bool kvm_hv_has_stimer_pending(struct kvm_vcpu *vcpu)
>>                              HV_SYNIC_STIMER_COUNT);
>>  }
>>  
>> +/*
>> + * With HV_ACCESS_TSC_INVARIANT feature, invariant TSC (CPUID.80000007H:EDX[8])
>> + * is only observed after HV_X64_MSR_TSC_INVARIANT_CONTROL was written to.
>> + */
>> +static inline bool kvm_hv_invtsc_filtered(struct kvm_vcpu *vcpu)
>> +{
>> +       struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
>> +       struct kvm_hv *hv = to_kvm_hv(vcpu->kvm);
>> +
>> +       if (hv_vcpu && hv_vcpu->cpuid_cache.features_eax & HV_ACCESS_TSC_INVARIANT)
>> +               return !hv->hv_invtsc;
>> +
>> +       return false;
>> +}
>> +
>>  void kvm_hv_process_stimers(struct kvm_vcpu *vcpu);
>>  
>>  void kvm_hv_setup_tsc_page(struct kvm *kvm,
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 567d13405445..322e0a544823 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -1466,7 +1466,7 @@ static const u32 emulated_msrs_all[] = {
>>         HV_X64_MSR_STIMER0_CONFIG,
>>         HV_X64_MSR_VP_ASSIST_PAGE,
>>         HV_X64_MSR_REENLIGHTENMENT_CONTROL, HV_X64_MSR_TSC_EMULATION_CONTROL,
>> -       HV_X64_MSR_TSC_EMULATION_STATUS,
>> +       HV_X64_MSR_TSC_EMULATION_STATUS, HV_X64_MSR_TSC_INVARIANT_CONTROL,
>>         HV_X64_MSR_SYNDBG_OPTIONS,
>>         HV_X64_MSR_SYNDBG_CONTROL, HV_X64_MSR_SYNDBG_STATUS,
>>         HV_X64_MSR_SYNDBG_SEND_BUFFER, HV_X64_MSR_SYNDBG_RECV_BUFFER,
>> @@ -3769,6 +3769,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>         case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
>>         case HV_X64_MSR_TSC_EMULATION_CONTROL:
>>         case HV_X64_MSR_TSC_EMULATION_STATUS:
>> +       case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>>                 return kvm_hv_set_msr_common(vcpu, msr, data,
>>                                              msr_info->host_initiated);
>>         case MSR_IA32_BBL_CR_CTL3:
>> @@ -4139,6 +4140,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>         case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
>>         case HV_X64_MSR_TSC_EMULATION_CONTROL:
>>         case HV_X64_MSR_TSC_EMULATION_STATUS:
>> +       case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>>                 return kvm_hv_get_msr_common(vcpu,
>>                                              msr_info->index, &msr_info->data,
>>                                              msr_info->host_initiated);
>
>
> Beware that this new msr also will need to be migrated by qemu, 
> when the feature is added to qemu -
> I had my own share of fun with AMD's TSC ratio msr when I implemented it
> (had to fix it twice in qemu :( ...)

Yes, we do this for a number of Hyper-V MSRs already
(e.g. the reenlightenment MSRs).

-- 
Vitaly



* RE: [PATCH 0/3] KVM: x86: Hyper-V invariant TSC control feature
  2022-07-14 15:02   ` Vitaly Kuznetsov
@ 2022-07-18 15:07     ` Michael Kelley (LINUX)
  0 siblings, 0 replies; 13+ messages in thread
From: Michael Kelley (LINUX) @ 2022-07-18 15:07 UTC (permalink / raw)
  To: vkuznets, Maxim Levitsky, kvm
  Cc: Wanpeng Li, Jim Mattson, linux-kernel, Paolo Bonzini,
	Sean Christopherson

From: Vitaly Kuznetsov <vkuznets@redhat.com> Sent: Thursday, July 14, 2022 8:03 AM
> 
> Maxim Levitsky <mlevitsk@redhat.com> writes:
> 
> > On Wed, 2022-07-13 at 17:05 +0200, Vitaly Kuznetsov wrote:
> >> Normally, genuine Hyper-V doesn't expose architectural invariant TSC
> >> (CPUID.80000007H:EDX[8]) to its guests by default. A special PV MSR
> >> (HV_X64_MSR_TSC_INVARIANT_CONTROL, 0x40000118) and corresponding CPUID
> >> feature bit (CPUID.0x40000003.EAX[15]) were introduced. When bit 0 of the
> >> PV MSR is set, invariant TSC bit starts to show up in CPUID. When the
> >> feature is exposed to Hyper-V guests, reenlightenment becomes unneeded.
> >
> > If I understood the feature correctly from the code, it allows Hyper-V, or in this
> > case KVM acting as Hyper-V, to avoid unconditionally exposing the invtsc bit
> > in CPUID, but rather let the guest know that it can opt-in into this,
> > by giving the guest another CPUID bit to indicate this ability
> > and a MSR which the guest uses to opt-in.
> >
> > Are there known use cases of this, are there guests which won't opt-in?
> >
> 
> Linux prior to dce7cd62754b and some older Windows guests I guess.

FWIW, the idea is to avoid having this new functionality magically show
up in existing guests when the Hyper-V host is upgraded.  A guest OS
version with the code to opt in has presumably been tested to make sure
it works correctly when the functionality is exposed.

> 
> >>
> >> Note: strictly speaking, KVM doesn't have to have the feature as exposing
> >> raw invariant TSC bit (CPUID.80000007H:EDX[8]) also seems to work for
> >> modern Windows versions. The feature is, however, tiny and straightforward
> >> and gives additional flexibility so why not.
> >
> > This means that KVM can also just unconditionally expose the invtsc bit
> > to the guest, and the guest still uses it.
> 
> Yes, this feature doesn't bring much by itself (at least with modern
> Windows versions). I've implemented it while debugging what ended up
> being https://lore.kernel.org/kvm/20220712135009.952805-1-vkuznets@redhat.com/ 
> (so the issue wasn't enlightenment related after all), but I think it
> may come in handy some day, so why keep it in my private stash?
> 
> >
> > Nitpick: It might be worth it to document it a bit better somewhere,
> > as I tried to do in this mail.
> 
> TLFS sounds like the right place for it but ... it's not there... oh well.
> 

I've sent a nag email to the Hyper-V folks about updating the TLFS.

Michael


end of thread, other threads:[~2022-07-18 15:07 UTC | newest]

Thread overview: 13+ messages
2022-07-13 15:05 [PATCH 0/3] KVM: x86: Hyper-V invariant TSC control feature Vitaly Kuznetsov
2022-07-13 15:05 ` [PATCH 1/3] KVM: x86: Hyper-V invariant TSC control Vitaly Kuznetsov
2022-07-14  9:25   ` Maxim Levitsky
2022-07-14 15:05     ` Vitaly Kuznetsov
2022-07-13 15:05 ` [PATCH 2/3] KVM: selftests: Fix wrmsr_safe() Vitaly Kuznetsov
2022-07-13 15:41   ` Sean Christopherson
2022-07-14  0:52     ` Sean Christopherson
2022-07-13 15:05 ` [PATCH 3/3] KVM: selftests: Test Hyper-V invariant TSC control Vitaly Kuznetsov
2022-07-14  9:26   ` Maxim Levitsky
2022-07-14 14:57     ` Vitaly Kuznetsov
2022-07-14  9:24 ` [PATCH 0/3] KVM: x86: Hyper-V invariant TSC control feature Maxim Levitsky
2022-07-14 15:02   ` Vitaly Kuznetsov
2022-07-18 15:07     ` Michael Kelley (LINUX)
