linux-hyperv.vger.kernel.org archive mirror
* [PATCH v2 0/3] KVM: x86: Hyper-V invariant TSC control feature
@ 2022-08-31  8:50 Vitaly Kuznetsov
  2022-08-31  8:50 ` [PATCH v2 1/3] KVM: x86: Hyper-V invariant TSC control Vitaly Kuznetsov
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Vitaly Kuznetsov @ 2022-08-31  8:50 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Wanpeng Li, Jim Mattson, Michael Kelley, Maxim Levitsky,
	linux-hyperv, linux-kernel

Changes since v1:
- Drop "KVM: selftests: Fix wrmsr_safe()" patch as it is already merged.
- Add the "KVM: selftests: Rename 'msr->available' to 'msr->should_not_gp'
  in hyperv_features test" patch. [Max]
- Rebase the selftest to the current API.

Original description:

Genuine Hyper-V doesn't expose the architectural invariant TSC bit
(CPUID.80000007H:EDX[8]) to its guests by default. Instead, a special PV MSR
(HV_X64_MSR_TSC_INVARIANT_CONTROL, 0x40000118) and a corresponding CPUID
feature bit (CPUID.0x40000003.EAX[15]) were introduced. When bit 0 of the
PV MSR is set, the invariant TSC bit starts to show up in CPUID. When the
feature is exposed to Hyper-V guests, reenlightenment becomes unneeded.

Note: strictly speaking, KVM doesn't have to implement the feature, as
exposing the raw invariant TSC bit (CPUID.80000007H:EDX[8]) also seems to
work for modern Windows versions. The feature is, however, tiny and
straightforward, and it gives additional flexibility, so there's little
reason not to have it.

Vitaly Kuznetsov (3):
  KVM: x86: Hyper-V invariant TSC control
  KVM: selftests: Rename 'msr->available' to 'msr->should_not_gp' in
    hyperv_features test
  KVM: selftests: Test Hyper-V invariant TSC control

 arch/x86/include/asm/kvm_host.h               |   1 +
 arch/x86/kvm/cpuid.c                          |   7 +
 arch/x86/kvm/hyperv.c                         |  19 +++
 arch/x86/kvm/hyperv.h                         |  15 ++
 arch/x86/kvm/x86.c                            |   4 +-
 .../selftests/kvm/x86_64/hyperv_features.c    | 159 +++++++++++++-----
 6 files changed, 158 insertions(+), 47 deletions(-)

-- 
2.37.2


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/3] KVM: x86: Hyper-V invariant TSC control
  2022-08-31  8:50 [PATCH v2 0/3] KVM: x86: Hyper-V invariant TSC control feature Vitaly Kuznetsov
@ 2022-08-31  8:50 ` Vitaly Kuznetsov
  2022-08-31 10:43   ` Maxim Levitsky
  2022-09-12 13:51   ` Sean Christopherson
  2022-08-31  8:50 ` [PATCH v2 2/3] KVM: selftests: Rename 'msr->available' to 'msr->should_not_gp' in hyperv_features test Vitaly Kuznetsov
  2022-08-31  8:50 ` [PATCH v2 3/3] KVM: selftests: Test Hyper-V invariant TSC control Vitaly Kuznetsov
  2 siblings, 2 replies; 12+ messages in thread
From: Vitaly Kuznetsov @ 2022-08-31  8:50 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Wanpeng Li, Jim Mattson, Michael Kelley, Maxim Levitsky,
	linux-hyperv, linux-kernel

Genuine Hyper-V doesn't expose the architectural invariant TSC bit
(CPUID.80000007H:EDX[8]) to its guests by default. Instead, a special PV MSR
(HV_X64_MSR_TSC_INVARIANT_CONTROL, 0x40000118) and a corresponding CPUID
feature bit (CPUID.0x40000003.EAX[15]) were introduced. When bit 0 of the
PV MSR is set, the invariant TSC bit starts to show up in CPUID. When the
feature is exposed to Hyper-V guests, reenlightenment becomes unneeded.

Add the feature to KVM. Keep the CPUID output intact when the feature
wasn't exposed to L1, and implement the required logic for hiding the
invariant TSC bit when the feature was exposed but the invariant TSC
control MSR wasn't written to. Copy genuine Hyper-V behavior and forbid
disabling the feature once it has been enabled.

For reference, for Linux guests, support for the feature was added
in commit dce7cd62754b ("x86/hyperv: Allow guests to enable InvariantTSC").

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/cpuid.c            |  7 +++++++
 arch/x86/kvm/hyperv.c           | 19 +++++++++++++++++++
 arch/x86/kvm/hyperv.h           | 15 +++++++++++++++
 arch/x86/kvm/x86.c              |  4 +++-
 5 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2c96c43c313a..9098187e13aa 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1021,6 +1021,7 @@ struct kvm_hv {
 	u64 hv_reenlightenment_control;
 	u64 hv_tsc_emulation_control;
 	u64 hv_tsc_emulation_status;
+	u64 hv_invtsc;
 
 	/* How many vCPUs have VP index != vCPU index */
 	atomic_t num_mismatched_vp_indexes;
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 75dcf7a72605..8ccd45fd66a9 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -1444,6 +1444,13 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
 			    (data & TSX_CTRL_CPUID_CLEAR))
 				*ebx &= ~(F(RTM) | F(HLE));
 		}
+		/*
+		 * Filter out invariant TSC (CPUID.80000007H:EDX[8]) for Hyper-V
+		 * guests if needed.
+		 */
+		if (function == 0x80000007 && kvm_hv_invtsc_filtered(vcpu))
+			*edx &= ~(1 << 8);
+
 	} else {
 		*eax = *ebx = *ecx = *edx = 0;
 		/*
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index ed804447589c..df90cd7501b9 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -991,6 +991,7 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
 	case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
 	case HV_X64_MSR_TSC_EMULATION_CONTROL:
 	case HV_X64_MSR_TSC_EMULATION_STATUS:
+	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
 	case HV_X64_MSR_SYNDBG_OPTIONS:
 	case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
 		r = true;
@@ -1275,6 +1276,9 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv *hv_vcpu, u32 msr)
 	case HV_X64_MSR_TSC_EMULATION_STATUS:
 		return hv_vcpu->cpuid_cache.features_eax &
 			HV_ACCESS_REENLIGHTENMENT;
+	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
+		return hv_vcpu->cpuid_cache.features_eax &
+			HV_ACCESS_TSC_INVARIANT;
 	case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
 	case HV_X64_MSR_CRASH_CTL:
 		return hv_vcpu->cpuid_cache.features_edx &
@@ -1402,6 +1406,17 @@ static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
 		if (!host)
 			return 1;
 		break;
+	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
+		/* Only bit 0 is supported */
+		if (data & ~BIT_ULL(0))
+			return 1;
+
+		/* The feature can't be disabled from the guest */
+		if (!host && hv->hv_invtsc && !data)
+			return 1;
+
+		hv->hv_invtsc = data;
+		break;
 	case HV_X64_MSR_SYNDBG_OPTIONS:
 	case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
 		return syndbg_set_msr(vcpu, msr, data, host);
@@ -1577,6 +1592,9 @@ static int kvm_hv_get_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
 	case HV_X64_MSR_TSC_EMULATION_STATUS:
 		data = hv->hv_tsc_emulation_status;
 		break;
+	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
+		data = hv->hv_invtsc;
+		break;
 	case HV_X64_MSR_SYNDBG_OPTIONS:
 	case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
 		return syndbg_get_msr(vcpu, msr, pdata, host);
@@ -2497,6 +2515,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
 			ent->eax |= HV_MSR_REFERENCE_TSC_AVAILABLE;
 			ent->eax |= HV_ACCESS_FREQUENCY_MSRS;
 			ent->eax |= HV_ACCESS_REENLIGHTENMENT;
+			ent->eax |= HV_ACCESS_TSC_INVARIANT;
 
 			ent->ebx |= HV_POST_MESSAGES;
 			ent->ebx |= HV_SIGNAL_EVENTS;
diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
index da2737f2a956..1a6316ab55eb 100644
--- a/arch/x86/kvm/hyperv.h
+++ b/arch/x86/kvm/hyperv.h
@@ -133,6 +133,21 @@ static inline bool kvm_hv_has_stimer_pending(struct kvm_vcpu *vcpu)
 			     HV_SYNIC_STIMER_COUNT);
 }
 
+/*
+ * With HV_ACCESS_TSC_INVARIANT feature, invariant TSC (CPUID.80000007H:EDX[8])
+ * is only observed after HV_X64_MSR_TSC_INVARIANT_CONTROL was written to.
+ */
+static inline bool kvm_hv_invtsc_filtered(struct kvm_vcpu *vcpu)
+{
+	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
+	struct kvm_hv *hv = to_kvm_hv(vcpu->kvm);
+
+	if (hv_vcpu && hv_vcpu->cpuid_cache.features_eax & HV_ACCESS_TSC_INVARIANT)
+		return !hv->hv_invtsc;
+
+	return false;
+}
+
 void kvm_hv_process_stimers(struct kvm_vcpu *vcpu);
 
 void kvm_hv_setup_tsc_page(struct kvm *kvm,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d7374d768296..ad429800f9b5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1471,7 +1471,7 @@ static const u32 emulated_msrs_all[] = {
 	HV_X64_MSR_STIMER0_CONFIG,
 	HV_X64_MSR_VP_ASSIST_PAGE,
 	HV_X64_MSR_REENLIGHTENMENT_CONTROL, HV_X64_MSR_TSC_EMULATION_CONTROL,
-	HV_X64_MSR_TSC_EMULATION_STATUS,
+	HV_X64_MSR_TSC_EMULATION_STATUS, HV_X64_MSR_TSC_INVARIANT_CONTROL,
 	HV_X64_MSR_SYNDBG_OPTIONS,
 	HV_X64_MSR_SYNDBG_CONTROL, HV_X64_MSR_SYNDBG_STATUS,
 	HV_X64_MSR_SYNDBG_SEND_BUFFER, HV_X64_MSR_SYNDBG_RECV_BUFFER,
@@ -3777,6 +3777,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
 	case HV_X64_MSR_TSC_EMULATION_CONTROL:
 	case HV_X64_MSR_TSC_EMULATION_STATUS:
+	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
 		return kvm_hv_set_msr_common(vcpu, msr, data,
 					     msr_info->host_initiated);
 	case MSR_IA32_BBL_CR_CTL3:
@@ -4147,6 +4148,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
 	case HV_X64_MSR_TSC_EMULATION_CONTROL:
 	case HV_X64_MSR_TSC_EMULATION_STATUS:
+	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
 		return kvm_hv_get_msr_common(vcpu,
 					     msr_info->index, &msr_info->data,
 					     msr_info->host_initiated);
-- 
2.37.2



* [PATCH v2 2/3] KVM: selftests: Rename 'msr->available' to 'msr->should_not_gp' in hyperv_features test
  2022-08-31  8:50 [PATCH v2 0/3] KVM: x86: Hyper-V invariant TSC control feature Vitaly Kuznetsov
  2022-08-31  8:50 ` [PATCH v2 1/3] KVM: x86: Hyper-V invariant TSC control Vitaly Kuznetsov
@ 2022-08-31  8:50 ` Vitaly Kuznetsov
  2022-08-31 10:44   ` Maxim Levitsky
  2022-09-12 13:59   ` Sean Christopherson
  2022-08-31  8:50 ` [PATCH v2 3/3] KVM: selftests: Test Hyper-V invariant TSC control Vitaly Kuznetsov
  2 siblings, 2 replies; 12+ messages in thread
From: Vitaly Kuznetsov @ 2022-08-31  8:50 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Wanpeng Li, Jim Mattson, Michael Kelley, Maxim Levitsky,
	linux-hyperv, linux-kernel

It may not be clear what 'msr->available' means. The test actually
checks that accessing the particular MSR doesn't cause a #GP; rename
the variable accordingly.

Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 .../selftests/kvm/x86_64/hyperv_features.c    | 92 +++++++++----------
 1 file changed, 46 insertions(+), 46 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_features.c b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
index 79ab0152d281..4ec4776662a4 100644
--- a/tools/testing/selftests/kvm/x86_64/hyperv_features.c
+++ b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
@@ -33,7 +33,7 @@ static inline uint8_t hypercall(u64 control, vm_vaddr_t input_address,
 
 struct msr_data {
 	uint32_t idx;
-	bool available;
+	bool should_not_gp;
 	bool write;
 	u64 write_val;
 };
@@ -56,7 +56,7 @@ static void guest_msr(struct msr_data *msr)
 	else
 		vector = wrmsr_safe(msr->idx, msr->write_val);
 
-	if (msr->available)
+	if (msr->should_not_gp)
 		GUEST_ASSERT_2(!vector, msr->idx, vector);
 	else
 		GUEST_ASSERT_2(vector == GP_VECTOR, msr->idx, vector);
@@ -153,12 +153,12 @@ static void guest_test_msrs_access(void)
 			 */
 			msr->idx = HV_X64_MSR_GUEST_OS_ID;
 			msr->write = 0;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 		case 1:
 			msr->idx = HV_X64_MSR_HYPERCALL;
 			msr->write = 0;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 		case 2:
 			feat->eax |= HV_MSR_HYPERCALL_AVAILABLE;
@@ -169,116 +169,116 @@ static void guest_test_msrs_access(void)
 			msr->idx = HV_X64_MSR_GUEST_OS_ID;
 			msr->write = 1;
 			msr->write_val = LINUX_OS_ID;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 		case 3:
 			msr->idx = HV_X64_MSR_GUEST_OS_ID;
 			msr->write = 0;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 		case 4:
 			msr->idx = HV_X64_MSR_HYPERCALL;
 			msr->write = 0;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 
 		case 5:
 			msr->idx = HV_X64_MSR_VP_RUNTIME;
 			msr->write = 0;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 		case 6:
 			feat->eax |= HV_MSR_VP_RUNTIME_AVAILABLE;
 			msr->idx = HV_X64_MSR_VP_RUNTIME;
 			msr->write = 0;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 		case 7:
 			/* Read only */
 			msr->idx = HV_X64_MSR_VP_RUNTIME;
 			msr->write = 1;
 			msr->write_val = 1;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 
 		case 8:
 			msr->idx = HV_X64_MSR_TIME_REF_COUNT;
 			msr->write = 0;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 		case 9:
 			feat->eax |= HV_MSR_TIME_REF_COUNT_AVAILABLE;
 			msr->idx = HV_X64_MSR_TIME_REF_COUNT;
 			msr->write = 0;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 		case 10:
 			/* Read only */
 			msr->idx = HV_X64_MSR_TIME_REF_COUNT;
 			msr->write = 1;
 			msr->write_val = 1;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 
 		case 11:
 			msr->idx = HV_X64_MSR_VP_INDEX;
 			msr->write = 0;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 		case 12:
 			feat->eax |= HV_MSR_VP_INDEX_AVAILABLE;
 			msr->idx = HV_X64_MSR_VP_INDEX;
 			msr->write = 0;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 		case 13:
 			/* Read only */
 			msr->idx = HV_X64_MSR_VP_INDEX;
 			msr->write = 1;
 			msr->write_val = 1;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 
 		case 14:
 			msr->idx = HV_X64_MSR_RESET;
 			msr->write = 0;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 		case 15:
 			feat->eax |= HV_MSR_RESET_AVAILABLE;
 			msr->idx = HV_X64_MSR_RESET;
 			msr->write = 0;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 		case 16:
 			msr->idx = HV_X64_MSR_RESET;
 			msr->write = 1;
 			msr->write_val = 0;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 
 		case 17:
 			msr->idx = HV_X64_MSR_REFERENCE_TSC;
 			msr->write = 0;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 		case 18:
 			feat->eax |= HV_MSR_REFERENCE_TSC_AVAILABLE;
 			msr->idx = HV_X64_MSR_REFERENCE_TSC;
 			msr->write = 0;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 		case 19:
 			msr->idx = HV_X64_MSR_REFERENCE_TSC;
 			msr->write = 1;
 			msr->write_val = 0;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 
 		case 20:
 			msr->idx = HV_X64_MSR_EOM;
 			msr->write = 0;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 		case 21:
 			/*
@@ -287,145 +287,145 @@ static void guest_test_msrs_access(void)
 			 */
 			msr->idx = HV_X64_MSR_EOM;
 			msr->write = 0;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 		case 22:
 			feat->eax |= HV_MSR_SYNIC_AVAILABLE;
 			msr->idx = HV_X64_MSR_EOM;
 			msr->write = 0;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 		case 23:
 			msr->idx = HV_X64_MSR_EOM;
 			msr->write = 1;
 			msr->write_val = 0;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 
 		case 24:
 			msr->idx = HV_X64_MSR_STIMER0_CONFIG;
 			msr->write = 0;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 		case 25:
 			feat->eax |= HV_MSR_SYNTIMER_AVAILABLE;
 			msr->idx = HV_X64_MSR_STIMER0_CONFIG;
 			msr->write = 0;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 		case 26:
 			msr->idx = HV_X64_MSR_STIMER0_CONFIG;
 			msr->write = 1;
 			msr->write_val = 0;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 		case 27:
 			/* Direct mode test */
 			msr->idx = HV_X64_MSR_STIMER0_CONFIG;
 			msr->write = 1;
 			msr->write_val = 1 << 12;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 		case 28:
 			feat->edx |= HV_STIMER_DIRECT_MODE_AVAILABLE;
 			msr->idx = HV_X64_MSR_STIMER0_CONFIG;
 			msr->write = 1;
 			msr->write_val = 1 << 12;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 
 		case 29:
 			msr->idx = HV_X64_MSR_EOI;
 			msr->write = 0;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 		case 30:
 			feat->eax |= HV_MSR_APIC_ACCESS_AVAILABLE;
 			msr->idx = HV_X64_MSR_EOI;
 			msr->write = 1;
 			msr->write_val = 1;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 
 		case 31:
 			msr->idx = HV_X64_MSR_TSC_FREQUENCY;
 			msr->write = 0;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 		case 32:
 			feat->eax |= HV_ACCESS_FREQUENCY_MSRS;
 			msr->idx = HV_X64_MSR_TSC_FREQUENCY;
 			msr->write = 0;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 		case 33:
 			/* Read only */
 			msr->idx = HV_X64_MSR_TSC_FREQUENCY;
 			msr->write = 1;
 			msr->write_val = 1;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 
 		case 34:
 			msr->idx = HV_X64_MSR_REENLIGHTENMENT_CONTROL;
 			msr->write = 0;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 		case 35:
 			feat->eax |= HV_ACCESS_REENLIGHTENMENT;
 			msr->idx = HV_X64_MSR_REENLIGHTENMENT_CONTROL;
 			msr->write = 0;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 		case 36:
 			msr->idx = HV_X64_MSR_REENLIGHTENMENT_CONTROL;
 			msr->write = 1;
 			msr->write_val = 1;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 		case 37:
 			/* Can only write '0' */
 			msr->idx = HV_X64_MSR_TSC_EMULATION_STATUS;
 			msr->write = 1;
 			msr->write_val = 1;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 
 		case 38:
 			msr->idx = HV_X64_MSR_CRASH_P0;
 			msr->write = 0;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 		case 39:
 			feat->edx |= HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE;
 			msr->idx = HV_X64_MSR_CRASH_P0;
 			msr->write = 0;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 		case 40:
 			msr->idx = HV_X64_MSR_CRASH_P0;
 			msr->write = 1;
 			msr->write_val = 1;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 
 		case 41:
 			msr->idx = HV_X64_MSR_SYNDBG_STATUS;
 			msr->write = 0;
-			msr->available = 0;
+			msr->should_not_gp = 0;
 			break;
 		case 42:
 			feat->edx |= HV_FEATURE_DEBUG_MSRS_AVAILABLE;
 			dbg->eax |= HV_X64_SYNDBG_CAP_ALLOW_KERNEL_DEBUGGING;
 			msr->idx = HV_X64_MSR_SYNDBG_STATUS;
 			msr->write = 0;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 		case 43:
 			msr->idx = HV_X64_MSR_SYNDBG_STATUS;
 			msr->write = 1;
 			msr->write_val = 0;
-			msr->available = 1;
+			msr->should_not_gp = 1;
 			break;
 
 		case 44:
-- 
2.37.2



* [PATCH v2 3/3] KVM: selftests: Test Hyper-V invariant TSC control
  2022-08-31  8:50 [PATCH v2 0/3] KVM: x86: Hyper-V invariant TSC control feature Vitaly Kuznetsov
  2022-08-31  8:50 ` [PATCH v2 1/3] KVM: x86: Hyper-V invariant TSC control Vitaly Kuznetsov
  2022-08-31  8:50 ` [PATCH v2 2/3] KVM: selftests: Rename 'msr->available' to 'msr->should_not_gp' in hyperv_features test Vitaly Kuznetsov
@ 2022-08-31  8:50 ` Vitaly Kuznetsov
  2022-09-12 14:29   ` Sean Christopherson
  2 siblings, 1 reply; 12+ messages in thread
From: Vitaly Kuznetsov @ 2022-08-31  8:50 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Wanpeng Li, Jim Mattson, Michael Kelley, Maxim Levitsky,
	linux-hyperv, linux-kernel

Add a test for the newly introduced Hyper-V invariant TSC control feature:
- HV_X64_MSR_TSC_INVARIANT_CONTROL is not available without the
  HV_ACCESS_TSC_INVARIANT CPUID bit set and becomes available with it.
- BIT(0) of HV_X64_MSR_TSC_INVARIANT_CONTROL controls the filtering of the
  architectural invariant TSC (CPUID.80000007H:EDX[8]) bit.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 .../selftests/kvm/x86_64/hyperv_features.c    | 67 +++++++++++++++++++
 1 file changed, 67 insertions(+)

diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_features.c b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
index 4ec4776662a4..26e8c5f7677e 100644
--- a/tools/testing/selftests/kvm/x86_64/hyperv_features.c
+++ b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
@@ -15,6 +15,9 @@
 
 #define LINUX_OS_ID ((u64)0x8100 << 48)
 
+/* CPUID.80000007H:EDX */
+#define X86_FEATURE_INVTSC (1 << 8)
+
 static inline uint8_t hypercall(u64 control, vm_vaddr_t input_address,
 				vm_vaddr_t output_address, uint64_t *hv_status)
 {
@@ -60,6 +63,24 @@ static void guest_msr(struct msr_data *msr)
 		GUEST_ASSERT_2(!vector, msr->idx, vector);
 	else
 		GUEST_ASSERT_2(vector == GP_VECTOR, msr->idx, vector);
+
+	/* Invariant TSC bit appears when TSC invariant control MSR is written to */
+	if (msr->idx == HV_X64_MSR_TSC_INVARIANT_CONTROL) {
+		u32 eax, ebx, ecx, edx;
+
+		cpuid(0x80000007, &eax, &ebx, &ecx, &edx);
+
+		/*
+		 * TSC invariant bit is present without the feature (legacy) or
+		 * when the feature is present and enabled.
+		 */
+		if ((!msr->should_not_gp && !msr->write) || (msr->write && msr->write_val == 1))
+			GUEST_ASSERT(edx & X86_FEATURE_INVTSC);
+		else
+			GUEST_ASSERT(!(edx & X86_FEATURE_INVTSC));
+	}
+
+
 	GUEST_DONE();
 }
 
@@ -104,6 +125,15 @@ static void vcpu_reset_hv_cpuid(struct kvm_vcpu *vcpu)
 	vcpu_clear_cpuid_entry(vcpu, HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES);
 }
 
+static bool guest_has_invtsc(void)
+{
+	const struct kvm_cpuid_entry2 *cpuid;
+
+	cpuid = kvm_get_supported_cpuid_entry(0x80000007);
+
+	return cpuid->edx & X86_FEATURE_INVTSC;
+}
+
 static void guest_test_msrs_access(void)
 {
 	struct kvm_cpuid2 *prev_cpuid = NULL;
@@ -115,6 +145,7 @@ static void guest_test_msrs_access(void)
 	int stage = 0;
 	vm_vaddr_t msr_gva;
 	struct msr_data *msr;
+	bool has_invtsc = guest_has_invtsc();
 
 	while (true) {
 		vm = vm_create_with_one_vcpu(&vcpu, guest_msr);
@@ -429,6 +460,42 @@ static void guest_test_msrs_access(void)
 			break;
 
 		case 44:
+			/* MSR is not available when CPUID feature bit is unset */
+			if (!has_invtsc)
+				continue;
+			msr->idx = HV_X64_MSR_TSC_INVARIANT_CONTROL;
+			msr->write = 0;
+			msr->should_not_gp = 0;
+			break;
+		case 45:
+			/* MSR is available when CPUID feature bit is set */
+			if (!has_invtsc)
+				continue;
+			feat->eax |= HV_ACCESS_TSC_INVARIANT;
+			msr->idx = HV_X64_MSR_TSC_INVARIANT_CONTROL;
+			msr->write = 0;
+			msr->should_not_gp = 1;
+			break;
+		case 46:
+			/* Writing bits other than 0 is forbidden */
+			if (!has_invtsc)
+				continue;
+			msr->idx = HV_X64_MSR_TSC_INVARIANT_CONTROL;
+			msr->write = 1;
+			msr->write_val = 0xdeadbeef;
+			msr->should_not_gp = 0;
+			break;
+		case 47:
+			/* Setting bit 0 enables the feature */
+			if (!has_invtsc)
+				continue;
+			msr->idx = HV_X64_MSR_TSC_INVARIANT_CONTROL;
+			msr->write = 1;
+			msr->write_val = 1;
+			msr->should_not_gp = 1;
+			break;
+
+		default:
 			kvm_vm_free(vm);
 			return;
 		}
-- 
2.37.2



* Re: [PATCH v2 1/3] KVM: x86: Hyper-V invariant TSC control
  2022-08-31  8:50 ` [PATCH v2 1/3] KVM: x86: Hyper-V invariant TSC control Vitaly Kuznetsov
@ 2022-08-31 10:43   ` Maxim Levitsky
  2022-09-12 13:51   ` Sean Christopherson
  1 sibling, 0 replies; 12+ messages in thread
From: Maxim Levitsky @ 2022-08-31 10:43 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Wanpeng Li, Jim Mattson, Michael Kelley, linux-hyperv, linux-kernel

On Wed, 2022-08-31 at 10:50 +0200, Vitaly Kuznetsov wrote:
> Normally, genuine Hyper-V doesn't expose architectural invariant TSC
> (CPUID.80000007H:EDX[8]) to its guests by default. A special PV MSR
> (HV_X64_MSR_TSC_INVARIANT_CONTROL, 0x40000118) and corresponding CPUID
> feature bit (CPUID.0x40000003.EAX[15]) were introduced. When bit 0 of the
> PV MSR is set, invariant TSC bit starts to show up in CPUID. When the
> feature is exposed to Hyper-V guests, reenlightenment becomes unneeded.
> 
> Add the feature to KVM. Keep CPUID output intact when the feature
> wasn't exposed to L1 and implement the required logic for hiding
> invariant TSC when the feature was exposed and invariant TSC control
> MSR wasn't written to. Copy genuine Hyper-V behavior and forbid
> disabling the feature once it was enabled.
> 
> For reference, for Linux guests, support for the feature was added
> in commit dce7cd62754b ("x86/hyperv: Allow guests to enable InvariantTSC").
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


> ---
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/cpuid.c            |  7 +++++++
>  arch/x86/kvm/hyperv.c           | 19 +++++++++++++++++++
>  arch/x86/kvm/hyperv.h           | 15 +++++++++++++++
>  arch/x86/kvm/x86.c              |  4 +++-
>  5 files changed, 45 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 2c96c43c313a..9098187e13aa 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1021,6 +1021,7 @@ struct kvm_hv {
>  	u64 hv_reenlightenment_control;
>  	u64 hv_tsc_emulation_control;
>  	u64 hv_tsc_emulation_status;
> +	u64 hv_invtsc;
>  
>  	/* How many vCPUs have VP index != vCPU index */
>  	atomic_t num_mismatched_vp_indexes;
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 75dcf7a72605..8ccd45fd66a9 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -1444,6 +1444,13 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>  			    (data & TSX_CTRL_CPUID_CLEAR))
>  				*ebx &= ~(F(RTM) | F(HLE));
>  		}
> +		/*
> +		 * Filter out invariant TSC (CPUID.80000007H:EDX[8]) for Hyper-V
> +		 * guests if needed.
> +		 */
> +		if (function == 0x80000007 && kvm_hv_invtsc_filtered(vcpu))
> +			*edx &= ~(1 << 8);
> +
>  	} else {
>  		*eax = *ebx = *ecx = *edx = 0;
>  		/*
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index ed804447589c..df90cd7501b9 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -991,6 +991,7 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
>  	case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
>  	case HV_X64_MSR_TSC_EMULATION_CONTROL:
>  	case HV_X64_MSR_TSC_EMULATION_STATUS:
> +	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>  	case HV_X64_MSR_SYNDBG_OPTIONS:
>  	case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
>  		r = true;
> @@ -1275,6 +1276,9 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv *hv_vcpu, u32 msr)
>  	case HV_X64_MSR_TSC_EMULATION_STATUS:
>  		return hv_vcpu->cpuid_cache.features_eax &
>  			HV_ACCESS_REENLIGHTENMENT;
> +	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
> +		return hv_vcpu->cpuid_cache.features_eax &
> +			HV_ACCESS_TSC_INVARIANT;
>  	case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
>  	case HV_X64_MSR_CRASH_CTL:
>  		return hv_vcpu->cpuid_cache.features_edx &
> @@ -1402,6 +1406,17 @@ static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
>  		if (!host)
>  			return 1;
>  		break;
> +	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
> +		/* Only bit 0 is supported */
> +		if (data & ~BIT_ULL(0))
> +			return 1;
> +
> +		/* The feature can't be disabled from the guest */
> +		if (!host && hv->hv_invtsc && !data)
> +			return 1;
> +
> +		hv->hv_invtsc = data;
> +		break;
>  	case HV_X64_MSR_SYNDBG_OPTIONS:
>  	case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
>  		return syndbg_set_msr(vcpu, msr, data, host);
> @@ -1577,6 +1592,9 @@ static int kvm_hv_get_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
>  	case HV_X64_MSR_TSC_EMULATION_STATUS:
>  		data = hv->hv_tsc_emulation_status;
>  		break;
> +	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
> +		data = hv->hv_invtsc;
> +		break;
>  	case HV_X64_MSR_SYNDBG_OPTIONS:
>  	case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
>  		return syndbg_get_msr(vcpu, msr, pdata, host);
> @@ -2497,6 +2515,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>  			ent->eax |= HV_MSR_REFERENCE_TSC_AVAILABLE;
>  			ent->eax |= HV_ACCESS_FREQUENCY_MSRS;
>  			ent->eax |= HV_ACCESS_REENLIGHTENMENT;
> +			ent->eax |= HV_ACCESS_TSC_INVARIANT;
>  
>  			ent->ebx |= HV_POST_MESSAGES;
>  			ent->ebx |= HV_SIGNAL_EVENTS;
> diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
> index da2737f2a956..1a6316ab55eb 100644
> --- a/arch/x86/kvm/hyperv.h
> +++ b/arch/x86/kvm/hyperv.h
> @@ -133,6 +133,21 @@ static inline bool kvm_hv_has_stimer_pending(struct kvm_vcpu *vcpu)
>  			     HV_SYNIC_STIMER_COUNT);
>  }
>  
> +/*
> + * With HV_ACCESS_TSC_INVARIANT feature, invariant TSC (CPUID.80000007H:EDX[8])
> + * is only observed after HV_X64_MSR_TSC_INVARIANT_CONTROL was written to.
> + */
> +static inline bool kvm_hv_invtsc_filtered(struct kvm_vcpu *vcpu)
> +{
> +	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
> +	struct kvm_hv *hv = to_kvm_hv(vcpu->kvm);
> +
> +	if (hv_vcpu && hv_vcpu->cpuid_cache.features_eax & HV_ACCESS_TSC_INVARIANT)
> +		return !hv->hv_invtsc;
> +
> +	return false;
> +}
> +
>  void kvm_hv_process_stimers(struct kvm_vcpu *vcpu);
>  
>  void kvm_hv_setup_tsc_page(struct kvm *kvm,
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index d7374d768296..ad429800f9b5 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1471,7 +1471,7 @@ static const u32 emulated_msrs_all[] = {
>  	HV_X64_MSR_STIMER0_CONFIG,
>  	HV_X64_MSR_VP_ASSIST_PAGE,
>  	HV_X64_MSR_REENLIGHTENMENT_CONTROL, HV_X64_MSR_TSC_EMULATION_CONTROL,
> -	HV_X64_MSR_TSC_EMULATION_STATUS,
> +	HV_X64_MSR_TSC_EMULATION_STATUS, HV_X64_MSR_TSC_INVARIANT_CONTROL,
>  	HV_X64_MSR_SYNDBG_OPTIONS,
>  	HV_X64_MSR_SYNDBG_CONTROL, HV_X64_MSR_SYNDBG_STATUS,
>  	HV_X64_MSR_SYNDBG_SEND_BUFFER, HV_X64_MSR_SYNDBG_RECV_BUFFER,
> @@ -3777,6 +3777,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
>  	case HV_X64_MSR_TSC_EMULATION_CONTROL:
>  	case HV_X64_MSR_TSC_EMULATION_STATUS:
> +	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>  		return kvm_hv_set_msr_common(vcpu, msr, data,
>  					     msr_info->host_initiated);
>  	case MSR_IA32_BBL_CR_CTL3:
> @@ -4147,6 +4148,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
>  	case HV_X64_MSR_TSC_EMULATION_CONTROL:
>  	case HV_X64_MSR_TSC_EMULATION_STATUS:
> +	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>  		return kvm_hv_get_msr_common(vcpu,
>  					     msr_info->index, &msr_info->data,
>  					     msr_info->host_initiated);




* Re: [PATCH v2 2/3] KVM: selftests: Rename 'msr->available' to 'msr->should_not_gp' in hyperv_features test
  2022-08-31  8:50 ` [PATCH v2 2/3] KVM: selftests: Rename 'msr->available' to 'msr->should_not_gp' in hyperv_features test Vitaly Kuznetsov
@ 2022-08-31 10:44   ` Maxim Levitsky
  2022-09-12 13:59   ` Sean Christopherson
  1 sibling, 0 replies; 12+ messages in thread
From: Maxim Levitsky @ 2022-08-31 10:44 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Wanpeng Li, Jim Mattson, Michael Kelley, linux-hyperv, linux-kernel

On Wed, 2022-08-31 at 10:50 +0200, Vitaly Kuznetsov wrote:
> It may not be clear what 'msr->available' means. The test actually
> checks that accessing the particular MSR doesn't cause a #GP; rename
> the variable accordingly.
> 
> Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  .../selftests/kvm/x86_64/hyperv_features.c    | 92 +++++++++----------
>  1 file changed, 46 insertions(+), 46 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_features.c b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
> index 79ab0152d281..4ec4776662a4 100644
> --- a/tools/testing/selftests/kvm/x86_64/hyperv_features.c
> +++ b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
> @@ -33,7 +33,7 @@ static inline uint8_t hypercall(u64 control, vm_vaddr_t input_address,
>  
>  struct msr_data {
>  	uint32_t idx;
> -	bool available;
> +	bool should_not_gp;
>  	bool write;
>  	u64 write_val;
>  };
> @@ -56,7 +56,7 @@ static void guest_msr(struct msr_data *msr)
>  	else
>  		vector = wrmsr_safe(msr->idx, msr->write_val);
>  
> -	if (msr->available)
> +	if (msr->should_not_gp)
>  		GUEST_ASSERT_2(!vector, msr->idx, vector);
>  	else
>  		GUEST_ASSERT_2(vector == GP_VECTOR, msr->idx, vector);
> @@ -153,12 +153,12 @@ static void guest_test_msrs_access(void)
>  			 */
>  			msr->idx = HV_X64_MSR_GUEST_OS_ID;
>  			msr->write = 0;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  		case 1:
>  			msr->idx = HV_X64_MSR_HYPERCALL;
>  			msr->write = 0;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  		case 2:
>  			feat->eax |= HV_MSR_HYPERCALL_AVAILABLE;
> @@ -169,116 +169,116 @@ static void guest_test_msrs_access(void)
>  			msr->idx = HV_X64_MSR_GUEST_OS_ID;
>  			msr->write = 1;
>  			msr->write_val = LINUX_OS_ID;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  		case 3:
>  			msr->idx = HV_X64_MSR_GUEST_OS_ID;
>  			msr->write = 0;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  		case 4:
>  			msr->idx = HV_X64_MSR_HYPERCALL;
>  			msr->write = 0;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  
>  		case 5:
>  			msr->idx = HV_X64_MSR_VP_RUNTIME;
>  			msr->write = 0;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  		case 6:
>  			feat->eax |= HV_MSR_VP_RUNTIME_AVAILABLE;
>  			msr->idx = HV_X64_MSR_VP_RUNTIME;
>  			msr->write = 0;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  		case 7:
>  			/* Read only */
>  			msr->idx = HV_X64_MSR_VP_RUNTIME;
>  			msr->write = 1;
>  			msr->write_val = 1;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  
>  		case 8:
>  			msr->idx = HV_X64_MSR_TIME_REF_COUNT;
>  			msr->write = 0;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  		case 9:
>  			feat->eax |= HV_MSR_TIME_REF_COUNT_AVAILABLE;
>  			msr->idx = HV_X64_MSR_TIME_REF_COUNT;
>  			msr->write = 0;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  		case 10:
>  			/* Read only */
>  			msr->idx = HV_X64_MSR_TIME_REF_COUNT;
>  			msr->write = 1;
>  			msr->write_val = 1;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  
>  		case 11:
>  			msr->idx = HV_X64_MSR_VP_INDEX;
>  			msr->write = 0;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  		case 12:
>  			feat->eax |= HV_MSR_VP_INDEX_AVAILABLE;
>  			msr->idx = HV_X64_MSR_VP_INDEX;
>  			msr->write = 0;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  		case 13:
>  			/* Read only */
>  			msr->idx = HV_X64_MSR_VP_INDEX;
>  			msr->write = 1;
>  			msr->write_val = 1;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  
>  		case 14:
>  			msr->idx = HV_X64_MSR_RESET;
>  			msr->write = 0;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  		case 15:
>  			feat->eax |= HV_MSR_RESET_AVAILABLE;
>  			msr->idx = HV_X64_MSR_RESET;
>  			msr->write = 0;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  		case 16:
>  			msr->idx = HV_X64_MSR_RESET;
>  			msr->write = 1;
>  			msr->write_val = 0;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  
>  		case 17:
>  			msr->idx = HV_X64_MSR_REFERENCE_TSC;
>  			msr->write = 0;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  		case 18:
>  			feat->eax |= HV_MSR_REFERENCE_TSC_AVAILABLE;
>  			msr->idx = HV_X64_MSR_REFERENCE_TSC;
>  			msr->write = 0;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  		case 19:
>  			msr->idx = HV_X64_MSR_REFERENCE_TSC;
>  			msr->write = 1;
>  			msr->write_val = 0;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  
>  		case 20:
>  			msr->idx = HV_X64_MSR_EOM;
>  			msr->write = 0;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  		case 21:
>  			/*
> @@ -287,145 +287,145 @@ static void guest_test_msrs_access(void)
>  			 */
>  			msr->idx = HV_X64_MSR_EOM;
>  			msr->write = 0;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  		case 22:
>  			feat->eax |= HV_MSR_SYNIC_AVAILABLE;
>  			msr->idx = HV_X64_MSR_EOM;
>  			msr->write = 0;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  		case 23:
>  			msr->idx = HV_X64_MSR_EOM;
>  			msr->write = 1;
>  			msr->write_val = 0;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  
>  		case 24:
>  			msr->idx = HV_X64_MSR_STIMER0_CONFIG;
>  			msr->write = 0;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  		case 25:
>  			feat->eax |= HV_MSR_SYNTIMER_AVAILABLE;
>  			msr->idx = HV_X64_MSR_STIMER0_CONFIG;
>  			msr->write = 0;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  		case 26:
>  			msr->idx = HV_X64_MSR_STIMER0_CONFIG;
>  			msr->write = 1;
>  			msr->write_val = 0;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  		case 27:
>  			/* Direct mode test */
>  			msr->idx = HV_X64_MSR_STIMER0_CONFIG;
>  			msr->write = 1;
>  			msr->write_val = 1 << 12;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  		case 28:
>  			feat->edx |= HV_STIMER_DIRECT_MODE_AVAILABLE;
>  			msr->idx = HV_X64_MSR_STIMER0_CONFIG;
>  			msr->write = 1;
>  			msr->write_val = 1 << 12;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  
>  		case 29:
>  			msr->idx = HV_X64_MSR_EOI;
>  			msr->write = 0;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  		case 30:
>  			feat->eax |= HV_MSR_APIC_ACCESS_AVAILABLE;
>  			msr->idx = HV_X64_MSR_EOI;
>  			msr->write = 1;
>  			msr->write_val = 1;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  
>  		case 31:
>  			msr->idx = HV_X64_MSR_TSC_FREQUENCY;
>  			msr->write = 0;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  		case 32:
>  			feat->eax |= HV_ACCESS_FREQUENCY_MSRS;
>  			msr->idx = HV_X64_MSR_TSC_FREQUENCY;
>  			msr->write = 0;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  		case 33:
>  			/* Read only */
>  			msr->idx = HV_X64_MSR_TSC_FREQUENCY;
>  			msr->write = 1;
>  			msr->write_val = 1;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  
>  		case 34:
>  			msr->idx = HV_X64_MSR_REENLIGHTENMENT_CONTROL;
>  			msr->write = 0;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  		case 35:
>  			feat->eax |= HV_ACCESS_REENLIGHTENMENT;
>  			msr->idx = HV_X64_MSR_REENLIGHTENMENT_CONTROL;
>  			msr->write = 0;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  		case 36:
>  			msr->idx = HV_X64_MSR_REENLIGHTENMENT_CONTROL;
>  			msr->write = 1;
>  			msr->write_val = 1;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  		case 37:
>  			/* Can only write '0' */
>  			msr->idx = HV_X64_MSR_TSC_EMULATION_STATUS;
>  			msr->write = 1;
>  			msr->write_val = 1;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  
>  		case 38:
>  			msr->idx = HV_X64_MSR_CRASH_P0;
>  			msr->write = 0;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  		case 39:
>  			feat->edx |= HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE;
>  			msr->idx = HV_X64_MSR_CRASH_P0;
>  			msr->write = 0;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  		case 40:
>  			msr->idx = HV_X64_MSR_CRASH_P0;
>  			msr->write = 1;
>  			msr->write_val = 1;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  
>  		case 41:
>  			msr->idx = HV_X64_MSR_SYNDBG_STATUS;
>  			msr->write = 0;
> -			msr->available = 0;
> +			msr->should_not_gp = 0;
>  			break;
>  		case 42:
>  			feat->edx |= HV_FEATURE_DEBUG_MSRS_AVAILABLE;
>  			dbg->eax |= HV_X64_SYNDBG_CAP_ALLOW_KERNEL_DEBUGGING;
>  			msr->idx = HV_X64_MSR_SYNDBG_STATUS;
>  			msr->write = 0;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  		case 43:
>  			msr->idx = HV_X64_MSR_SYNDBG_STATUS;
>  			msr->write = 1;
>  			msr->write_val = 0;
> -			msr->available = 1;
> +			msr->should_not_gp = 1;
>  			break;
>  
>  		case 44:


Thanks,

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/3] KVM: x86: Hyper-V invariant TSC control
  2022-08-31  8:50 ` [PATCH v2 1/3] KVM: x86: Hyper-V invariant TSC control Vitaly Kuznetsov
  2022-08-31 10:43   ` Maxim Levitsky
@ 2022-09-12 13:51   ` Sean Christopherson
  2022-09-13  9:51     ` Vitaly Kuznetsov
  1 sibling, 1 reply; 12+ messages in thread
From: Sean Christopherson @ 2022-09-12 13:51 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Maxim Levitsky, linux-hyperv, linux-kernel

On Wed, Aug 31, 2022, Vitaly Kuznetsov wrote:
> Normally, genuine Hyper-V doesn't expose architectural invariant TSC
> (CPUID.80000007H:EDX[8]) to its guests by default. A special PV MSR
> (HV_X64_MSR_TSC_INVARIANT_CONTROL, 0x40000118) and corresponding CPUID
> feature bit (CPUID.0x40000003.EAX[15]) were introduced. When bit 0 of the
> PV MSR is set, invariant TSC bit starts to show up in CPUID. When the
> feature is exposed to Hyper-V guests, reenlightenment becomes unneeded.
> 
> Add the feature to KVM. Keep CPUID output intact when the feature
> wasn't exposed to L1 and implement the required logic for hiding
> invariant TSC when the feature was exposed and invariant TSC control
> MSR wasn't written to. Copy genuine Hyper-V behavior and forbid
> disabling the feature once it has been enabled.
> 
> For reference, for Linux guests, support for the feature was added
> in commit dce7cd62754b ("x86/hyperv: Allow guests to enable InvariantTSC").
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/cpuid.c            |  7 +++++++
>  arch/x86/kvm/hyperv.c           | 19 +++++++++++++++++++
>  arch/x86/kvm/hyperv.h           | 15 +++++++++++++++
>  arch/x86/kvm/x86.c              |  4 +++-
>  5 files changed, 45 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 2c96c43c313a..9098187e13aa 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1021,6 +1021,7 @@ struct kvm_hv {
>  	u64 hv_reenlightenment_control;
>  	u64 hv_tsc_emulation_control;
>  	u64 hv_tsc_emulation_status;
> +	u64 hv_invtsc;

For consistency with the other fields, should this be hv_tsc_invariant_control?
>  
>  	/* How many vCPUs have VP index != vCPU index */
>  	atomic_t num_mismatched_vp_indexes;
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 75dcf7a72605..8ccd45fd66a9 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -1444,6 +1444,13 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>  			    (data & TSX_CTRL_CPUID_CLEAR))
>  				*ebx &= ~(F(RTM) | F(HLE));
>  		}
> +		/*
> +		 * Filter out invariant TSC (CPUID.80000007H:EDX[8]) for Hyper-V
> +		 * guests if needed.
> +		 */
> +		if (function == 0x80000007 && kvm_hv_invtsc_filtered(vcpu))

This can be an else-if.  Kinda weird, but it could be written as

		else if (function == 0x80000007) {
			if (kvm_hv_invtsc_filtered(vcpu))
				*edx &= ~SF(CONSTANT_TSC);
		}

to make it a pure function+index check.

> +			*edx &= ~(1 << 8);

Ugh, scattered.  Can you add a kvm_only_cpuid_leafs entry so that the bit doesn't
have to be open coded?

> +
>  	} else {
>  		*eax = *ebx = *ecx = *edx = 0;
>  		/*
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index ed804447589c..df90cd7501b9 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -991,6 +991,7 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
>  	case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
>  	case HV_X64_MSR_TSC_EMULATION_CONTROL:
>  	case HV_X64_MSR_TSC_EMULATION_STATUS:
> +	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>  	case HV_X64_MSR_SYNDBG_OPTIONS:
>  	case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
>  		r = true;
> @@ -1275,6 +1276,9 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv *hv_vcpu, u32 msr)
>  	case HV_X64_MSR_TSC_EMULATION_STATUS:
>  		return hv_vcpu->cpuid_cache.features_eax &
>  			HV_ACCESS_REENLIGHTENMENT;
> +	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
> +		return hv_vcpu->cpuid_cache.features_eax &
> +			HV_ACCESS_TSC_INVARIANT;
>  	case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
>  	case HV_X64_MSR_CRASH_CTL:
>  		return hv_vcpu->cpuid_cache.features_edx &
> @@ -1402,6 +1406,17 @@ static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
>  		if (!host)
>  			return 1;
>  		break;
> +	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
> +		/* Only bit 0 is supported */
> +		if (data & ~BIT_ULL(0))

Can a #define be added instead of open coding bit 0?

> +			return 1;
> +

Doesn't the host CPUID need to be honored on writes from the guest?

> +		/* The feature can't be disabled from the guest */
> +		if (!host && hv->hv_invtsc && !data)
> +			return 1;
> +
> +		hv->hv_invtsc = data;
> +		break;
>  	case HV_X64_MSR_SYNDBG_OPTIONS:
>  	case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
>  		return syndbg_set_msr(vcpu, msr, data, host);
> @@ -1577,6 +1592,9 @@ static int kvm_hv_get_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
>  	case HV_X64_MSR_TSC_EMULATION_STATUS:
>  		data = hv->hv_tsc_emulation_status;
>  		break;
> +	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
> +		data = hv->hv_invtsc;
> +		break;
>  	case HV_X64_MSR_SYNDBG_OPTIONS:
>  	case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
>  		return syndbg_get_msr(vcpu, msr, pdata, host);
> @@ -2497,6 +2515,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>  			ent->eax |= HV_MSR_REFERENCE_TSC_AVAILABLE;
>  			ent->eax |= HV_ACCESS_FREQUENCY_MSRS;
>  			ent->eax |= HV_ACCESS_REENLIGHTENMENT;
> +			ent->eax |= HV_ACCESS_TSC_INVARIANT;
>  
>  			ent->ebx |= HV_POST_MESSAGES;
>  			ent->ebx |= HV_SIGNAL_EVENTS;
> diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
> index da2737f2a956..1a6316ab55eb 100644
> --- a/arch/x86/kvm/hyperv.h
> +++ b/arch/x86/kvm/hyperv.h
> @@ -133,6 +133,21 @@ static inline bool kvm_hv_has_stimer_pending(struct kvm_vcpu *vcpu)
>  			     HV_SYNIC_STIMER_COUNT);
>  }
>  
> +/*
> + * With HV_ACCESS_TSC_INVARIANT feature, invariant TSC (CPUID.80000007H:EDX[8])
> + * is only observed after HV_X64_MSR_TSC_INVARIANT_CONTROL was written to.
> + */
> +static inline bool kvm_hv_invtsc_filtered(struct kvm_vcpu *vcpu)

Can this be more strongly worded, e.g. maybe kvm_hv_is_invtsc_disabled()?  "Filtered"
doesn't strictly mean disabled and makes it sound like there's something else that
needs to act on the "filtering"

> +{
> +	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
> +	struct kvm_hv *hv = to_kvm_hv(vcpu->kvm);
> +
> +	if (hv_vcpu && hv_vcpu->cpuid_cache.features_eax & HV_ACCESS_TSC_INVARIANT)

Ah, I almost missed the inner check.  Can you write this as:

	if (!hv_vcpu)
		return false;

so that the potentially positive/happy path is at the end?  I.e. follow the common
pattern of:

	if (!something)
		return -ERRNO;

	return 0;

> +		return !hv->hv_invtsc;

Kinda silly, but I think it's worth checking the exact bit here.  I don't see how
the TSC can get more invariant, but if another bit is added, this could silently
break.  And probably no need to grab to_kvm_hv() locally.

	return !(to_kvm_hv(vcpu->kvm)->hv_invtsc & BIT(0));


> +
> +	return false;

Shouldn't this be "return true" if HyperV is enabled but doesn't have the CPUID
bit set?  I assume the expectation is that host userspace won't set the common
INVTSC flag without also setting HV_ACCESS_TSC_INVARIANT, but it's confusing logic
as is.

All in all, I think this?

	if (!hv_vcpu)
		return false;

	return hv_vcpu->cpuid_cache.features_eax & HV_ACCESS_TSC_INVARIANT &&
	       !(to_kvm_hv(vcpu->kvm)->hv_invtsc & BIT(0));

> +}
> +
>  void kvm_hv_process_stimers(struct kvm_vcpu *vcpu);
>  
>  void kvm_hv_setup_tsc_page(struct kvm *kvm,
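[Editor's note: pulling the review comments above together, the helper's intended behavior — return true when HV_ACCESS_TSC_INVARIANT is exposed but bit 0 of the control MSR has not been set — can be sketched as standalone C. The KVM structures are stubbed with minimal stand-ins; only the feature-bit/MSR-bit logic mirrors the thread, everything else is illustrative:]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1ULL << (n))
#define HV_ACCESS_TSC_INVARIANT BIT(15)   /* CPUID.0x40000003.EAX[15] */

/* Minimal stand-ins for the real KVM structures. */
struct kvm_hv { uint64_t hv_invtsc; };
struct kvm_vcpu_hv { uint32_t features_eax; };
struct vcpu {
	struct kvm_vcpu_hv *hv_vcpu;   /* NULL when Hyper-V emulation is off */
	struct kvm_hv *kvm_hv;
};

/*
 * INVTSC is hidden from CPUID only when the PV feature is exposed to the
 * guest AND the guest has not yet set bit 0 of the control MSR.
 */
static bool kvm_hv_is_invtsc_disabled(struct vcpu *vcpu)
{
	if (!vcpu->hv_vcpu)
		return false;

	return (vcpu->hv_vcpu->features_eax & HV_ACCESS_TSC_INVARIANT) &&
	       !(vcpu->kvm_hv->hv_invtsc & BIT(0));
}
```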

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/3] KVM: selftests: Rename 'msr->availble' to 'msr->should_not_gp' in hyperv_features test
  2022-08-31  8:50 ` [PATCH v2 2/3] KVM: selftests: Rename 'msr->availble' to 'msr->should_not_gp' in hyperv_features test Vitaly Kuznetsov
  2022-08-31 10:44   ` Maxim Levitsky
@ 2022-09-12 13:59   ` Sean Christopherson
  2022-09-13  9:52     ` Vitaly Kuznetsov
  1 sibling, 1 reply; 12+ messages in thread
From: Sean Christopherson @ 2022-09-12 13:59 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Maxim Levitsky, linux-hyperv, linux-kernel

On Wed, Aug 31, 2022, Vitaly Kuznetsov wrote:
> It may not be clear what 'msr->available' means. The test actually
> checks that accessing the particular MSR doesn't cause a #GP; rename
> the variable accordingly.
> 
> Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  .../selftests/kvm/x86_64/hyperv_features.c    | 92 +++++++++----------
>  1 file changed, 46 insertions(+), 46 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_features.c b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
> index 79ab0152d281..4ec4776662a4 100644
> --- a/tools/testing/selftests/kvm/x86_64/hyperv_features.c
> +++ b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
> @@ -33,7 +33,7 @@ static inline uint8_t hypercall(u64 control, vm_vaddr_t input_address,
>  
>  struct msr_data {
>  	uint32_t idx;
> -	bool available;
> +	bool should_not_gp;

I agree that "available" is a bit inscrutable, but "should_not_gp" is also odd.

What about inverting it to?

	bool gp_expected;

or maybe even just

	bool fault_expected;

and letting the assert define which vector is expected.
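[Editor's note: the inverted flag proposed here can be sketched in standalone C — the selftest scaffolding is stubbed out, `fake_msr_access()` is purely illustrative, and only the `fault_expected` assert pattern reflects the suggestion:]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GP_VECTOR 13

struct msr_data {
	uint32_t idx;
	bool fault_expected;   /* inverted sense vs. the old 'available' */
	bool write;
	uint64_t write_val;
};

/* Stand-in for rdmsr_safe()/wrmsr_safe(): returns the faulting vector, 0 on success. */
static uint8_t fake_msr_access(const struct msr_data *msr)
{
	/* Pretend only MSR 0x40000118 is exposed to the guest. */
	return msr->idx == 0x40000118 ? 0 : GP_VECTOR;
}

static void check_msr(const struct msr_data *msr)
{
	uint8_t vector = fake_msr_access(msr);

	/* The assert itself defines which vector is expected. */
	if (msr->fault_expected)
		assert(vector == GP_VECTOR);
	else
		assert(vector == 0);
}
```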

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 3/3] KVM: selftests: Test Hyper-V invariant TSC control
  2022-08-31  8:50 ` [PATCH v2 3/3] KVM: selftests: Test Hyper-V invariant TSC control Vitaly Kuznetsov
@ 2022-09-12 14:29   ` Sean Christopherson
  0 siblings, 0 replies; 12+ messages in thread
From: Sean Christopherson @ 2022-09-12 14:29 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Maxim Levitsky, linux-hyperv, linux-kernel

On Wed, Aug 31, 2022, Vitaly Kuznetsov wrote:
> Add a test for the newly introduced Hyper-V invariant TSC control feature:
> - HV_X64_MSR_TSC_INVARIANT_CONTROL is not available without
>  HV_ACCESS_TSC_INVARIANT CPUID bit set and available with it.
> - BIT(0) of HV_X64_MSR_TSC_INVARIANT_CONTROL controls the filtering of
> architectural invariant TSC (CPUID.80000007H:EDX[8]) bit.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  .../selftests/kvm/x86_64/hyperv_features.c    | 67 +++++++++++++++++++
>  1 file changed, 67 insertions(+)
> 
> diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_features.c b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
> index 4ec4776662a4..26e8c5f7677e 100644
> --- a/tools/testing/selftests/kvm/x86_64/hyperv_features.c
> +++ b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
> @@ -15,6 +15,9 @@
>  
>  #define LINUX_OS_ID ((u64)0x8100 << 48)
>  
> +/* CPUID.80000007H:EDX */
> +#define X86_FEATURE_INVTSC (1 << 8)
> +
>  static inline uint8_t hypercall(u64 control, vm_vaddr_t input_address,
>  				vm_vaddr_t output_address, uint64_t *hv_status)
>  {
> @@ -60,6 +63,24 @@ static void guest_msr(struct msr_data *msr)
>  		GUEST_ASSERT_2(!vector, msr->idx, vector);
>  	else
>  		GUEST_ASSERT_2(vector == GP_VECTOR, msr->idx, vector);
> +
> +	/* Invariant TSC bit appears when TSC invariant control MSR is written to */
> +	if (msr->idx == HV_X64_MSR_TSC_INVARIANT_CONTROL) {
> +		u32 eax, ebx, ecx, edx;
> +
> +		cpuid(0x80000007, &eax, &ebx, &ecx, &edx);

Add a proper kvm_x86_cpu_feature so that this is simply

		this_cpu_has(X86_FEATURE_INVTSC)

> +
> +		/*
> +		 * TSC invariant bit is present without the feature (legacy) or
> +		 * when the feature is present and enabled.
> +		 */
> +		if ((!msr->should_not_gp && !msr->write) || (msr->write && msr->write_val == 1))

Relying purely on the inputs is rather nasty as it creates a subtle dependency
on the "write 1" testcase coming last.  This function already reads the guest
MSR value, just use that to check if INVTSC should be enabled.  And if we want
to verify KVM "wrote" the correct value, then that can be done in the common
path.

And I think that will make this code self-documenting, e.g.

	if (msr->idx == HV_X64_MSR_TSC_INVARIANT_CONTROL)
		GUEST_ASSERT(this_cpu_has(X86_FEATURE_INVTSC) ==
			     !!(msr_val & ...));

> +			GUEST_ASSERT(edx & X86_FEATURE_INVTSC);
> +		else
> +			GUEST_ASSERT(!(edx & X86_FEATURE_INVTSC));
> +	}
> +
> +
>  	GUEST_DONE();
>  }
>  
> @@ -104,6 +125,15 @@ static void vcpu_reset_hv_cpuid(struct kvm_vcpu *vcpu)
>  	vcpu_clear_cpuid_entry(vcpu, HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES);
>  }
>  
> +static bool guest_has_invtsc(void)
> +{
> +	const struct kvm_cpuid_entry2 *cpuid;
> +
> +	cpuid = kvm_get_supported_cpuid_entry(0x80000007);
> +
> +	return cpuid->edx & X86_FEATURE_INVTSC;
> +}
> +
>  static void guest_test_msrs_access(void)
>  {
>  	struct kvm_cpuid2 *prev_cpuid = NULL;
> @@ -115,6 +145,7 @@ static void guest_test_msrs_access(void)
>  	int stage = 0;
>  	vm_vaddr_t msr_gva;
>  	struct msr_data *msr;
> +	bool has_invtsc = guest_has_invtsc();

Huh, I never added vcpu_has_cpuid_feature()?  Can you add that instead of
open-coding the check?
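[Editor's note: the `this_cpu_has()` pattern requested above can be mocked up as a self-contained sketch — the tuple layout loosely follows the selftests' `kvm_x86_cpu_feature` idea, and the fake CPUID function is illustrative only:]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Minimal stand-in for the selftests' CPUID feature tuple. */
struct cpu_feature {
	uint32_t function;
	uint32_t reg;      /* 0 = EAX, 1 = EBX, 2 = ECX, 3 = EDX */
	uint32_t bit;
};

/* Invariant TSC: CPUID.80000007H:EDX[8]. */
#define X86_FEATURE_INVTSC \
	((struct cpu_feature){ .function = 0x80000007, .reg = 3, .bit = 8 })

/* Fake CPUID backend: pretend leaf 0x80000007 reports EDX bit 8 set. */
static uint32_t fake_cpuid_reg(uint32_t function, uint32_t reg)
{
	if (function == 0x80000007 && reg == 3)
		return 1u << 8;
	return 0;
}

static bool this_cpu_has(struct cpu_feature f)
{
	return fake_cpuid_reg(f.function, f.reg) & (1u << f.bit);
}
```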

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/3] KVM: x86: Hyper-V invariant TSC control
  2022-09-12 13:51   ` Sean Christopherson
@ 2022-09-13  9:51     ` Vitaly Kuznetsov
  2022-09-14  7:48       ` Sean Christopherson
  0 siblings, 1 reply; 12+ messages in thread
From: Vitaly Kuznetsov @ 2022-09-13  9:51 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Maxim Levitsky, linux-hyperv, linux-kernel

Sean Christopherson <seanjc@google.com> writes:

> On Wed, Aug 31, 2022, Vitaly Kuznetsov wrote:
>> Normally, genuine Hyper-V doesn't expose architectural invariant TSC
>> (CPUID.80000007H:EDX[8]) to its guests by default. A special PV MSR
>> (HV_X64_MSR_TSC_INVARIANT_CONTROL, 0x40000118) and corresponding CPUID
>> feature bit (CPUID.0x40000003.EAX[15]) were introduced. When bit 0 of the
>> PV MSR is set, invariant TSC bit starts to show up in CPUID. When the
>> feature is exposed to Hyper-V guests, reenlightenment becomes unneeded.
>> 
>> Add the feature to KVM. Keep CPUID output intact when the feature
>> wasn't exposed to L1 and implement the required logic for hiding
>> invariant TSC when the feature was exposed and invariant TSC control
>> MSR wasn't written to. Copy genuine Hyper-V behavior and forbid
>> disabling the feature once it has been enabled.
>> 
>> For reference, for Linux guests, support for the feature was added
>> in commit dce7cd62754b ("x86/hyperv: Allow guests to enable InvariantTSC").
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  arch/x86/include/asm/kvm_host.h |  1 +
>>  arch/x86/kvm/cpuid.c            |  7 +++++++
>>  arch/x86/kvm/hyperv.c           | 19 +++++++++++++++++++
>>  arch/x86/kvm/hyperv.h           | 15 +++++++++++++++
>>  arch/x86/kvm/x86.c              |  4 +++-
>>  5 files changed, 45 insertions(+), 1 deletion(-)
>> 
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 2c96c43c313a..9098187e13aa 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -1021,6 +1021,7 @@ struct kvm_hv {
>>  	u64 hv_reenlightenment_control;
>>  	u64 hv_tsc_emulation_control;
>>  	u64 hv_tsc_emulation_status;
>> +	u64 hv_invtsc;
>
> For consistency with the other fields, should this be hv_tsc_invariant_control?

Yep.

>>  
>>  	/* How many vCPUs have VP index != vCPU index */
>>  	atomic_t num_mismatched_vp_indexes;
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index 75dcf7a72605..8ccd45fd66a9 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -1444,6 +1444,13 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>>  			    (data & TSX_CTRL_CPUID_CLEAR))
>>  				*ebx &= ~(F(RTM) | F(HLE));
>>  		}
>> +		/*
>> +		 * Filter out invariant TSC (CPUID.80000007H:EDX[8]) for Hyper-V
>> +		 * guests if needed.
>> +		 */
>> +		if (function == 0x80000007 && kvm_hv_invtsc_filtered(vcpu))
>
> This can be an else-if.  Kinda weird, but it could be written as
>
> 		else if (function == 0x80000007) {
> 			if (kvm_hv_invtsc_filtered(vcpu))
> 				*edx &= ~SF(CONSTANT_TSC);
> 		}
>
> to make it a pure function+index check.
>
>> +			*edx &= ~(1 << 8);
>
> Ugh, scattered.  Can you add a kvm_only_cpuid_leafs entry so that the bit doesn't
> have to be open coded?

Sure.

>
>> +
>>  	} else {
>>  		*eax = *ebx = *ecx = *edx = 0;
>>  		/*
>> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
>> index ed804447589c..df90cd7501b9 100644
>> --- a/arch/x86/kvm/hyperv.c
>> +++ b/arch/x86/kvm/hyperv.c
>> @@ -991,6 +991,7 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
>>  	case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
>>  	case HV_X64_MSR_TSC_EMULATION_CONTROL:
>>  	case HV_X64_MSR_TSC_EMULATION_STATUS:
>> +	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>>  	case HV_X64_MSR_SYNDBG_OPTIONS:
>>  	case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
>>  		r = true;
>> @@ -1275,6 +1276,9 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv *hv_vcpu, u32 msr)
>>  	case HV_X64_MSR_TSC_EMULATION_STATUS:
>>  		return hv_vcpu->cpuid_cache.features_eax &
>>  			HV_ACCESS_REENLIGHTENMENT;
>> +	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>> +		return hv_vcpu->cpuid_cache.features_eax &
>> +			HV_ACCESS_TSC_INVARIANT;
>>  	case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
>>  	case HV_X64_MSR_CRASH_CTL:
>>  		return hv_vcpu->cpuid_cache.features_edx &
>> @@ -1402,6 +1406,17 @@ static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
>>  		if (!host)
>>  			return 1;
>>  		break;
>> +	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>> +		/* Only bit 0 is supported */
>> +		if (data & ~BIT_ULL(0))
>
> Can a #define be added instead of open coding bit 0?
>

Yes, and then we can avoid open coding it in Linux-on-Hyper-V code too
as it looks like

arch/x86/kernel/cpu/mshyperv.c:         wrmsrl(HV_X64_MSR_TSC_INVARIANT_CONTROL, 0x1);

today.
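[Editor's note: a self-contained sketch of what the requested #define and the write-side checks could look like — the macro name is illustrative rather than whatever was ultimately merged, and the function is a stand-alone mock of the MSR handler's validation, not kernel code:]

```c
#include <assert.h>
#include <stdint.h>

#define BIT_ULL(n) (1ULL << (n))
/* Bit 0 of HV_X64_MSR_TSC_INVARIANT_CONTROL; name is illustrative. */
#define HV_EXPOSE_INVARIANT_TSC BIT_ULL(0)

/*
 * Write-side validation mirroring the patch: reject reserved bits and
 * forbid the guest from clearing the bit once it has been set.
 * Returns 1 to signal #GP, 0 on success.
 */
static int hv_set_tsc_invariant_control(uint64_t *state, uint64_t data, int host)
{
	if (data & ~HV_EXPOSE_INVARIANT_TSC)
		return 1;            /* reserved bit set */
	if (!host && *state && !data)
		return 1;            /* guest can't disable the feature */
	*state = data;
	return 0;
}
```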


>> +			return 1;
>> +
>
> Doesn't the host CPUID need to be honored on writes from the guest?
>

You mean INVTSC itself (CPUID.80000007H:EDX[8])? That's a good
question. Genuine Hyper-V will never expose HV_ACCESS_TSC_INVARIANT
without it but a misbehaving KVM VMM can. In case we treat the feature
as a 'filter' only, we don't need to check for the architectural bit.

>> +		/* The feature can't be disabled from the guest */
>> +		if (!host && hv->hv_invtsc && !data)
>> +			return 1;
>> +
>> +		hv->hv_invtsc = data;
>> +		break;
>>  	case HV_X64_MSR_SYNDBG_OPTIONS:
>>  	case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
>>  		return syndbg_set_msr(vcpu, msr, data, host);
>> @@ -1577,6 +1592,9 @@ static int kvm_hv_get_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
>>  	case HV_X64_MSR_TSC_EMULATION_STATUS:
>>  		data = hv->hv_tsc_emulation_status;
>>  		break;
>> +	case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>> +		data = hv->hv_invtsc;
>> +		break;
>>  	case HV_X64_MSR_SYNDBG_OPTIONS:
>>  	case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
>>  		return syndbg_get_msr(vcpu, msr, pdata, host);
>> @@ -2497,6 +2515,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>>  			ent->eax |= HV_MSR_REFERENCE_TSC_AVAILABLE;
>>  			ent->eax |= HV_ACCESS_FREQUENCY_MSRS;
>>  			ent->eax |= HV_ACCESS_REENLIGHTENMENT;
>> +			ent->eax |= HV_ACCESS_TSC_INVARIANT;
>>  
>>  			ent->ebx |= HV_POST_MESSAGES;
>>  			ent->ebx |= HV_SIGNAL_EVENTS;
>> diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
>> index da2737f2a956..1a6316ab55eb 100644
>> --- a/arch/x86/kvm/hyperv.h
>> +++ b/arch/x86/kvm/hyperv.h
>> @@ -133,6 +133,21 @@ static inline bool kvm_hv_has_stimer_pending(struct kvm_vcpu *vcpu)
>>  			     HV_SYNIC_STIMER_COUNT);
>>  }
>>  
>> +/*
>> + * With HV_ACCESS_TSC_INVARIANT feature, invariant TSC (CPUID.80000007H:EDX[8])
>> + * is only observed after HV_X64_MSR_TSC_INVARIANT_CONTROL was written to.
>> + */
>> +static inline bool kvm_hv_invtsc_filtered(struct kvm_vcpu *vcpu)
>
> Can this be more strongly worded, e.g. maybe kvm_hv_is_invtsc_disabled()?  "Filtered"
> doesn't strictly mean disabled and makes it sound like there's something else that
> needs to act on the "filtering"
>

"Hidden"? :-) I'm OK with kvm_hv_is_invtsc_disabled() too.

>> +{
>> +	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
>> +	struct kvm_hv *hv = to_kvm_hv(vcpu->kvm);
>> +
>> +	if (hv_vcpu && hv_vcpu->cpuid_cache.features_eax & HV_ACCESS_TSC_INVARIANT)
>
> Ah, I almost missed the inner check.  Can you write this as:
>
> 	if (!hv_vcpu)
> 		return false;
>
> so that the potentially positive/happy path is at the end?  I.e. follow the common
> pattern of:
>
> 	if (!something)
> 		return -ERRNO;
>
> 	return 0;
>

Sure.

>> +		return !hv->hv_invtsc;
>
> Kinda silly, but I think it's worth checking the exact bit here.  I don't see how
> the TSC can get more invariant, but if another bit is added, this could silently
> break.  And probably no need to grab to_kvm_hv() locally.
>
> 	return !(to_kvm_hv(vcpu->kvm)->hv_invtsc & BIT(0));
>

Sure.

>
>> +
>> +	return false;
>
> Shouldn't this be "return true" if HyperV is enabled but doesn't have the CPUID
> bit set?  I assume the expectation is that host userspace won't set the common
> INVTSC flag without also setting HV_ACCESS_TSC_INVARIANT, but it's confusing logic
> as is.
>
> All in all, I think this?
>
> 	if (!hv_vcpu)
> 		return false;
>
> 	return hv_vcpu->cpuid_cache.features_eax & HV_ACCESS_TSC_INVARIANT &&
> 	       !(to_kvm_hv(vcpu->kvm)->hv_invtsc & BIT(0));
>

Actually yes, there might be some configurations out there which expose
INVTSC to Hyper-V guests without this new PV feature, no need to break them.

>> +}
>> +
>>  void kvm_hv_process_stimers(struct kvm_vcpu *vcpu);
>>  
>>  void kvm_hv_setup_tsc_page(struct kvm *kvm,
>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/3] KVM: selftests: Rename 'msr->availble' to 'msr->should_not_gp' in hyperv_features test
  2022-09-12 13:59   ` Sean Christopherson
@ 2022-09-13  9:52     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 12+ messages in thread
From: Vitaly Kuznetsov @ 2022-09-13  9:52 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Maxim Levitsky, linux-hyperv, linux-kernel

Sean Christopherson <seanjc@google.com> writes:

> On Wed, Aug 31, 2022, Vitaly Kuznetsov wrote:
>> It may not be clear what 'msr->available' means. The test actually
>> checks that accessing the particular MSR doesn't cause a #GP; rename
>> the variable accordingly.
>> 
>> Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  .../selftests/kvm/x86_64/hyperv_features.c    | 92 +++++++++----------
>>  1 file changed, 46 insertions(+), 46 deletions(-)
>> 
>> diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_features.c b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
>> index 79ab0152d281..4ec4776662a4 100644
>> --- a/tools/testing/selftests/kvm/x86_64/hyperv_features.c
>> +++ b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
>> @@ -33,7 +33,7 @@ static inline uint8_t hypercall(u64 control, vm_vaddr_t input_address,
>>  
>>  struct msr_data {
>>  	uint32_t idx;
>> -	bool available;
>> +	bool should_not_gp;
>
> I agree that "available" is a bit inscrutable, but "should_not_gp" is also odd.
>

I think Max suggested it to reduce the code churn and I silently agreed.

> What about inverting it to?
>
> 	bool gp_expected;
>
> or maybe even just
>
> 	bool fault_expected;
>
> and letting the assert define which vector is expected.
>

This also works, will change.

-- 
Vitaly



* Re: [PATCH v2 1/3] KVM: x86: Hyper-V invariant TSC control
  2022-09-13  9:51     ` Vitaly Kuznetsov
@ 2022-09-14  7:48       ` Sean Christopherson
  0 siblings, 0 replies; 12+ messages in thread
From: Sean Christopherson @ 2022-09-14  7:48 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Maxim Levitsky, linux-hyperv, linux-kernel

On Tue, Sep 13, 2022, Vitaly Kuznetsov wrote:
> Sean Christopherson <seanjc@google.com> writes:
> 
> > On Wed, Aug 31, 2022, Vitaly Kuznetsov wrote:
> >>  
> >> +/*
> >> + * With HV_ACCESS_TSC_INVARIANT feature, invariant TSC (CPUID.80000007H:EDX[8])
> >> + * is only observed after HV_X64_MSR_TSC_INVARIANT_CONTROL was written to.
> >> + */
> >> +static inline bool kvm_hv_invtsc_filtered(struct kvm_vcpu *vcpu)
> >
> > Can this be more strongly worded, e.g. maybe kvm_hv_is_invtsc_disabled()?  "Filtered"
> > doesn't strictly mean disabled and makes it sound like there's something else that
> > needs to act on the "filtering"
> >
> 
> "Hidden"? :-) I'm OK with kvm_hv_is_invtsc_disabled() too.

Hidden works for me.  Or suppressed, inhibited, whatever.  Just not filtered :-)


end of thread, other threads:[~2022-09-14  7:48 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-31  8:50 [PATCH v2 0/3] KVM: x86: Hyper-V invariant TSC control feature Vitaly Kuznetsov
2022-08-31  8:50 ` [PATCH v2 1/3] KVM: x86: Hyper-V invariant TSC control Vitaly Kuznetsov
2022-08-31 10:43   ` Maxim Levitsky
2022-09-12 13:51   ` Sean Christopherson
2022-09-13  9:51     ` Vitaly Kuznetsov
2022-09-14  7:48       ` Sean Christopherson
2022-08-31  8:50 ` [PATCH v2 2/3] KVM: selftests: Rename 'msr->availble' to 'msr->should_not_gp' in hyperv_features test Vitaly Kuznetsov
2022-08-31 10:44   ` Maxim Levitsky
2022-09-12 13:59   ` Sean Christopherson
2022-09-13  9:52     ` Vitaly Kuznetsov
2022-08-31  8:50 ` [PATCH v2 3/3] KVM: selftests: Test Hyper-V invariant TSC control Vitaly Kuznetsov
2022-09-12 14:29   ` Sean Christopherson
