* [PATCH v6 00/10] KVM: x86: Add AMD Guest PerfMonV2 PMU support
From: Like Xu @ 2023-05-30  6:04 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel

Starting with Zen4, the core PMU on AMD platforms such as Genoa and
Ryzen-7000 supports PerfMonV2, and it remains compatible with legacy
PERFCTR_CORE behavior and MSR addresses.

If you don't have access to the hardware specification, the host perf
commits d6d0c7f681fd..7685665c390d also provide a quick overview. The
main change is the addition of three MSRs equivalent to their Intel V2
counterparts, namely global_ctrl, global_status and global_status_clear.

This feature is attractive for reducing PMU virtualization overhead:
multiple MSR accesses to individual counters are replaced by a single
access to the global register, and accuracy also improves when multiple
guest counters are in use.
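
As a rough guest-side illustration (a hypothetical sketch, not part of
this series; wrmsrl() and the MSR name follow the kernel's conventions),
a single write to the global control MSR gates all counters at once:

	/*
	 * Hypothetical guest snippet: globally enable PMC0-PMC3 with one
	 * WRMSR (their event selects are assumed to be programmed already)
	 * instead of toggling four PERF_CTLx enable bits, then stop them
	 * all again with a second write.
	 */
	wrmsrl(MSR_AMD64_PERF_CNTR_GLOBAL_CTL, 0xf);	/* enable PMC0-PMC3 */
	/* ... run the workload being measured ... */
	wrmsrl(MSR_AMD64_PERF_CNTR_GLOBAL_CTL, 0);	/* disable all PMCs */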

All related test cases pass on a Genoa box.
Please feel free to run more tests, add more, or share comments.

Patches 0001-0007 could be applied earlier, which may help reduce
the burden on industrious reviewers.

Previous:
https://lore.kernel.org/kvm/20230214050757.9623-1-likexu@tencent.com/

V5 -> V6 Changelog:
- Introduce pmu_ops->MIN_NR_GP_COUNTERS and WARN_ON_ONCE; (Sean)
- Set MIN_NR_GP_COUNTERS = 2 instead of 1 for Intel Arch PMU;

Like Xu (10):
  KVM: x86/pmu: Expose reprogram_counters() in pmu.h
  KVM: x86/pmu: Return #GP if user sets the GLOBAL_STATUS reserved bits
  KVM: x86/pmu: Make part of the Intel v2 PMU MSRs handling x86 generic
  KVM: x86: Explicitly zero cpuid "0xa" leaf when PMU is disabled
  KVM: x86/pmu: Disable vPMU if the minimum num of counters isn't met
  KVM: x86/pmu: Forget PERFCTR_CORE if the min num of counters isn't met
  KVM: x86/pmu: Constrain the num of guest counters with kvm_pmu_cap
  KVM: x86/cpuid: Add a KVM-only leaf to redirect AMD PerfMonV2 flag
  KVM: x86/svm/pmu: Add AMD PerfMonV2 support
  KVM: x86/cpuid: Add AMD CPUID ExtPerfMonAndDbg leaf 0x80000022

 arch/x86/include/asm/kvm-x86-pmu-ops.h |  1 -
 arch/x86/kvm/cpuid.c                   | 30 +++++++++-
 arch/x86/kvm/pmu.c                     | 83 +++++++++++++++++++++++---
 arch/x86/kvm/pmu.h                     | 43 +++++++++++--
 arch/x86/kvm/reverse_cpuid.h           |  7 +++
 arch/x86/kvm/svm/pmu.c                 | 68 +++++++++++++++------
 arch/x86/kvm/svm/svm.c                 | 19 +++++-
 arch/x86/kvm/vmx/pmu_intel.c           | 33 +++-------
 arch/x86/kvm/x86.c                     | 10 ++++
 9 files changed, 230 insertions(+), 64 deletions(-)


base-commit: b9846a698c9aff4eb2214a06ac83638ad098f33f
-- 
2.40.1



* [PATCH v6 01/10] KVM: x86/pmu: Expose reprogram_counters() in pmu.h
From: Like Xu @ 2023-05-30  6:04 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel

From: Like Xu <likexu@tencent.com>

The reprogram_counters() optimization stands on its own, whereas the
code movement to pmu.h is justified only by the incoming AMD PMU v2
support, which will reuse the helper from common code.

No functional change intended.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/pmu.h           | 12 ++++++++++++
 arch/x86/kvm/vmx/pmu_intel.c | 12 ------------
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 5c7bbf03b599..986563aeeef8 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -201,6 +201,18 @@ static inline void kvm_pmu_request_counter_reprogram(struct kvm_pmc *pmc)
 	kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
 }
 
+static inline void reprogram_counters(struct kvm_pmu *pmu, u64 diff)
+{
+	int bit;
+
+	if (!diff)
+		return;
+
+	for_each_set_bit(bit, (unsigned long *)&diff, X86_PMC_IDX_MAX)
+		set_bit(bit, pmu->reprogram_pmi);
+	kvm_make_request(KVM_REQ_PMU, pmu_to_vcpu(pmu));
+}
+
 void kvm_pmu_deliver_pmi(struct kvm_vcpu *vcpu);
 void kvm_pmu_handle_event(struct kvm_vcpu *vcpu);
 int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned pmc, u64 *data);
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 741efe2c497b..1f9c3e916a21 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -73,18 +73,6 @@ static struct kvm_pmc *intel_pmc_idx_to_pmc(struct kvm_pmu *pmu, int pmc_idx)
 	}
 }
 
-static void reprogram_counters(struct kvm_pmu *pmu, u64 diff)
-{
-	int bit;
-
-	if (!diff)
-		return;
-
-	for_each_set_bit(bit, (unsigned long *)&diff, X86_PMC_IDX_MAX)
-		set_bit(bit, pmu->reprogram_pmi);
-	kvm_make_request(KVM_REQ_PMU, pmu_to_vcpu(pmu));
-}
-
 static bool intel_hw_event_available(struct kvm_pmc *pmc)
 {
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
-- 
2.40.1



* [PATCH v6 02/10] KVM: x86/pmu: Return #GP if user sets the GLOBAL_STATUS reserved bits
From: Like Xu @ 2023-05-30  6:04 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel

From: Like Xu <likexu@tencent.com>

Return #GP if KVM userspace attempts to set reserved bits for the guest.
If userspace sets reserved bits when restoring the MSR_CORE_PERF_GLOBAL_STATUS
register, those bits will be reported back when the guest reads the register
and cannot be cleared from inside the guest, which leaves the guest's PMI
handler very confused.

Note, reusing global_ovf_ctrl_mask as a global_status mask will break if
KVM ever supports higher versions of the Intel arch PMU.
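
As a hedged illustration only (hypothetical VMM-side code, not part of
this patch; the reserved bit chosen below is just an example), a restore
of this MSR with reserved bits set is now rejected by KVM_SET_MSRS:

	#include <linux/kvm.h>
	#include <stdlib.h>
	#include <sys/ioctl.h>

	/* Returns 0 if KVM accepted the value, -1 if the write was refused. */
	static int restore_global_status(int vcpu_fd, __u64 val)
	{
		struct kvm_msrs *msrs;
		int ret;

		msrs = calloc(1, sizeof(*msrs) + sizeof(struct kvm_msr_entry));
		if (!msrs)
			return -1;

		msrs->nmsrs = 1;
		msrs->entries[0].index = 0x38e;	/* MSR_CORE_PERF_GLOBAL_STATUS */
		msrs->entries[0].data = val;	/* e.g. 1ULL << 60 is now refused */

		/* KVM_SET_MSRS returns the number of MSRs actually written. */
		ret = ioctl(vcpu_fd, KVM_SET_MSRS, msrs);
		free(msrs);
		return ret == 1 ? 0 : -1;
	}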

Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/vmx/pmu_intel.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 1f9c3e916a21..343b3182b7f4 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -399,7 +399,11 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			reprogram_fixed_counters(pmu, data);
 		break;
 	case MSR_CORE_PERF_GLOBAL_STATUS:
-		if (!msr_info->host_initiated)
+		/*
+		 * Caution, the assumption here is that some of the bits (such as
+		 * ASCI, CTR_FREEZE, and LBR_FREEZE) are not yet supported by KVM.
+		 */
+		if (!msr_info->host_initiated || (data & pmu->global_ovf_ctrl_mask))
 			return 1; /* RO MSR */
 
 		pmu->global_status = data;
-- 
2.40.1



* [PATCH v6 03/10] KVM: x86/pmu: Make part of the Intel v2 PMU MSRs handling x86 generic
From: Like Xu @ 2023-05-30  6:04 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel

From: Like Xu <likexu@tencent.com>

AMD PerfMonV2 defines three registers similar to part of the Intel v2
PMU registers, namely the GLOBAL_CTRL, GLOBAL_STATUS and GLOBAL_OVF_CTRL
MSRs. For better code reuse, this part of the handling can be extracted
into generic x86 code as a straight code movement.

Specifically, the kvm_pmu_set/get_msr() handling of GLOBAL_STATUS,
GLOBAL_CTRL and GLOBAL_OVF_CTRL defined for Intel is moved to the generic
pmu.c, and the vendor .pmc_is_enabled() callback is removed, both of
which help introduce the AMD PerfMonV2 code later.

The new common pmc_is_globally_enabled() also works for AMD, as the
legacy AMD vPMU version is 1. Note that the vendor-specific
*_is_valid_msr() callbacks continue to be used to avoid cross-vendor
MSR access.

Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/include/asm/kvm-x86-pmu-ops.h |  1 -
 arch/x86/kvm/pmu.c                     | 61 ++++++++++++++++++++++----
 arch/x86/kvm/pmu.h                     | 17 ++++++-
 arch/x86/kvm/svm/pmu.c                 |  9 ----
 arch/x86/kvm/vmx/pmu_intel.c           | 14 +-----
 5 files changed, 70 insertions(+), 32 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-pmu-ops.h b/arch/x86/include/asm/kvm-x86-pmu-ops.h
index c17e3e96fc1d..6c98f4bb4228 100644
--- a/arch/x86/include/asm/kvm-x86-pmu-ops.h
+++ b/arch/x86/include/asm/kvm-x86-pmu-ops.h
@@ -13,7 +13,6 @@ BUILD_BUG_ON(1)
  * at the call sites.
  */
 KVM_X86_PMU_OP(hw_event_available)
-KVM_X86_PMU_OP(pmc_is_enabled)
 KVM_X86_PMU_OP(pmc_idx_to_pmc)
 KVM_X86_PMU_OP(rdpmc_ecx_to_pmc)
 KVM_X86_PMU_OP(msr_idx_to_pmc)
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 1690d41c1830..116bc7fe13c6 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -93,11 +93,6 @@ void kvm_pmu_ops_update(const struct kvm_pmu_ops *pmu_ops)
 #undef __KVM_X86_PMU_OP
 }
 
-static inline bool pmc_is_globally_enabled(struct kvm_pmc *pmc)
-{
-	return static_call(kvm_x86_pmu_pmc_is_enabled)(pmc);
-}
-
 static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
 {
 	struct kvm_pmu *pmu = container_of(irq_work, struct kvm_pmu, irq_work);
@@ -577,13 +572,63 @@ static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
 
 int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
-	return static_call(kvm_x86_pmu_get_msr)(vcpu, msr_info);
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	u32 msr = msr_info->index;
+
+	switch (msr) {
+	case MSR_CORE_PERF_GLOBAL_STATUS:
+		msr_info->data = pmu->global_status;
+		break;
+	case MSR_CORE_PERF_GLOBAL_CTRL:
+		msr_info->data = pmu->global_ctrl;
+		break;
+	case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
+		msr_info->data = 0;
+		break;
+	default:
+		return static_call(kvm_x86_pmu_get_msr)(vcpu, msr_info);
+	}
+
+	return 0;
 }
 
 int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
-	kvm_pmu_mark_pmc_in_use(vcpu, msr_info->index);
-	return static_call(kvm_x86_pmu_set_msr)(vcpu, msr_info);
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	u32 msr = msr_info->index;
+	u64 data = msr_info->data;
+	u64 diff;
+
+	switch (msr) {
+	case MSR_CORE_PERF_GLOBAL_STATUS:
+		if (!msr_info->host_initiated || (data & pmu->global_ovf_ctrl_mask))
+			return 1; /* RO MSR */
+
+		pmu->global_status = data;
+		break;
+	case MSR_CORE_PERF_GLOBAL_CTRL:
+		if (!kvm_valid_perf_global_ctrl(pmu, data))
+			return 1;
+
+		if (pmu->global_ctrl != data) {
+			diff = pmu->global_ctrl ^ data;
+			pmu->global_ctrl = data;
+			reprogram_counters(pmu, diff);
+		}
+		break;
+	case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
+		if (data & pmu->global_ovf_ctrl_mask)
+			return 1;
+
+		if (!msr_info->host_initiated)
+			pmu->global_status &= ~data;
+		break;
+	default:
+		kvm_pmu_mark_pmc_in_use(vcpu, msr_info->index);
+		return static_call(kvm_x86_pmu_set_msr)(vcpu, msr_info);
+	}
+
+	return 0;
 }
 
 /* refresh PMU settings. This function generally is called when underlying
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 986563aeeef8..dd7c7d4ffe3b 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -20,7 +20,6 @@
 
 struct kvm_pmu_ops {
 	bool (*hw_event_available)(struct kvm_pmc *pmc);
-	bool (*pmc_is_enabled)(struct kvm_pmc *pmc);
 	struct kvm_pmc *(*pmc_idx_to_pmc)(struct kvm_pmu *pmu, int pmc_idx);
 	struct kvm_pmc *(*rdpmc_ecx_to_pmc)(struct kvm_vcpu *vcpu,
 		unsigned int idx, u64 *mask);
@@ -213,6 +212,22 @@ static inline void reprogram_counters(struct kvm_pmu *pmu, u64 diff)
 	kvm_make_request(KVM_REQ_PMU, pmu_to_vcpu(pmu));
 }
 
+/*
+ * Check if a PMC is enabled by comparing it against global_ctrl bits.
+ *
+ * If the current version of vPMU doesn't have global_ctrl MSR,
+ * all vPMCs are enabled (return TRUE).
+ */
+static inline bool pmc_is_globally_enabled(struct kvm_pmc *pmc)
+{
+	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
+
+	if (pmu->version < 2)
+		return true;
+
+	return test_bit(pmc->idx, (unsigned long *)&pmu->global_ctrl);
+}
+
 void kvm_pmu_deliver_pmi(struct kvm_vcpu *vcpu);
 void kvm_pmu_handle_event(struct kvm_vcpu *vcpu);
 int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned pmc, u64 *data);
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 5fa939e411d8..70143275e0a7 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -78,14 +78,6 @@ static bool amd_hw_event_available(struct kvm_pmc *pmc)
 	return true;
 }
 
-/* check if a PMC is enabled by comparing it against global_ctrl bits. Because
- * AMD CPU doesn't have global_ctrl MSR, all PMCs are enabled (return TRUE).
- */
-static bool amd_pmc_is_enabled(struct kvm_pmc *pmc)
-{
-	return true;
-}
-
 static bool amd_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -220,7 +212,6 @@ static void amd_pmu_reset(struct kvm_vcpu *vcpu)
 
 struct kvm_pmu_ops amd_pmu_ops __initdata = {
 	.hw_event_available = amd_hw_event_available,
-	.pmc_is_enabled = amd_pmc_is_enabled,
 	.pmc_idx_to_pmc = amd_pmc_idx_to_pmc,
 	.rdpmc_ecx_to_pmc = amd_rdpmc_ecx_to_pmc,
 	.msr_idx_to_pmc = amd_msr_idx_to_pmc,
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 343b3182b7f4..99d07ccb1869 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -95,17 +95,6 @@ static bool intel_hw_event_available(struct kvm_pmc *pmc)
 	return true;
 }
 
-/* check if a PMC is enabled by comparing it with globl_ctrl bits. */
-static bool intel_pmc_is_enabled(struct kvm_pmc *pmc)
-{
-	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
-
-	if (!intel_pmu_has_perf_global_ctrl(pmu))
-		return true;
-
-	return test_bit(pmc->idx, (unsigned long *)&pmu->global_ctrl);
-}
-
 static bool intel_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -793,7 +782,7 @@ void intel_pmu_cross_mapped_check(struct kvm_pmu *pmu)
 		pmc = intel_pmc_idx_to_pmc(pmu, bit);
 
 		if (!pmc || !pmc_speculative_in_use(pmc) ||
-		    !intel_pmc_is_enabled(pmc) || !pmc->perf_event)
+		    !pmc_is_globally_enabled(pmc) || !pmc->perf_event)
 			continue;
 
 		/*
@@ -808,7 +797,6 @@ void intel_pmu_cross_mapped_check(struct kvm_pmu *pmu)
 
 struct kvm_pmu_ops intel_pmu_ops __initdata = {
 	.hw_event_available = intel_hw_event_available,
-	.pmc_is_enabled = intel_pmc_is_enabled,
 	.pmc_idx_to_pmc = intel_pmc_idx_to_pmc,
 	.rdpmc_ecx_to_pmc = intel_rdpmc_ecx_to_pmc,
 	.msr_idx_to_pmc = intel_msr_idx_to_pmc,
-- 
2.40.1



* [PATCH v6 04/10] KVM: x86: Explicitly zero cpuid "0xa" leaf when PMU is disabled
From: Like Xu @ 2023-05-30  6:04 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel

From: Like Xu <likexu@tencent.com>

Add an explicit !enable_pmu check, as relying on kvm_pmu_cap being zeroed
isn't obvious, even though KVM does zero kvm_pmu_cap when !enable_pmu so
that the subsequent CPUID leaf assignments would yield zeros anyway.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/cpuid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 0c9660a07b23..61bc71882f07 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -948,7 +948,7 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 		union cpuid10_eax eax;
 		union cpuid10_edx edx;
 
-		if (!static_cpu_has(X86_FEATURE_ARCH_PERFMON)) {
+		if (!enable_pmu || !static_cpu_has(X86_FEATURE_ARCH_PERFMON)) {
 			entry->eax = entry->ebx = entry->ecx = entry->edx = 0;
 			break;
 		}
-- 
2.40.1



* [PATCH v6 05/10] KVM: x86/pmu: Disable vPMU if the minimum num of counters isn't met
From: Like Xu @ 2023-05-30  6:04 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel, Jim Mattson

From: Like Xu <likexu@tencent.com>

Disable PMU support when running on AMD and perf reports fewer than four
general purpose counters. All AMD PMUs must define at least four counters
due to AMD's legacy architecture hardcoding the number of counters
without providing a way to enumerate the number of counters to software,
e.g. from AMD's APM:

 The legacy architecture defines four performance counters (PerfCtrn)
 and corresponding event-select registers (PerfEvtSeln).

Virtualizing fewer than four counters can lead to guest instability as
software expects four counters to be available. Rather than bleed AMD
details into the common code, just define a const unsigned int and
provide a convenient location to document why Intel and AMD have different
mins (in particular, AMD's lack of any way to enumerate less than four
counters to the guest).

According to the Intel SDM, "Intel Core Solo and Intel Core Duo processors
support base level functionality identified by version ID of 1. Processors
based on Intel Core microarchitecture support, at a minimum, the base
level functionality of architectural performance monitoring." The antique
processors mentioned above all had at least two GP counters, and subsequent
processors only added more; given KVM's quirky handling of MSR_P6_PERFCTR0/1,
the value of MIN_NR_GP_COUNTERS for the Intel Arch PMU can safely be 2.

Cc: Jim Mattson <jmattson@google.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/pmu.h           | 14 ++++++++++----
 arch/x86/kvm/svm/pmu.c       |  1 +
 arch/x86/kvm/vmx/pmu_intel.c |  1 +
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index dd7c7d4ffe3b..019a8ed51b12 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -36,6 +36,7 @@ struct kvm_pmu_ops {
 
 	const u64 EVENTSEL_EVENT;
 	const int MAX_NR_GP_COUNTERS;
+	const int MIN_NR_GP_COUNTERS;
 };
 
 void kvm_pmu_ops_update(const struct kvm_pmu_ops *pmu_ops);
@@ -160,6 +161,7 @@ extern struct x86_pmu_capability kvm_pmu_cap;
 static inline void kvm_init_pmu_capability(const struct kvm_pmu_ops *pmu_ops)
 {
 	bool is_intel = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL;
+	int min_nr_gp_ctrs = pmu_ops->MIN_NR_GP_COUNTERS;
 
 	/*
 	 * Hybrid PMUs don't play nice with virtualization without careful
@@ -174,11 +176,15 @@ static inline void kvm_init_pmu_capability(const struct kvm_pmu_ops *pmu_ops)
 		perf_get_x86_pmu_capability(&kvm_pmu_cap);
 
 		/*
-		 * For Intel, only support guest architectural pmu
-		 * on a host with architectural pmu.
+		 * WARN if perf did NOT disable hardware PMU if the number of
+		 * architecturally required GP counters aren't present, i.e. if
+		 * there are a non-zero number of counters, but fewer than what
+		 * is architecturally required.
 		 */
-		if ((is_intel && !kvm_pmu_cap.version) ||
-		    !kvm_pmu_cap.num_counters_gp)
+		if (!kvm_pmu_cap.num_counters_gp ||
+		    WARN_ON_ONCE(kvm_pmu_cap.num_counters_gp < min_nr_gp_ctrs))
+			enable_pmu = false;
+		else if (is_intel && !kvm_pmu_cap.version)
 			enable_pmu = false;
 	}
 
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 70143275e0a7..e5c69062a909 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -224,4 +224,5 @@ struct kvm_pmu_ops amd_pmu_ops __initdata = {
 	.reset = amd_pmu_reset,
 	.EVENTSEL_EVENT = AMD64_EVENTSEL_EVENT,
 	.MAX_NR_GP_COUNTERS = KVM_AMD_PMC_MAX_GENERIC,
+	.MIN_NR_GP_COUNTERS = AMD64_NUM_COUNTERS,
 };
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 99d07ccb1869..08851b49e1d4 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -811,4 +811,5 @@ struct kvm_pmu_ops intel_pmu_ops __initdata = {
 	.cleanup = intel_pmu_cleanup,
 	.EVENTSEL_EVENT = ARCH_PERFMON_EVENTSEL_EVENT,
 	.MAX_NR_GP_COUNTERS = KVM_INTEL_PMC_MAX_GENERIC,
+	.MIN_NR_GP_COUNTERS = 2,
 };
-- 
2.40.1



* [PATCH v6 06/10] KVM: x86/pmu: Forget PERFCTR_CORE if the min num of counters isn't met
From: Like Xu @ 2023-05-30  6:04 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel

From: Like Xu <likexu@tencent.com>

Add a sanity check on the number of counters enumerated by perf, and
explicitly drop PERFCTR_CORE support if the minimum number isn't met.
E.g. if KVM needs 6 counters and perf says there are only 4, then
something is wrong and enumerating 6 to a guest is only going to cause
more issues.

Opportunistically, use kvm_cpu_cap_check_and_set() to simplify the host
capability check before setting the PERFCTR_CORE flag.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/svm/svm.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index ca32389f3c36..d9669e3cc00a 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5025,9 +5025,18 @@ static __init void svm_set_cpu_caps(void)
 	    boot_cpu_has(X86_FEATURE_AMD_SSBD))
 		kvm_cpu_cap_set(X86_FEATURE_VIRT_SSBD);
 
-	/* AMD PMU PERFCTR_CORE CPUID */
-	if (enable_pmu && boot_cpu_has(X86_FEATURE_PERFCTR_CORE))
-		kvm_cpu_cap_set(X86_FEATURE_PERFCTR_CORE);
+	if (enable_pmu) {
+		/*
+		 * Enumerate support for PERFCTR_CORE if and only if KVM has
+		 * access to enough counters to virtualize "core" support,
+		 * otherwise limit vPMU support to the legacy number of counters.
+		 */
+		if (kvm_pmu_cap.num_counters_gp < AMD64_NUM_COUNTERS_CORE)
+			kvm_pmu_cap.num_counters_gp = min(AMD64_NUM_COUNTERS,
+							  kvm_pmu_cap.num_counters_gp);
+		else
+			kvm_cpu_cap_check_and_set(X86_FEATURE_PERFCTR_CORE);
+	}
 
 	/* CPUID 0x8000001F (SME/SEV features) */
 	sev_set_cpu_caps();
-- 
2.40.1



* [PATCH v6 07/10] KVM: x86/pmu: Constrain the num of guest counters with kvm_pmu_cap
From: Like Xu @ 2023-05-30  6:04 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel

From: Like Xu <likexu@tencent.com>

Add a sanity check to limit the number of AMD guest counters, which helps
avoid the situation where KVM only has access to 4 counters but userspace
sets guest X86_FEATURE_PERFCTR_CORE anyway.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/svm/pmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index e5c69062a909..c03958063a76 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -170,6 +170,9 @@ static void amd_pmu_refresh(struct kvm_vcpu *vcpu)
 	else
 		pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS;
 
+	pmu->nr_arch_gp_counters = min_t(unsigned int, pmu->nr_arch_gp_counters,
+					 kvm_pmu_cap.num_counters_gp);
+
 	pmu->counter_bitmask[KVM_PMC_GP] = ((u64)1 << 48) - 1;
 	pmu->reserved_bits = 0xfffffff000280000ull;
 	pmu->raw_event_mask = AMD64_RAW_EVENT_MASK;
-- 
2.40.1



* [PATCH v6 08/10] KVM: x86/cpuid: Add a KVM-only leaf to redirect AMD PerfMonV2 flag
From: Like Xu @ 2023-05-30  6:04 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel

From: Like Xu <likexu@tencent.com>

A KVM-only leaf for AMD's PerfMonV2 feature flag is defined to redirect
the kernel's scattered version to its architectural location, e.g. so that
KVM can query guest support via guest_cpuid_has().
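
E.g., after this redirect, a query like the following (used by the
PerfMonV2 enabling patch later in this series) resolves to the KVM-only
leaf rather than the kernel's scattered bit:

	if (guest_cpuid_has(vcpu, X86_FEATURE_PERFMON_V2))
		pmu->version = 2;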

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/reverse_cpuid.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kvm/reverse_cpuid.h b/arch/x86/kvm/reverse_cpuid.h
index a5717282bb9c..56cbdb24400a 100644
--- a/arch/x86/kvm/reverse_cpuid.h
+++ b/arch/x86/kvm/reverse_cpuid.h
@@ -15,6 +15,7 @@ enum kvm_only_cpuid_leafs {
 	CPUID_12_EAX	 = NCAPINTS,
 	CPUID_7_1_EDX,
 	CPUID_8000_0007_EDX,
+	CPUID_8000_0022_EAX,
 	NR_KVM_CPU_CAPS,
 
 	NKVMCAPINTS = NR_KVM_CPU_CAPS - NCAPINTS,
@@ -47,6 +48,9 @@ enum kvm_only_cpuid_leafs {
 /* CPUID level 0x80000007 (EDX). */
 #define KVM_X86_FEATURE_CONSTANT_TSC	KVM_X86_FEATURE(CPUID_8000_0007_EDX, 8)
 
+/* CPUID level 0x80000022 (EAX) */
+#define KVM_X86_FEATURE_PERFMON_V2	KVM_X86_FEATURE(CPUID_8000_0022_EAX, 0)
+
 struct cpuid_reg {
 	u32 function;
 	u32 index;
@@ -74,6 +78,7 @@ static const struct cpuid_reg reverse_cpuid[] = {
 	[CPUID_7_1_EDX]       = {         7, 1, CPUID_EDX},
 	[CPUID_8000_0007_EDX] = {0x80000007, 0, CPUID_EDX},
 	[CPUID_8000_0021_EAX] = {0x80000021, 0, CPUID_EAX},
+	[CPUID_8000_0022_EAX] = {0x80000022, 0, CPUID_EAX},
 };
 
 /*
@@ -108,6 +113,8 @@ static __always_inline u32 __feature_translate(int x86_feature)
 		return KVM_X86_FEATURE_SGX_EDECCSSA;
 	else if (x86_feature == X86_FEATURE_CONSTANT_TSC)
 		return KVM_X86_FEATURE_CONSTANT_TSC;
+	else if (x86_feature == X86_FEATURE_PERFMON_V2)
+		return KVM_X86_FEATURE_PERFMON_V2;
 
 	return x86_feature;
 }
-- 
2.40.1



* [PATCH v6 09/10] KVM: x86/svm/pmu: Add AMD PerfMonV2 support
From: Like Xu @ 2023-05-30  6:04 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel, Sandipan Das

From: Like Xu <likexu@tencent.com>

If AMD Performance Monitoring Version 2 (PerfMonV2) is detected by
the guest, it can use a new scheme to manage the Core PMCs using the
new global control and status registers.

In addition to benefiting from PerfMonV2 functionality in the same way
as the host (higher precision), the guest can also reduce the number of
VM-exits by lowering the total number of MSR accesses.

In terms of implementation details, amd_is_valid_msr() is resurrected
since the three newly added global MSRs cannot be mapped to a single
vPMC. Emulating PerfMonV2 on hosts that lack it has also been ruled out
for accuracy reasons.
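
For reference, a minimal sketch of the guest-side flow these MSRs enable
(a hypothetical PMI-handler fragment, not part of this patch;
handle_overflowed_counter() is a made-up helper):

	u64 status;
	int bit;

	/*
	 * One read of the global status MSR reports every overflowed counter,
	 * and one write to the status-clear MSR acknowledges them all, rather
	 * than touching each counter MSR (and taking more VM-exits) as on the
	 * legacy interface.
	 */
	rdmsrl(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS, status);
	for_each_set_bit(bit, (unsigned long *)&status, 64)
		handle_overflowed_counter(bit);	/* made-up helper */
	wrmsrl(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR, status);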

Co-developed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/pmu.c     | 24 +++++++++++++++++-
 arch/x86/kvm/svm/pmu.c | 55 ++++++++++++++++++++++++++++++++++--------
 arch/x86/kvm/x86.c     | 10 ++++++++
 3 files changed, 78 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 116bc7fe13c6..e17be25de6ca 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -577,11 +577,18 @@ int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 
 	switch (msr) {
 	case MSR_CORE_PERF_GLOBAL_STATUS:
+	case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS:
 		msr_info->data = pmu->global_status;
 		break;
+	case MSR_AMD64_PERF_CNTR_GLOBAL_CTL:
+		/* Based on the observed HW. */
+		fallthrough;
 	case MSR_CORE_PERF_GLOBAL_CTRL:
 		msr_info->data = pmu->global_ctrl;
 		break;
+	case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR:
+		/* Based on the observed HW. */
+		fallthrough;
 	case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
 		msr_info->data = 0;
 		break;
@@ -599,13 +606,26 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	u64 data = msr_info->data;
 	u64 diff;
 
+	/*
+	 * Note, AMD ignores writes to reserved bits and read-only PMU MSRs,
+	 * whereas Intel generates #GP on attempts to write reserved/RO MSRs.
+	 */
 	switch (msr) {
 	case MSR_CORE_PERF_GLOBAL_STATUS:
 		if (!msr_info->host_initiated || (data & pmu->global_ovf_ctrl_mask))
 			return 1; /* RO MSR */
+		fallthrough;
+	case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS:
+		/* Per PPR, Read-only MSR. Writes are ignored. */
+		if (!msr_info->host_initiated)
+			break;
 
 		pmu->global_status = data;
 		break;
+	case MSR_AMD64_PERF_CNTR_GLOBAL_CTL:
+		/* Based on the observed HW. */
+		data &= ~pmu->global_ctrl_mask;
+		fallthrough;
 	case MSR_CORE_PERF_GLOBAL_CTRL:
 		if (!kvm_valid_perf_global_ctrl(pmu, data))
 			return 1;
@@ -619,7 +639,9 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
 		if (data & pmu->global_ovf_ctrl_mask)
 			return 1;
-
+		fallthrough;
+	case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR:
+		/* Based on the observed HW. */
 		if (!msr_info->host_initiated)
 			pmu->global_status &= ~data;
 		break;
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index c03958063a76..9bcf1dacfd6c 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -94,12 +94,6 @@ static struct kvm_pmc *amd_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
 	return amd_pmc_idx_to_pmc(vcpu_to_pmu(vcpu), idx & ~(3u << 30));
 }
 
-static bool amd_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
-{
-	/* All MSRs refer to exactly one PMC, so msr_idx_to_pmc is enough.  */
-	return false;
-}
-
 static struct kvm_pmc *amd_msr_idx_to_pmc(struct kvm_vcpu *vcpu, u32 msr)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -111,6 +105,29 @@ static struct kvm_pmc *amd_msr_idx_to_pmc(struct kvm_vcpu *vcpu, u32 msr)
 	return pmc;
 }
 
+static bool amd_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+
+	switch (msr) {
+	case MSR_K7_EVNTSEL0 ... MSR_K7_PERFCTR3:
+		return pmu->version > 0;
+	case MSR_F15H_PERF_CTL0 ... MSR_F15H_PERF_CTR5:
+		return guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE);
+	case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS:
+	case MSR_AMD64_PERF_CNTR_GLOBAL_CTL:
+	case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR:
+		return pmu->version > 1;
+	default:
+		if (msr > MSR_F15H_PERF_CTR5 &&
+		    msr < MSR_F15H_PERF_CTL0 + 2 * pmu->nr_arch_gp_counters)
+			return pmu->version > 1;
+		break;
+	}
+
+	return amd_msr_idx_to_pmc(vcpu, msr);
+}
+
 static int amd_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -164,23 +181,39 @@ static int amd_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 static void amd_pmu_refresh(struct kvm_vcpu *vcpu)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	union cpuid_0x80000022_ebx ebx;
 
-	if (guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
+	pmu->version = 1;
+	if (guest_cpuid_has(vcpu, X86_FEATURE_PERFMON_V2)) {
+		pmu->version = 2;
+		/*
+		 * Note, PERFMON_V2 is also in 0x80000022.0x0, i.e. the guest
+		 * CPUID entry is guaranteed to be non-NULL.
+		 */
+		BUILD_BUG_ON(x86_feature_cpuid(X86_FEATURE_PERFMON_V2).function != 0x80000022 ||
+			     x86_feature_cpuid(X86_FEATURE_PERFMON_V2).index);
+		ebx.full = kvm_find_cpuid_entry_index(vcpu, 0x80000022, 0)->ebx;
+		pmu->nr_arch_gp_counters = ebx.split.num_core_pmc;
+	} else if (guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE)) {
 		pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS_CORE;
-	else
+	} else {
 		pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS;
+	}
 
 	pmu->nr_arch_gp_counters = min_t(unsigned int, pmu->nr_arch_gp_counters,
 					 kvm_pmu_cap.num_counters_gp);
 
+	if (pmu->version > 1) {
+		pmu->global_ctrl_mask = ~((1ull << pmu->nr_arch_gp_counters) - 1);
+		pmu->global_ovf_ctrl_mask = pmu->global_ctrl_mask;
+	}
+
 	pmu->counter_bitmask[KVM_PMC_GP] = ((u64)1 << 48) - 1;
 	pmu->reserved_bits = 0xfffffff000280000ull;
 	pmu->raw_event_mask = AMD64_RAW_EVENT_MASK;
-	pmu->version = 1;
 	/* not applicable to AMD; but clean them to prevent any fall out */
 	pmu->counter_bitmask[KVM_PMC_FIXED] = 0;
 	pmu->nr_arch_fixed_counters = 0;
-	pmu->global_status = 0;
 	bitmap_set(pmu->all_valid_pmc_idx, 0, pmu->nr_arch_gp_counters);
 }
 
@@ -211,6 +244,8 @@ static void amd_pmu_reset(struct kvm_vcpu *vcpu)
 		pmc_stop_counter(pmc);
 		pmc->counter = pmc->prev_counter = pmc->eventsel = 0;
 	}
+
+	pmu->global_ctrl = pmu->global_status = 0;
 }
 
 struct kvm_pmu_ops amd_pmu_ops __initdata = {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c0778ca39650..abfba3cae0ba 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1483,6 +1483,10 @@ static const u32 msrs_to_save_pmu[] = {
 	MSR_F15H_PERF_CTL3, MSR_F15H_PERF_CTL4, MSR_F15H_PERF_CTL5,
 	MSR_F15H_PERF_CTR0, MSR_F15H_PERF_CTR1, MSR_F15H_PERF_CTR2,
 	MSR_F15H_PERF_CTR3, MSR_F15H_PERF_CTR4, MSR_F15H_PERF_CTR5,
+
+	MSR_AMD64_PERF_CNTR_GLOBAL_CTL,
+	MSR_AMD64_PERF_CNTR_GLOBAL_STATUS,
+	MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR,
 };
 
 static u32 msrs_to_save[ARRAY_SIZE(msrs_to_save_base) +
@@ -7150,6 +7154,12 @@ static void kvm_probe_msr_to_save(u32 msr_index)
 		    kvm_pmu_cap.num_counters_fixed)
 			return;
 		break;
+	case MSR_AMD64_PERF_CNTR_GLOBAL_CTL:
+	case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS:
+	case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR:
+		if (!kvm_cpu_cap_has(X86_FEATURE_PERFMON_V2))
+			return;
+		break;
 	case MSR_IA32_XFD:
 	case MSR_IA32_XFD_ERR:
 		if (!kvm_cpu_cap_has(X86_FEATURE_XFD))
-- 
2.40.1



* [PATCH v6 10/10] KVM: x86/cpuid: Add AMD CPUID ExtPerfMonAndDbg leaf 0x80000022
From: Like Xu @ 2023-05-30  6:04 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel, Sandipan Das

From: Like Xu <likexu@tencent.com>

CPUID leaf 0x80000022, i.e. ExtPerfMonAndDbg, advertises new
performance monitoring features for AMD processors.

Bit 0 of EAX indicates support for Performance Monitoring Version 2
(PerfMonV2) features. If found to be set during PMU initialization,
the EBX bits of the same CPUID function can be used to determine
the number of available PMCs for different PMU types.

Expose the relevant bits via KVM_GET_SUPPORTED_CPUID so that
guests can make use of the PerfMonV2 features.
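
As a rough sketch of how a guest kernel might consume the advertised leaf
(hypothetical code; cpuid() and union cpuid_0x80000022_ebx are the helpers
the kernel already uses for this leaf):

	unsigned int eax, ebx, ecx, edx;
	union cpuid_0x80000022_ebx ext;

	cpuid(0x80000022, &eax, &ebx, &ecx, &edx);
	if (eax & BIT(0)) {		/* EAX[0]: PerfMonV2 supported */
		ext.full = ebx;
		pr_info("PerfMonV2 with %u core PMCs\n", ext.split.num_core_pmc);
	}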

Co-developed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/cpuid.c   | 28 +++++++++++++++++++++++++++-
 arch/x86/kvm/svm/svm.c |  4 ++++
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 61bc71882f07..0e5584f4acd7 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -734,6 +734,10 @@ void kvm_set_cpu_caps(void)
 		F(NULL_SEL_CLR_BASE) | F(AUTOIBRS) | 0 /* PrefetchCtlMsr */
 	);
 
+	kvm_cpu_cap_init_kvm_defined(CPUID_8000_0022_EAX,
+		F(PERFMON_V2)
+	);
+
 	/*
 	 * Synthesize "LFENCE is serializing" into the AMD-defined entry in
 	 * KVM's supported CPUID if the feature is reported as supported by the
@@ -1128,7 +1132,7 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 		entry->edx = 0;
 		break;
 	case 0x80000000:
-		entry->eax = min(entry->eax, 0x80000021);
+		entry->eax = min(entry->eax, 0x80000022);
 		/*
 		 * Serializing LFENCE is reported in a multitude of ways, and
 		 * NullSegClearsBase is not reported in CPUID on Zen2; help
@@ -1233,6 +1237,28 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 		entry->ebx = entry->ecx = entry->edx = 0;
 		cpuid_entry_override(entry, CPUID_8000_0021_EAX);
 		break;
+	/* AMD Extended Performance Monitoring and Debug */
+	case 0x80000022: {
+		union cpuid_0x80000022_ebx ebx;
+
+		entry->ecx = entry->edx = 0;
+		if (!enable_pmu || !kvm_cpu_cap_has(X86_FEATURE_PERFMON_V2)) {
+			entry->eax = entry->ebx = 0;
+			break;
+		}
+
+		cpuid_entry_override(entry, CPUID_8000_0022_EAX);
+
+		if (kvm_cpu_cap_has(X86_FEATURE_PERFMON_V2))
+			ebx.split.num_core_pmc = kvm_pmu_cap.num_counters_gp;
+		else if (kvm_cpu_cap_has(X86_FEATURE_PERFCTR_CORE))
+			ebx.split.num_core_pmc = AMD64_NUM_COUNTERS_CORE;
+		else
+			ebx.split.num_core_pmc = AMD64_NUM_COUNTERS;
+
+		entry->ebx = ebx.full;
+		break;
+	}
 	/*Add support for Centaur's CPUID instruction*/
 	case 0xC0000000:
 		/*Just support up to 0xC0000004 now*/
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d9669e3cc00a..ff48cdea1fbf 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5036,6 +5036,10 @@ static __init void svm_set_cpu_caps(void)
 							  kvm_pmu_cap.num_counters_gp);
 		else
 			kvm_cpu_cap_check_and_set(X86_FEATURE_PERFCTR_CORE);
+
+		if (kvm_pmu_cap.version != 2 ||
+		    !kvm_cpu_cap_has(X86_FEATURE_PERFCTR_CORE))
+			kvm_cpu_cap_clear(X86_FEATURE_PERFMON_V2);
 	}
 
 	/* CPUID 0x8000001F (SME/SEV features) */
-- 
2.40.1



* Re: [PATCH v6 05/10] KVM: x86/pmu: Disable vPMU if the minimum num of counters isn't met
From: Sean Christopherson @ 2023-06-02  0:50 UTC (permalink / raw)
  To: Like Xu; +Cc: Paolo Bonzini, kvm, linux-kernel, Jim Mattson

On Tue, May 30, 2023, Like Xu wrote:
> According to Intel SDM, "Intel Core Solo and Intel Core Duo processors
> support base level functionality identified by version ID of 1. Processors
> based on Intel Core microarchitecture support, at a minimum, the base
> level functionality of architectural performance monitoring." Those
> antique processors mentioned above all had at least two GP counters,
> subsequent processors had more and more GP counters, and given KVM's
> quirky handling of MSR_P6_PERFCTR0/1, the value of MIN_NR_GP_COUNTERS
> for the Intel Arch PMU can safely be 2.

Not sure what you mean by "safely", but I don't think this is correct.  KVM can,
and more importantly has up until this point, supported a vPMU so long as the CPU
has at least one counter.  Perf's support for P6/Core CPUs does appear to expect
and require 2 counters, but unless I'm misreading arch/x86/events/intel/core.c,
perf will happily chug along with a single counter when running on a modern CPU.
I doubt such a CPU exists in real silicon, but I can certainly imagine a virtual
CPU being configured with a single counter, and this change would break such a
setup.

And *if* we really want to raise the minimum to '2', that should be done in a
separate commit.  But I don't see any reason to force the issue.

No need to send v7 just for this, I can fixup when applying (planning on reviewing
the series tomorrow).


* Re: [PATCH v6 02/10] KVM: x86/pmu: Return #GP if user sets the GLOBAL_STATUS reserved bits
From: Sean Christopherson @ 2023-06-02 21:59 UTC (permalink / raw)
  To: Like Xu; +Cc: Paolo Bonzini, kvm, linux-kernel

On Tue, May 30, 2023, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> Return #GP if KVM user space attempts to set a reserved bit for guest.

It's not a #GP, it's simply an error.

> If the user space sets reserved bits when restoring the MSR_CORE_
> PERF_GLOBAL_STATUS register, these bits will be accidentally returned
> when the guest runs a read access to this register, and cannot be cleared
> up inside the guest, which makes the guest's PMI handler very confused.
> 
> Note, reusing global_ovf_ctrl_mask as global_status_mask will be broken
> if KVM supports higher versions of Intel arch pmu.
> 
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>  arch/x86/kvm/vmx/pmu_intel.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> index 1f9c3e916a21..343b3182b7f4 100644
> --- a/arch/x86/kvm/vmx/pmu_intel.c
> +++ b/arch/x86/kvm/vmx/pmu_intel.c
> @@ -399,7 +399,11 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  			reprogram_fixed_counters(pmu, data);
>  		break;
>  	case MSR_CORE_PERF_GLOBAL_STATUS:
> -		if (!msr_info->host_initiated)
> +		/*
> +		 * Caution, the assumption here is that some of the bits (such as
> +		 * ASCI, CTR_FREEZE, and LBR_FREEZE) are not yet supported by KVM.
> +		 */

A comment wasn't what I had in mind when I objected to "good enough"[*].  Luckily,
there's no need to add another mask.  After rereading the SDM, there isn't actually
a divergence between GLOBAL_STATUS and GLOBAL_OVF_CTRL, Intel just renamed GLOBAL_OVF_CTRL
to GLOBAL_STATUS_RESET and for some asinine reason, added separate entries in the
architectural MSRs table instead of adding a redirect.

I'll post and slot in a patch to do s/global_ovf_ctrl_mask/global_status, along
with a comment explaining the rename and how the "reset" MSR is always tied to the
status MSR, regardless of what it's named.

[*] https://lore.kernel.org/all/37a18b89-c0c3-4c88-7f07-072573ac0c92@gmail.com


* Re: [PATCH v6 03/10] KVM: x86/pmu: Make part of the Intel v2 PMU MSRs handling x86 generic
From: Sean Christopherson @ 2023-06-02 23:23 UTC (permalink / raw)
  To: Like Xu; +Cc: Paolo Bonzini, kvm, linux-kernel

On Tue, May 30, 2023, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> AMD PerfMonV2 defines three registers similar to part of the Intel
> v2 PMU registers, including the GLOBAL_CTRL, GLOBAL_STATUS and
> GLOBAL_OVF_CTRL MSRs. For better code reuse, this specific part of
> the handling can be extracted to make it generic for X86 as a straight
> code movement.
> 
> Specifically, the kvm_pmu_set/get_msr() handlers of GLOBAL_STATUS,
> GLOBAL_CTRL, GLOBAL_OVF_CTRL defined for Intel are moved to generic
> pmu.c and the callback function .pmc_is_globally_enabled is removed,
> which is very helpful to introduce the AMD PerfMonV2 code later.

Yeah, except this patch doesn't actually move anything.  *Some* of the common bits
show up in pmu.c, but the same bits in pmu_intel.c get left behind. 

> The new eponymous pmc_is_globally_enabled() works well as legacy AMD
> vPMU version is indexed as 1. Note that the specific *_is_valid_msr will
> continue to be used to avoid cross-vendor MSR access.

This should be two patches.  Moving the GLOBAL_CTRL stuff is logically separate
from moving the pmc_is_enabled() code.

> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
> @@ -213,6 +212,22 @@ static inline void reprogram_counters(struct kvm_pmu *pmu, u64 diff)
>  	kvm_make_request(KVM_REQ_PMU, pmu_to_vcpu(pmu));
>  }
>  
> +/*
> + * Check if a PMC is enabled by comparing it against global_ctrl bits.
> + *
> + * If the current version of vPMU doesn't have global_ctrl MSR,
> + * all vPMCs are enabled (return TRUE).
> + */
> +static inline bool pmc_is_globally_enabled(struct kvm_pmc *pmc)
> +{
> +	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
> +
> +	if (pmu->version < 2)

Nah, we're not open coding this check, not after putting in the effort to squash
the mess of open coding on Intel.  Moving intel_pmu_has_perf_global_ctrl() is
trivial, and also allows moving the existence checks into kvm_pmu_is_valid_msr().

