* [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions
@ 2022-11-02 22:50 Sean Christopherson
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 01/27] x86/pmu: Add PDCM check before accessing PERF_CAP register Sean Christopherson
                   ` (28 more replies)
  0 siblings, 29 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

This series is a big pile of PMU cleanups and enhancements from Like.

The changes are roughly divided into three parts: (1) fixes, (2) cleanups,
and (3) new test cases.  The changes are bundled into a mega-series because
the original, separate series was difficult to review and manage due to a
number of dependencies.

There are no major changes in the test logic. The big cleanups are to add
lib/x86/pmu.[c,h] and a global PMU capabilities struct to improve
readability of the code and to hide some AMD vs. Intel details.
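
For reference, by the midpoint of the series the global struct looks roughly
like the sketch below (later patches extend it with counter/event-select MSR
bases, the global status/control MSRs, an Intel-vs-AMD flag, etc., so treat
this as a snapshot rather than the final layout):

	struct pmu_caps {
		u8 version;
		u8 nr_fixed_counters;
		u8 fixed_counter_width;
		u8 nr_gp_counters;
		u8 gp_counter_width;
		u8 gp_counter_mask_length;
		u32 gp_counter_available;
		u64 perf_cap;
	};

	/* Snapshotted by pmu_init() as part of BSP bring-up. */
	extern struct pmu_caps pmu;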

Like's v4 was tested on AMD Zen3/4 and Intel ICX/SPR machines, but this
version has only been tested on AMD Zen3 (Milan) and Intel ICX and HSW,
i.e. I haven't tested AMD PMU v2 or anything new in SPR (if there is
anything in SPR?).

Like Xu (22):
  x86/pmu: Add PDCM check before accessing PERF_CAP register
  x86/pmu: Test emulation instructions on full-width counters
  x86/pmu: Pop up FW prefix to avoid out-of-context propagation
  x86/pmu: Report SKIP when testing Intel LBR on AMD platforms
  x86/pmu: Fix printed messages for emulated instruction test
  x86/pmu: Introduce __start_event() to drop all of the manual zeroing
  x86/pmu: Introduce multiple_{one, many}() to improve readability
  x86/pmu: Reset the expected count of the fixed counter 0 when i386
  x86: create pmu group for quick pmu-scope testing
  x86/pmu: Refine info to clarify the current support
  x86/pmu: Update rdpmc testcase to cover #GP path
  x86/pmu: Rename PC_VECTOR to PMI_VECTOR for better readability
  x86/pmu: Add lib/x86/pmu.[c.h] and move common code to header files
  x86/pmu: Snapshot PMU perf_capabilities during BSP initialization
  x86/pmu: Track GP counter and event select base MSRs in pmu_caps
  x86/pmu: Add helper to get fixed counter MSR index
  x86/pmu: Track global status/control/clear MSRs in pmu_caps
  x86: Add tests for Guest Processor Event Based Sampling (PEBS)
  x86/pmu: Add global helpers to cover Intel Arch PMU Version 1
  x86/pmu: Add gp_events pointer to route different event tables
  x86/pmu: Update testcases to cover AMD PMU
  x86/pmu: Add AMD Guest PerfMonV2 testcases

Sean Christopherson (5):
  x86: Add a helper for the BSP's final init sequence common to all
    flavors
  x86/pmu: Snapshot CPUID.0xA PMU capabilities during BSP initialization
  x86/pmu: Drop wrappers that just passthrough pmu_caps fields
  x86/pmu: Reset GP and Fixed counters during pmu_init().
  x86/pmu: Add pmu_caps flag to track if CPU is Intel (versus AMD)

 lib/x86/asm/setup.h |   1 +
 lib/x86/msr.h       |  30 +++
 lib/x86/pmu.c       |  67 +++++++
 lib/x86/pmu.h       | 187 +++++++++++++++++++
 lib/x86/processor.h |  80 ++------
 lib/x86/setup.c     |  13 +-
 x86/Makefile.common |   1 +
 x86/Makefile.x86_64 |   1 +
 x86/cstart.S        |   4 +-
 x86/cstart64.S      |   4 +-
 x86/pmu.c           | 360 ++++++++++++++++++++----------------
 x86/pmu_lbr.c       |  24 +--
 x86/pmu_pebs.c      | 433 ++++++++++++++++++++++++++++++++++++++++++++
 x86/unittests.cfg   |  10 +
 x86/vmx_tests.c     |   1 +
 15 files changed, 975 insertions(+), 241 deletions(-)
 create mode 100644 lib/x86/pmu.c
 create mode 100644 lib/x86/pmu.h
 create mode 100644 x86/pmu_pebs.c


base-commit: 73d9d850f1c2c9f0df321967e67acda0d2c305ea
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 01/27] x86/pmu: Add PDCM check before accessing PERF_CAP register
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
@ 2022-11-02 22:50 ` Sean Christopherson
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 02/27] x86/pmu: Test emulation instructions on full-width counters Sean Christopherson
                   ` (27 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

On virtual platforms without PDCM support (e.g. AMD), reading
MSR_IA32_PERF_CAPABILITIES results in a #GP that is completely avoidable.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 lib/x86/processor.h | 8 ++++++++
 x86/pmu.c           | 2 +-
 x86/pmu_lbr.c       | 2 +-
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/lib/x86/processor.h b/lib/x86/processor.h
index 03242206..f85abe36 100644
--- a/lib/x86/processor.h
+++ b/lib/x86/processor.h
@@ -847,4 +847,12 @@ static inline bool pmu_gp_counter_is_available(int i)
 	return !(cpuid(10).b & BIT(i));
 }
 
+static inline u64 this_cpu_perf_capabilities(void)
+{
+	if (!this_cpu_has(X86_FEATURE_PDCM))
+		return 0;
+
+	return rdmsr(MSR_IA32_PERF_CAPABILITIES);
+}
+
 #endif
diff --git a/x86/pmu.c b/x86/pmu.c
index 6cadb590..1a3e5a54 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -660,7 +660,7 @@ int main(int ac, char **av)
 
 	check_counters();
 
-	if (rdmsr(MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES) {
+	if (this_cpu_perf_capabilities() & PMU_CAP_FW_WRITES) {
 		gp_counter_base = MSR_IA32_PMC0;
 		report_prefix_push("full-width writes");
 		check_counters();
diff --git a/x86/pmu_lbr.c b/x86/pmu_lbr.c
index 8dad1f1a..c040b146 100644
--- a/x86/pmu_lbr.c
+++ b/x86/pmu_lbr.c
@@ -72,7 +72,7 @@ int main(int ac, char **av)
 		return report_summary();
 	}
 
-	perf_cap = rdmsr(MSR_IA32_PERF_CAPABILITIES);
+	perf_cap = this_cpu_perf_capabilities();
 
 	if (!(perf_cap & PMU_CAP_LBR_FMT)) {
 		report_skip("(Architectural) LBR is not supported.");
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 02/27] x86/pmu: Test emulation instructions on full-width counters
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 01/27] x86/pmu: Add PDCM check before accessing PERF_CAP register Sean Christopherson
@ 2022-11-02 22:50 ` Sean Christopherson
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 03/27] x86/pmu: Pop up FW prefix to avoid out-of-context propagation Sean Christopherson
                   ` (26 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

Move check_emulated_instr() into check_counters() so that full-width
counters can be tested with ease by the same test case.

Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 x86/pmu.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index 1a3e5a54..308a0ce0 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -520,6 +520,9 @@ static void check_emulated_instr(void)
 
 static void check_counters(void)
 {
+	if (is_fep_available())
+		check_emulated_instr();
+
 	check_gp_counters();
 	check_fixed_counters();
 	check_rdpmc();
@@ -655,9 +658,6 @@ int main(int ac, char **av)
 
 	apic_write(APIC_LVTPC, PC_VECTOR);
 
-	if (is_fep_available())
-		check_emulated_instr();
-
 	check_counters();
 
 	if (this_cpu_perf_capabilities() & PMU_CAP_FW_WRITES) {
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 03/27] x86/pmu: Pop up FW prefix to avoid out-of-context propagation
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 01/27] x86/pmu: Add PDCM check before accessing PERF_CAP register Sean Christopherson
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 02/27] x86/pmu: Test emulation instructions on full-width counters Sean Christopherson
@ 2022-11-02 22:50 ` Sean Christopherson
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 04/27] x86/pmu: Report SKIP when testing Intel LBR on AMD platforms Sean Christopherson
                   ` (25 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

The inappropriate prefix "full-width writes" may be propagated to
later test cases if it is not popped.

Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 x86/pmu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/x86/pmu.c b/x86/pmu.c
index 308a0ce0..da8c004a 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -665,6 +665,7 @@ int main(int ac, char **av)
 		report_prefix_push("full-width writes");
 		check_counters();
 		check_gp_counters_write_width();
+		report_prefix_pop();
 	}
 
 	return report_summary();
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 04/27] x86/pmu: Report SKIP when testing Intel LBR on AMD platforms
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (2 preceding siblings ...)
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 03/27] x86/pmu: Pop up FW prefix to avoid out-of-context propagation Sean Christopherson
@ 2022-11-02 22:50 ` Sean Christopherson
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 05/27] x86/pmu: Fix printed messages for emulated instruction test Sean Christopherson
                   ` (24 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

Running the Intel LBR test on AMD platforms should report SKIP,
not PASS; fix it.

Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 x86/pmu_lbr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/x86/pmu_lbr.c b/x86/pmu_lbr.c
index c040b146..a641d793 100644
--- a/x86/pmu_lbr.c
+++ b/x86/pmu_lbr.c
@@ -59,7 +59,7 @@ int main(int ac, char **av)
 
 	if (!is_intel()) {
 		report_skip("PMU_LBR test is for intel CPU's only");
-		return 0;
+		return report_summary();
 	}
 
 	if (!this_cpu_has_pmu()) {
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 05/27] x86/pmu: Fix printed messages for emulated instruction test
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (3 preceding siblings ...)
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 04/27] x86/pmu: Report SKIP when testing Intel LBR on AMD platforms Sean Christopherson
@ 2022-11-02 22:50 ` Sean Christopherson
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 06/27] x86/pmu: Introduce __start_event() to drop all of the manual zeroing Sean Christopherson
                   ` (23 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

This test case uses MSR_IA32_PERFCTR0 to count branch instructions
and PERFCTR1 to count instruction events.  The same correspondence
should be maintained in report(), i.e. status bit 0 corresponds to the
branch counter and status bit 1 to the instruction counter.

Fixes: 20cf914 ("x86/pmu: Test PMU virtualization on emulated instructions")
Reported-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 x86/pmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index da8c004a..715b45a3 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -512,8 +512,8 @@ static void check_emulated_instr(void)
 	       "branch count");
 	// Additionally check that those counters overflowed properly.
 	status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
-	report(status & 1, "instruction counter overflow");
-	report(status & 2, "branch counter overflow");
+	report(status & 1, "branch counter overflow");
+	report(status & 2, "instruction counter overflow");
 
 	report_prefix_pop();
 }
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 06/27] x86/pmu: Introduce __start_event() to drop all of the manual zeroing
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (4 preceding siblings ...)
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 05/27] x86/pmu: Fix printed messages for emulated instruction test Sean Christopherson
@ 2022-11-02 22:50 ` Sean Christopherson
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 07/27] x86/pmu: Introduce multiple_{one, many}() to improve readability Sean Christopherson
                   ` (22 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

Most invocations of start_event() and measure() first set evt.count = 0.
Instead of forcing each caller to ensure count is zeroed, zero the count
during start_event(), then drop all of the manual zeroing.

Accumulating counts can be handled by reading the current count before
start_event(), and doing something like stuffing a high count to test an
edge case could be handled by an inner helper, __start_event().

For overflow, just open code measure() for that one-off case. Requiring
callers to zero out a field in most common cases isn't exactly flexible.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
[sean: tag __measure() noinline so its count is stable]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 x86/pmu.c | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index 715b45a3..77e59c5c 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -137,9 +137,9 @@ static void global_disable(pmu_counter_t *cnt)
 			~(1ull << cnt->idx));
 }
 
-
-static void start_event(pmu_counter_t *evt)
+static void __start_event(pmu_counter_t *evt, uint64_t count)
 {
+    evt->count = count;
     wrmsr(evt->ctr, evt->count);
     if (is_gp(evt))
 	    wrmsr(MSR_P6_EVNTSEL0 + event_to_global_idx(evt),
@@ -162,6 +162,11 @@ static void start_event(pmu_counter_t *evt)
     apic_write(APIC_LVTPC, PC_VECTOR);
 }
 
+static void start_event(pmu_counter_t *evt)
+{
+	__start_event(evt, 0);
+}
+
 static void stop_event(pmu_counter_t *evt)
 {
 	global_disable(evt);
@@ -186,6 +191,13 @@ static noinline void measure(pmu_counter_t *evt, int count)
 		stop_event(&evt[i]);
 }
 
+static noinline void __measure(pmu_counter_t *evt, uint64_t count)
+{
+	__start_event(evt, count);
+	loop();
+	stop_event(evt);
+}
+
 static bool verify_event(uint64_t count, struct pmu_event *e)
 {
 	// printf("%d <= %ld <= %d\n", e->min, count, e->max);
@@ -208,7 +220,6 @@ static void check_gp_counter(struct pmu_event *evt)
 	int i;
 
 	for (i = 0; i < nr_gp_counters; i++, cnt.ctr++) {
-		cnt.count = 0;
 		measure(&cnt, 1);
 		report(verify_event(cnt.count, evt), "%s-%d", evt->name, i);
 	}
@@ -235,7 +246,6 @@ static void check_fixed_counters(void)
 	int i;
 
 	for (i = 0; i < nr_fixed_counters; i++) {
-		cnt.count = 0;
 		cnt.ctr = fixed_events[i].unit_sel;
 		measure(&cnt, 1);
 		report(verify_event(cnt.count, &fixed_events[i]), "fixed-%d", i);
@@ -253,14 +263,12 @@ static void check_counters_many(void)
 		if (!pmu_gp_counter_is_available(i))
 			continue;
 
-		cnt[n].count = 0;
 		cnt[n].ctr = gp_counter_base + n;
 		cnt[n].config = EVNTSEL_OS | EVNTSEL_USR |
 			gp_events[i % ARRAY_SIZE(gp_events)].unit_sel;
 		n++;
 	}
 	for (i = 0; i < nr_fixed_counters; i++) {
-		cnt[n].count = 0;
 		cnt[n].ctr = fixed_events[i].unit_sel;
 		cnt[n].config = EVNTSEL_OS | EVNTSEL_USR;
 		n++;
@@ -283,9 +291,8 @@ static void check_counter_overflow(void)
 	pmu_counter_t cnt = {
 		.ctr = gp_counter_base,
 		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel /* instructions */,
-		.count = 0,
 	};
-	measure(&cnt, 1);
+	__measure(&cnt, 0);
 	count = cnt.count;
 
 	/* clear status before test */
@@ -311,7 +318,7 @@ static void check_counter_overflow(void)
 		else
 			cnt.config &= ~EVNTSEL_INT;
 		idx = event_to_global_idx(&cnt);
-		measure(&cnt, 1);
+		__measure(&cnt, cnt.count);
 		report(cnt.count == 1, "cntr-%d", i);
 		status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
 		report(status & (1ull << idx), "status-%d", i);
@@ -329,7 +336,6 @@ static void check_gp_counter_cmask(void)
 	pmu_counter_t cnt = {
 		.ctr = gp_counter_base,
 		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel /* instructions */,
-		.count = 0,
 	};
 	cnt.config |= (0x2 << EVNTSEL_CMASK_SHIFT);
 	measure(&cnt, 1);
@@ -415,7 +421,6 @@ static void check_running_counter_wrmsr(void)
 	pmu_counter_t evt = {
 		.ctr = gp_counter_base,
 		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel,
-		.count = 0,
 	};
 
 	report_prefix_push("running counter wrmsr");
@@ -430,7 +435,6 @@ static void check_running_counter_wrmsr(void)
 	wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL,
 	      rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
 
-	evt.count = 0;
 	start_event(&evt);
 
 	count = -1;
@@ -454,13 +458,11 @@ static void check_emulated_instr(void)
 		.ctr = MSR_IA32_PERFCTR0,
 		/* branch instructions */
 		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[5].unit_sel,
-		.count = 0,
 	};
 	pmu_counter_t instr_cnt = {
 		.ctr = MSR_IA32_PERFCTR0 + 1,
 		/* instructions */
 		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel,
-		.count = 0,
 	};
 	report_prefix_push("emulated instruction");
 
@@ -592,7 +594,6 @@ static void set_ref_cycle_expectations(void)
 	pmu_counter_t cnt = {
 		.ctr = MSR_IA32_PERFCTR0,
 		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[2].unit_sel,
-		.count = 0,
 	};
 	uint64_t tsc_delta;
 	uint64_t t0, t1, t2, t3;
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 07/27] x86/pmu: Introduce multiple_{one, many}() to improve readability
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (5 preceding siblings ...)
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 06/27] x86/pmu: Introduce __start_event() to drop all of the manual zeroing Sean Christopherson
@ 2022-11-02 22:50 ` Sean Christopherson
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 08/27] x86/pmu: Reset the expected count of the fixed counter 0 when i386 Sean Christopherson
                   ` (21 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

The current measure() forces the common case to pass in unnecessary
information in order to give flexibility to a single use case.  It's just
syntactic sugar, but it really does help readers, as it's not obvious that
the "1" specifies the number of events, whereas measure_many() and
measure_one() are relatively self-explanatory.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 x86/pmu.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index 77e59c5c..0546eb13 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -181,7 +181,7 @@ static void stop_event(pmu_counter_t *evt)
 	evt->count = rdmsr(evt->ctr);
 }
 
-static noinline void measure(pmu_counter_t *evt, int count)
+static noinline void measure_many(pmu_counter_t *evt, int count)
 {
 	int i;
 	for (i = 0; i < count; i++)
@@ -191,6 +191,11 @@ static noinline void measure(pmu_counter_t *evt, int count)
 		stop_event(&evt[i]);
 }
 
+static void measure_one(pmu_counter_t *evt)
+{
+	measure_many(evt, 1);
+}
+
 static noinline void __measure(pmu_counter_t *evt, uint64_t count)
 {
 	__start_event(evt, count);
@@ -220,7 +225,7 @@ static void check_gp_counter(struct pmu_event *evt)
 	int i;
 
 	for (i = 0; i < nr_gp_counters; i++, cnt.ctr++) {
-		measure(&cnt, 1);
+		measure_one(&cnt);
 		report(verify_event(cnt.count, evt), "%s-%d", evt->name, i);
 	}
 }
@@ -247,7 +252,7 @@ static void check_fixed_counters(void)
 
 	for (i = 0; i < nr_fixed_counters; i++) {
 		cnt.ctr = fixed_events[i].unit_sel;
-		measure(&cnt, 1);
+		measure_one(&cnt);
 		report(verify_event(cnt.count, &fixed_events[i]), "fixed-%d", i);
 	}
 }
@@ -274,7 +279,7 @@ static void check_counters_many(void)
 		n++;
 	}
 
-	measure(cnt, n);
+	measure_many(cnt, n);
 
 	for (i = 0; i < n; i++)
 		if (!verify_counter(&cnt[i]))
@@ -338,7 +343,7 @@ static void check_gp_counter_cmask(void)
 		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel /* instructions */,
 	};
 	cnt.config |= (0x2 << EVNTSEL_CMASK_SHIFT);
-	measure(&cnt, 1);
+	measure_one(&cnt);
 	report(cnt.count < gp_events[1].min, "cmask");
 }
 
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 08/27] x86/pmu: Reset the expected count of the fixed counter 0 when i386
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (6 preceding siblings ...)
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 07/27] x86/pmu: Introduce multiple_{one, many}() to improve readability Sean Christopherson
@ 2022-11-02 22:50 ` Sean Christopherson
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 09/27] x86: create pmu group for quick pmu-scope testing Sean Christopherson
                   ` (20 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

The pmu test check_counter_overflow() always fails with 32-bit binaries.
The cnt.count obtained from the latter run of measure() (based on fixed
counter 0) is not equal to the expected value (based on gp counter 0) and
there is a positive error with a value of 2.

The two extra instructions come from the inline wrmsr() and inline rdmsr()
inside the global_disable() binary code block.  Specifically, for each MSR
access the i386 code needs two assembly mov instructions before rdmsr/wrmsr
(for fixed counter 0, whose bit in the global control MSR is bit 32), but
only one assembly mov is needed for x86_64, and for gp counter 0 on i386.

The sequence of instructions used to count events via the gp and fixed
counters is different.  Thus the fix is quite high level: use the same
counter (with the same instruction sequence) to set the initial value for
that counter.  Fix the expected initial cnt.count for fixed counter 0
overflow based on the same fixed counter 0, instead of always using gp
counter 0.

The difference of 1 for this count enables the interrupt to be generated
immediately after the selected event count has been reached, instead of
waiting for the overflow to propagate through the counter.

Add a helper to measure/compute the overflow preset value.  It provides
a convenient location to document the weird behavior that's necessary to
ensure immediate event delivery.
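
As a rough sketch of the arithmetic (the count below is made up; the real
value depends on the measured loop), the helper boils down to:

	/* e.g. count = 1000050 => preset = -1000049, wrapping to just below 2^64 */
	preset = 1 - count;

so that re-running the same loop rolls the counter over to zero (after the
preset is masked to the counter's width), with the extra '1' ensuring the
overflow interrupt is generated immediately instead of possibly waiting for
the overflow to propagate through the counter.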

Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 x86/pmu.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index 0546eb13..ddbc0cf9 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -288,17 +288,30 @@ static void check_counters_many(void)
 	report(i == n, "all counters");
 }
 
+static uint64_t measure_for_overflow(pmu_counter_t *cnt)
+{
+	__measure(cnt, 0);
+	/*
+	 * To generate overflow, i.e. roll over to '0', the initial count just
+	 * needs to be preset to the negative expected count.  However, as per
+	 * Intel's SDM, the preset count needs to be incremented by 1 to ensure
+	 * the overflow interrupt is generated immediately instead of possibly
+	 * waiting for the overflow to propagate through the counter.
+	 */
+	assert(cnt->count > 1);
+	return 1 - cnt->count;
+}
+
 static void check_counter_overflow(void)
 {
 	int nr_gp_counters = pmu_nr_gp_counters();
-	uint64_t count;
+	uint64_t overflow_preset;
 	int i;
 	pmu_counter_t cnt = {
 		.ctr = gp_counter_base,
 		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel /* instructions */,
 	};
-	__measure(&cnt, 0);
-	count = cnt.count;
+	overflow_preset = measure_for_overflow(&cnt);
 
 	/* clear status before test */
 	wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL, rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
@@ -309,12 +322,13 @@ static void check_counter_overflow(void)
 		uint64_t status;
 		int idx;
 
-		cnt.count = 1 - count;
+		cnt.count = overflow_preset;
 		if (gp_counter_base == MSR_IA32_PMC0)
 			cnt.count &= (1ull << pmu_gp_counter_width()) - 1;
 
 		if (i == nr_gp_counters) {
 			cnt.ctr = fixed_events[0].unit_sel;
+			cnt.count = measure_for_overflow(&cnt);
 			cnt.count &= (1ull << pmu_fixed_counter_width()) - 1;
 		}
 
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 09/27] x86: create pmu group for quick pmu-scope testing
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (7 preceding siblings ...)
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 08/27] x86/pmu: Reset the expected count of the fixed counter 0 when i386 Sean Christopherson
@ 2022-11-02 22:50 ` Sean Christopherson
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 10/27] x86/pmu: Refine info to clarify the current support Sean Christopherson
                   ` (19 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

Any agent can run "./run_tests.sh -g pmu" to run all PMU tests easily,
e.g. when verifying the x86/PMU KVM changes.

Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 x86/unittests.cfg | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/x86/unittests.cfg b/x86/unittests.cfg
index ed651850..07d05070 100644
--- a/x86/unittests.cfg
+++ b/x86/unittests.cfg
@@ -189,6 +189,7 @@ file = pmu.flat
 extra_params = -cpu max
 check = /proc/sys/kernel/nmi_watchdog=0
 accel = kvm
+groups = pmu
 
 [pmu_lbr]
 arch = x86_64
@@ -197,6 +198,7 @@ extra_params = -cpu host,migratable=no
 check = /sys/module/kvm/parameters/ignore_msrs=N
 check = /proc/sys/kernel/nmi_watchdog=0
 accel = kvm
+groups = pmu
 
 [vmware_backdoors]
 file = vmware_backdoors.flat
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 10/27] x86/pmu: Refine info to clarify the current support
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (8 preceding siblings ...)
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 09/27] x86: create pmu group for quick pmu-scope testing Sean Christopherson
@ 2022-11-02 22:50 ` Sean Christopherson
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 11/27] x86/pmu: Update rdpmc testcase to cover #GP path Sean Christopherson
                   ` (18 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

Existing unit tests do not cover the AMD PMU, nor Intel PMUs that are not
architectural (i.e. on some obsolete CPUs).  AMD PMU support will be added
in subsequent commits.

Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 x86/pmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index ddbc0cf9..5fa6a952 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -658,7 +658,7 @@ int main(int ac, char **av)
 	buf = malloc(N*64);
 
 	if (!pmu_version()) {
-		report_skip("No pmu is detected!");
+		report_skip("No Intel Arch PMU is detected!");
 		return report_summary();
 	}
 
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 11/27] x86/pmu: Update rdpmc testcase to cover #GP path
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (9 preceding siblings ...)
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 10/27] x86/pmu: Refine info to clarify the current support Sean Christopherson
@ 2022-11-02 22:50 ` Sean Christopherson
  2022-11-24 11:33   ` Thomas Huth
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 12/27] x86/pmu: Rename PC_VECTOR to PMI_VECTOR for better readability Sean Christopherson
                   ` (17 subsequent siblings)
  28 siblings, 1 reply; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

Specifying an unsupported PMC encoding will cause a #GP(0).

There are multiple reasons RDPMC can #GP; the one that is being relied
on to guarantee #GP is specifically that the PMC is invalid.  The most
extensible solution is to provide a safe variant.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 lib/x86/processor.h | 21 ++++++++++++++++++---
 x86/pmu.c           | 10 ++++++++++
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/lib/x86/processor.h b/lib/x86/processor.h
index f85abe36..ba14c7a0 100644
--- a/lib/x86/processor.h
+++ b/lib/x86/processor.h
@@ -438,11 +438,26 @@ static inline int wrmsr_safe(u32 index, u64 val)
 	return exception_vector();
 }
 
+static inline int rdpmc_safe(u32 index, uint64_t *val)
+{
+	uint32_t a, d;
+
+	asm volatile (ASM_TRY("1f")
+		      "rdpmc\n\t"
+		      "1:"
+		      : "=a"(a), "=d"(d) : "c"(index) : "memory");
+	*val = (uint64_t)a | ((uint64_t)d << 32);
+	return exception_vector();
+}
+
 static inline uint64_t rdpmc(uint32_t index)
 {
-	uint32_t a, d;
-	asm volatile ("rdpmc" : "=a"(a), "=d"(d) : "c"(index));
-	return a | ((uint64_t)d << 32);
+	uint64_t val;
+	int vector = rdpmc_safe(index, &val);
+
+	assert_msg(!vector, "Unexpected %s on RDPMC(%d)",
+		   exception_mnemonic(vector), index);
+	return val;
 }
 
 static inline int write_cr0_safe(ulong val)
diff --git a/x86/pmu.c b/x86/pmu.c
index 5fa6a952..03061388 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -651,12 +651,22 @@ static void set_ref_cycle_expectations(void)
 	gp_events[2].max = (gp_events[2].max * cnt.count) / tsc_delta;
 }
 
+static void check_invalid_rdpmc_gp(void)
+{
+	uint64_t val;
+
+	report(rdpmc_safe(64, &val) == GP_VECTOR,
+	       "Expected #GP on RDPMC(64)");
+}
+
 int main(int ac, char **av)
 {
 	setup_vm();
 	handle_irq(PC_VECTOR, cnt_overflow);
 	buf = malloc(N*64);
 
+	check_invalid_rdpmc_gp();
+
 	if (!pmu_version()) {
 		report_skip("No Intel Arch PMU is detected!");
 		return report_summary();
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 12/27] x86/pmu: Rename PC_VECTOR to PMI_VECTOR for better readability
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (10 preceding siblings ...)
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 11/27] x86/pmu: Update rdpmc testcase to cover #GP path Sean Christopherson
@ 2022-11-02 22:50 ` Sean Christopherson
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 13/27] x86/pmu: Add lib/x86/pmu.[c.h] and move common code to header files Sean Christopherson
                   ` (16 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

The original name "PC_VECTOR" comes from the LVT Performance
Counter Register. Rename it to PMI_VECTOR. That's much more familiar
for KVM developers and it's still correct, e.g. it's the PMI vector
that's programmed into the LVT PC register.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 x86/pmu.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index 03061388..b5828a14 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -11,7 +11,9 @@
 #include <stdint.h>
 
 #define FIXED_CNT_INDEX 32
-#define PC_VECTOR	32
+
+/* Performance Counter Vector for the LVT PC Register */
+#define PMI_VECTOR	32
 
 #define EVNSEL_EVENT_SHIFT	0
 #define EVNTSEL_UMASK_SHIFT	8
@@ -159,7 +161,7 @@ static void __start_event(pmu_counter_t *evt, uint64_t count)
 	    wrmsr(MSR_CORE_PERF_FIXED_CTR_CTRL, ctrl);
     }
     global_enable(evt);
-    apic_write(APIC_LVTPC, PC_VECTOR);
+    apic_write(APIC_LVTPC, PMI_VECTOR);
 }
 
 static void start_event(pmu_counter_t *evt)
@@ -662,7 +664,7 @@ static void check_invalid_rdpmc_gp(void)
 int main(int ac, char **av)
 {
 	setup_vm();
-	handle_irq(PC_VECTOR, cnt_overflow);
+	handle_irq(PMI_VECTOR, cnt_overflow);
 	buf = malloc(N*64);
 
 	check_invalid_rdpmc_gp();
@@ -686,7 +688,7 @@ int main(int ac, char **av)
 	printf("Fixed counters:      %d\n", pmu_nr_fixed_counters());
 	printf("Fixed counter width: %d\n", pmu_fixed_counter_width());
 
-	apic_write(APIC_LVTPC, PC_VECTOR);
+	apic_write(APIC_LVTPC, PMI_VECTOR);
 
 	check_counters();
 
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 13/27] x86/pmu: Add lib/x86/pmu.[c.h] and move common code to header files
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (11 preceding siblings ...)
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 12/27] x86/pmu: Rename PC_VECTOR to PMI_VECTOR for better readability Sean Christopherson
@ 2022-11-02 22:50 ` Sean Christopherson
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 14/27] x86: Add a helper for the BSP's final init sequence common to all flavors Sean Christopherson
                   ` (15 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

Given all the PMU stuff coming in, we need e.g. lib/x86/pmu.h to hold all
of the hardware-defined stuff, e.g. #defines, accessors, helpers and structs
that are dictated by hardware.  This will greatly help with code reuse and
reduce unnecessary VM-exits.

Opportunistically move the LBR MSR definitions to msr.h.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 lib/x86/msr.h       |   7 ++++
 lib/x86/pmu.c       |   1 +
 lib/x86/pmu.h       | 100 ++++++++++++++++++++++++++++++++++++++++++++
 lib/x86/processor.h |  64 ----------------------------
 x86/Makefile.common |   1 +
 x86/pmu.c           |  25 +----------
 x86/pmu_lbr.c       |  11 +----
 x86/vmx_tests.c     |   1 +
 8 files changed, 112 insertions(+), 98 deletions(-)
 create mode 100644 lib/x86/pmu.c
 create mode 100644 lib/x86/pmu.h

diff --git a/lib/x86/msr.h b/lib/x86/msr.h
index fa1c0c81..bbe29fd9 100644
--- a/lib/x86/msr.h
+++ b/lib/x86/msr.h
@@ -86,6 +86,13 @@
 #define DEBUGCTLMSR_BTS_OFF_USR		(1UL << 10)
 #define DEBUGCTLMSR_FREEZE_LBRS_ON_PMI	(1UL << 11)
 
+#define MSR_LBR_NHM_FROM	0x00000680
+#define MSR_LBR_NHM_TO		0x000006c0
+#define MSR_LBR_CORE_FROM	0x00000040
+#define MSR_LBR_CORE_TO	0x00000060
+#define MSR_LBR_TOS		0x000001c9
+#define MSR_LBR_SELECT		0x000001c8
+
 #define MSR_IA32_MC0_CTL		0x00000400
 #define MSR_IA32_MC0_STATUS		0x00000401
 #define MSR_IA32_MC0_ADDR		0x00000402
diff --git a/lib/x86/pmu.c b/lib/x86/pmu.c
new file mode 100644
index 00000000..9d048abc
--- /dev/null
+++ b/lib/x86/pmu.c
@@ -0,0 +1 @@
+#include "pmu.h"
diff --git a/lib/x86/pmu.h b/lib/x86/pmu.h
new file mode 100644
index 00000000..078a9747
--- /dev/null
+++ b/lib/x86/pmu.h
@@ -0,0 +1,100 @@
+#ifndef _X86_PMU_H_
+#define _X86_PMU_H_
+
+#include "processor.h"
+#include "libcflat.h"
+
+#define FIXED_CNT_INDEX 32
+#define MAX_NUM_LBR_ENTRY	  32
+
+/* Performance Counter Vector for the LVT PC Register */
+#define PMI_VECTOR	32
+
+#define DEBUGCTLMSR_LBR	  (1UL <<  0)
+
+#define PMU_CAP_LBR_FMT	  0x3f
+#define PMU_CAP_FW_WRITES	(1ULL << 13)
+
+#define EVNSEL_EVENT_SHIFT	0
+#define EVNTSEL_UMASK_SHIFT	8
+#define EVNTSEL_USR_SHIFT	16
+#define EVNTSEL_OS_SHIFT	17
+#define EVNTSEL_EDGE_SHIFT	18
+#define EVNTSEL_PC_SHIFT	19
+#define EVNTSEL_INT_SHIFT	20
+#define EVNTSEL_EN_SHIF		22
+#define EVNTSEL_INV_SHIF	23
+#define EVNTSEL_CMASK_SHIFT	24
+
+#define EVNTSEL_EN	(1 << EVNTSEL_EN_SHIF)
+#define EVNTSEL_USR	(1 << EVNTSEL_USR_SHIFT)
+#define EVNTSEL_OS	(1 << EVNTSEL_OS_SHIFT)
+#define EVNTSEL_PC	(1 << EVNTSEL_PC_SHIFT)
+#define EVNTSEL_INT	(1 << EVNTSEL_INT_SHIFT)
+#define EVNTSEL_INV	(1 << EVNTSEL_INV_SHIF)
+
+static inline u8 pmu_version(void)
+{
+	return cpuid(10).a & 0xff;
+}
+
+static inline bool this_cpu_has_pmu(void)
+{
+	return !!pmu_version();
+}
+
+static inline bool this_cpu_has_perf_global_ctrl(void)
+{
+	return pmu_version() > 1;
+}
+
+static inline u8 pmu_nr_gp_counters(void)
+{
+	return (cpuid(10).a >> 8) & 0xff;
+}
+
+static inline u8 pmu_gp_counter_width(void)
+{
+	return (cpuid(10).a >> 16) & 0xff;
+}
+
+static inline u8 pmu_gp_counter_mask_length(void)
+{
+	return (cpuid(10).a >> 24) & 0xff;
+}
+
+static inline u8 pmu_nr_fixed_counters(void)
+{
+	struct cpuid id = cpuid(10);
+
+	if ((id.a & 0xff) > 1)
+		return id.d & 0x1f;
+	else
+		return 0;
+}
+
+static inline u8 pmu_fixed_counter_width(void)
+{
+	struct cpuid id = cpuid(10);
+
+	if ((id.a & 0xff) > 1)
+		return (id.d >> 5) & 0xff;
+	else
+		return 0;
+}
+
+static inline bool pmu_gp_counter_is_available(int i)
+{
+	/* CPUID.0xA.EBX bit is '1 if they counter is NOT available. */
+	return !(cpuid(10).b & BIT(i));
+}
+
+static inline u64 this_cpu_perf_capabilities(void)
+{
+	if (!this_cpu_has(X86_FEATURE_PDCM))
+		return 0;
+
+	return rdmsr(MSR_IA32_PERF_CAPABILITIES);
+}
+
+#endif /* _X86_PMU_H_ */
diff --git a/lib/x86/processor.h b/lib/x86/processor.h
index ba14c7a0..c0716663 100644
--- a/lib/x86/processor.h
+++ b/lib/x86/processor.h
@@ -806,68 +806,4 @@ static inline void flush_tlb(void)
 	write_cr4(cr4);
 }
 
-static inline u8 pmu_version(void)
-{
-	return cpuid(10).a & 0xff;
-}
-
-static inline bool this_cpu_has_pmu(void)
-{
-	return !!pmu_version();
-}
-
-static inline bool this_cpu_has_perf_global_ctrl(void)
-{
-	return pmu_version() > 1;
-}
-
-static inline u8 pmu_nr_gp_counters(void)
-{
-	return (cpuid(10).a >> 8) & 0xff;
-}
-
-static inline u8 pmu_gp_counter_width(void)
-{
-	return (cpuid(10).a >> 16) & 0xff;
-}
-
-static inline u8 pmu_gp_counter_mask_length(void)
-{
-	return (cpuid(10).a >> 24) & 0xff;
-}
-
-static inline u8 pmu_nr_fixed_counters(void)
-{
-	struct cpuid id = cpuid(10);
-
-	if ((id.a & 0xff) > 1)
-		return id.d & 0x1f;
-	else
-		return 0;
-}
-
-static inline u8 pmu_fixed_counter_width(void)
-{
-	struct cpuid id = cpuid(10);
-
-	if ((id.a & 0xff) > 1)
-		return (id.d >> 5) & 0xff;
-	else
-		return 0;
-}
-
-static inline bool pmu_gp_counter_is_available(int i)
-{
-	/* CPUID.0xA.EBX bit is '1 if they counter is NOT available. */
-	return !(cpuid(10).b & BIT(i));
-}
-
-static inline u64 this_cpu_perf_capabilities(void)
-{
-	if (!this_cpu_has(X86_FEATURE_PDCM))
-		return 0;
-
-	return rdmsr(MSR_IA32_PERF_CAPABILITIES);
-}
-
 #endif
diff --git a/x86/Makefile.common b/x86/Makefile.common
index b7010e2f..8cbdd2a9 100644
--- a/x86/Makefile.common
+++ b/x86/Makefile.common
@@ -22,6 +22,7 @@ cflatobjs += lib/x86/acpi.o
 cflatobjs += lib/x86/stack.o
 cflatobjs += lib/x86/fault_test.o
 cflatobjs += lib/x86/delay.o
+cflatobjs += lib/x86/pmu.o
 ifeq ($(CONFIG_EFI),y)
 cflatobjs += lib/x86/amd_sev.o
 cflatobjs += lib/efi.o
diff --git a/x86/pmu.c b/x86/pmu.c
index b5828a14..7d67746e 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -1,6 +1,7 @@
 
 #include "x86/msr.h"
 #include "x86/processor.h"
+#include "x86/pmu.h"
 #include "x86/apic-defs.h"
 #include "x86/apic.h"
 #include "x86/desc.h"
@@ -10,29 +11,6 @@
 #include "libcflat.h"
 #include <stdint.h>
 
-#define FIXED_CNT_INDEX 32
-
-/* Performance Counter Vector for the LVT PC Register */
-#define PMI_VECTOR	32
-
-#define EVNSEL_EVENT_SHIFT	0
-#define EVNTSEL_UMASK_SHIFT	8
-#define EVNTSEL_USR_SHIFT	16
-#define EVNTSEL_OS_SHIFT	17
-#define EVNTSEL_EDGE_SHIFT	18
-#define EVNTSEL_PC_SHIFT	19
-#define EVNTSEL_INT_SHIFT	20
-#define EVNTSEL_EN_SHIF		22
-#define EVNTSEL_INV_SHIF	23
-#define EVNTSEL_CMASK_SHIFT	24
-
-#define EVNTSEL_EN	(1 << EVNTSEL_EN_SHIF)
-#define EVNTSEL_USR	(1 << EVNTSEL_USR_SHIFT)
-#define EVNTSEL_OS	(1 << EVNTSEL_OS_SHIFT)
-#define EVNTSEL_PC	(1 << EVNTSEL_PC_SHIFT)
-#define EVNTSEL_INT	(1 << EVNTSEL_INT_SHIFT)
-#define EVNTSEL_INV	(1 << EVNTSEL_INV_SHIF)
-
 #define N 1000000
 
 // These values match the number of instructions and branches in the
@@ -66,7 +44,6 @@ struct pmu_event {
 	{"fixed 3", MSR_CORE_PERF_FIXED_CTR0 + 2, 0.1*N, 30*N}
 };
 
-#define PMU_CAP_FW_WRITES	(1ULL << 13)
 static u64 gp_counter_base = MSR_IA32_PERFCTR0;
 
 char *buf;
diff --git a/x86/pmu_lbr.c b/x86/pmu_lbr.c
index a641d793..e6d98236 100644
--- a/x86/pmu_lbr.c
+++ b/x86/pmu_lbr.c
@@ -1,18 +1,9 @@
 #include "x86/msr.h"
 #include "x86/processor.h"
+#include "x86/pmu.h"
 #include "x86/desc.h"
 
 #define N 1000000
-#define MAX_NUM_LBR_ENTRY	  32
-#define DEBUGCTLMSR_LBR	  (1UL <<  0)
-#define PMU_CAP_LBR_FMT	  0x3f
-
-#define MSR_LBR_NHM_FROM	0x00000680
-#define MSR_LBR_NHM_TO		0x000006c0
-#define MSR_LBR_CORE_FROM	0x00000040
-#define MSR_LBR_CORE_TO	0x00000060
-#define MSR_LBR_TOS		0x000001c9
-#define MSR_LBR_SELECT		0x000001c8
 
 volatile int count;
 u32 lbr_from, lbr_to;
diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index aa2ecbbc..fd36e436 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -9,6 +9,7 @@
 #include "vmx.h"
 #include "msr.h"
 #include "processor.h"
+#include "pmu.h"
 #include "vm.h"
 #include "pci.h"
 #include "fwcfg.h"
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 14/27] x86: Add a helper for the BSP's final init sequence common to all flavors
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (12 preceding siblings ...)
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 13/27] x86/pmu: Add lib/x86/pmu.[c.h] and move common code to header files Sean Christopherson
@ 2022-11-02 22:50 ` Sean Christopherson
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 15/27] x86/pmu: Snapshot PMU perf_capabilities during BSP initialization Sean Christopherson
                   ` (14 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

Add bsp_rest_init() to dedup bringing up APs and doing SMP initialization
across 32-bit, 64-bit, and EFI flavors of KVM-unit-tests.  The common
bucket will also be used in future patches to init things that aren't
SMP related and thus don't fit in smp_init(), e.g. PMU setup.
No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 lib/x86/asm/setup.h |  1 +
 lib/x86/setup.c     | 11 ++++++++---
 x86/cstart.S        |  4 +---
 x86/cstart64.S      |  4 +---
 4 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/lib/x86/asm/setup.h b/lib/x86/asm/setup.h
index 8502e7d9..1f384274 100644
--- a/lib/x86/asm/setup.h
+++ b/lib/x86/asm/setup.h
@@ -17,6 +17,7 @@ void setup_5level_page_table(void);
 #endif /* CONFIG_EFI */
 
 void save_id(void);
+void bsp_rest_init(void);
 void ap_start64(void);
 
 #endif /* _X86_ASM_SETUP_H_ */
diff --git a/lib/x86/setup.c b/lib/x86/setup.c
index 7df0256e..a7b3edbe 100644
--- a/lib/x86/setup.c
+++ b/lib/x86/setup.c
@@ -356,9 +356,7 @@ efi_status_t setup_efi(efi_bootinfo_t *efi_bootinfo)
 	setup_page_table();
 	enable_apic();
 	save_id();
-	bringup_aps();
-	enable_x2apic();
-	smp_init();
+	bsp_rest_init();
 
 	return EFI_SUCCESS;
 }
@@ -394,3 +392,10 @@ void ap_start64(void)
 	enable_x2apic();
 	ap_online();
 }
+
+void bsp_rest_init(void)
+{
+	bringup_aps();
+	enable_x2apic();
+	smp_init();
+}
diff --git a/x86/cstart.S b/x86/cstart.S
index e82bed7b..ceee58f9 100644
--- a/x86/cstart.S
+++ b/x86/cstart.S
@@ -112,9 +112,7 @@ start32:
 	call save_id
 	call mask_pic_interrupts
 	call enable_apic
-	call bringup_aps
-	call enable_x2apic
-	call smp_init
+	call bsp_rest_init
         push $__environ
         push $__argv
         push __argc
diff --git a/x86/cstart64.S b/x86/cstart64.S
index 570ed2ed..4dff1102 100644
--- a/x86/cstart64.S
+++ b/x86/cstart64.S
@@ -118,9 +118,7 @@ start64:
 	mov %rax, __args(%rip)
 	call __setup_args
 
-	call bringup_aps
-	call enable_x2apic
-	call smp_init
+	call bsp_rest_init
 
 	mov __argc(%rip), %edi
 	lea __argv(%rip), %rsi
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 15/27] x86/pmu: Snapshot PMU perf_capabilities during BSP initialization
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (13 preceding siblings ...)
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 14/27] x86: Add a helper for the BSP's final init sequence common to all flavors Sean Christopherson
@ 2022-11-02 22:50 ` Sean Christopherson
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 16/27] x86/pmu: Snapshot CPUID.0xA PMU capabilities " Sean Christopherson
                   ` (13 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

Add a global "struct pmu_caps pmu" to snapshot PMU capabilities
during the final stages of BSP initialization.  Use the new hooks to
snapshot PERF_CAPABILITIES instead of re-reading the MSR every time a
test wants to query capabilities.  A software-defined struct will also
simplify extending support to AMD CPUs, as many of the differences
between AMD and Intel can be handled during pmu_init().

Init the PMU caps for all tests so that tests don't need to remember to
call pmu_init() before using any of the PMU helpers, e.g. the nVMX test
uses this_cpu_has_pmu(), which will be converted to rely on the global
struct in a future patch.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
[sean: reword changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 lib/x86/pmu.c   |  8 ++++++++
 lib/x86/pmu.h   | 21 ++++++++++++++++++---
 lib/x86/setup.c |  2 ++
 x86/pmu.c       |  2 +-
 x86/pmu_lbr.c   |  7 ++-----
 5 files changed, 31 insertions(+), 9 deletions(-)

diff --git a/lib/x86/pmu.c b/lib/x86/pmu.c
index 9d048abc..bb272ab7 100644
--- a/lib/x86/pmu.c
+++ b/lib/x86/pmu.c
@@ -1 +1,9 @@
 #include "pmu.h"
+
+struct pmu_caps pmu;
+
+void pmu_init(void)
+{
+	if (this_cpu_has(X86_FEATURE_PDCM))
+		pmu.perf_cap = rdmsr(MSR_IA32_PERF_CAPABILITIES);
+}
diff --git a/lib/x86/pmu.h b/lib/x86/pmu.h
index 078a9747..4780237c 100644
--- a/lib/x86/pmu.h
+++ b/lib/x86/pmu.h
@@ -33,6 +33,14 @@
 #define EVNTSEL_INT	(1 << EVNTSEL_INT_SHIFT)
 #define EVNTSEL_INV	(1 << EVNTSEL_INV_SHIF)
 
+struct pmu_caps {
+	u64 perf_cap;
+};
+
+extern struct pmu_caps pmu;
+
+void pmu_init(void);
+
 static inline u8 pmu_version(void)
 {
 	return cpuid(10).a & 0xff;
@@ -91,10 +99,17 @@ static inline bool pmu_gp_counter_is_available(int i)
 
 static inline u64 this_cpu_perf_capabilities(void)
 {
-	if (!this_cpu_has(X86_FEATURE_PDCM))
-		return 0;
+	return pmu.perf_cap;
+}
 
-	return rdmsr(MSR_IA32_PERF_CAPABILITIES);
+static inline u64 pmu_lbr_version(void)
+{
+	return this_cpu_perf_capabilities() & PMU_CAP_LBR_FMT;
+}
+
+static inline bool pmu_has_full_writes(void)
+{
+	return this_cpu_perf_capabilities() & PMU_CAP_FW_WRITES;
 }
 
 #endif /* _X86_PMU_H_ */
diff --git a/lib/x86/setup.c b/lib/x86/setup.c
index a7b3edbe..1ebbf58a 100644
--- a/lib/x86/setup.c
+++ b/lib/x86/setup.c
@@ -15,6 +15,7 @@
 #include "apic-defs.h"
 #include "asm/setup.h"
 #include "atomic.h"
+#include "pmu.h"
 #include "processor.h"
 #include "smp.h"
 
@@ -398,4 +399,5 @@ void bsp_rest_init(void)
 	bringup_aps();
 	enable_x2apic();
 	smp_init();
+	pmu_init();
 }
diff --git a/x86/pmu.c b/x86/pmu.c
index 7d67746e..627fd394 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -669,7 +669,7 @@ int main(int ac, char **av)
 
 	check_counters();
 
-	if (this_cpu_perf_capabilities() & PMU_CAP_FW_WRITES) {
+	if (pmu_has_full_writes()) {
 		gp_counter_base = MSR_IA32_PMC0;
 		report_prefix_push("full-width writes");
 		check_counters();
diff --git a/x86/pmu_lbr.c b/x86/pmu_lbr.c
index e6d98236..d0135520 100644
--- a/x86/pmu_lbr.c
+++ b/x86/pmu_lbr.c
@@ -43,7 +43,6 @@ static bool test_init_lbr_from_exception(u64 index)
 
 int main(int ac, char **av)
 {
-	u64 perf_cap;
 	int max, i;
 
 	setup_vm();
@@ -63,15 +62,13 @@ int main(int ac, char **av)
 		return report_summary();
 	}
 
-	perf_cap = this_cpu_perf_capabilities();
-
-	if (!(perf_cap & PMU_CAP_LBR_FMT)) {
+	if (!pmu_lbr_version()) {
 		report_skip("(Architectural) LBR is not supported.");
 		return report_summary();
 	}
 
 	printf("PMU version:		 %d\n", pmu_version());
-	printf("LBR version:		 %ld\n", perf_cap & PMU_CAP_LBR_FMT);
+	printf("LBR version:		 %ld\n", pmu_lbr_version());
 
 	/* Look for LBR from and to MSRs */
 	lbr_from = MSR_LBR_CORE_FROM;
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 16/27] x86/pmu: Snapshot CPUID.0xA PMU capabilities during BSP initialization
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (14 preceding siblings ...)
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 15/27] x86/pmu: Snapshot PMU perf_capabilities during BSP initialization Sean Christopherson
@ 2022-11-02 22:50 ` Sean Christopherson
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 17/27] x86/pmu: Drop wrappers that just passthrough pmu_caps fields Sean Christopherson
                   ` (12 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

Snapshot PMU info from CPUID.0xA into "struct pmu_caps pmu" during
pmu_init() instead of reading CPUID.0xA every time a test wants to query
PMU capabilities.  Using pmu_caps to track various properties will also
make it easier to hide the differences between AMD and Intel PMUs.
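
For reference, the CPUID.0xA fields consumed by pmu_init() below (this is
just a summary of the decoding in the diff, not new behavior):

	EAX[7:0]   - PMU version
	EAX[15:8]  - number of GP counters
	EAX[23:16] - GP counter bit width
	EAX[31:24] - length of the GP counter availability mask
	EBX        - bit i set means counter/event i is NOT available
	EDX[4:0]   - number of fixed counters (version > 1)
	EDX[12:5]  - fixed counter bit width (version > 1)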

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 lib/x86/pmu.c | 16 ++++++++++++++++
 lib/x86/pmu.h | 32 ++++++++++++++------------------
 2 files changed, 30 insertions(+), 18 deletions(-)

diff --git a/lib/x86/pmu.c b/lib/x86/pmu.c
index bb272ab7..9c1034aa 100644
--- a/lib/x86/pmu.c
+++ b/lib/x86/pmu.c
@@ -4,6 +4,22 @@ struct pmu_caps pmu;
 
 void pmu_init(void)
 {
+	struct cpuid cpuid_10 = cpuid(10);
+
+	pmu.version = cpuid_10.a & 0xff;
+
+	if (pmu.version > 1) {
+		pmu.nr_fixed_counters = cpuid_10.d & 0x1f;
+		pmu.fixed_counter_width = (cpuid_10.d >> 5) & 0xff;
+	}
+
+	pmu.nr_gp_counters = (cpuid_10.a >> 8) & 0xff;
+	pmu.gp_counter_width = (cpuid_10.a >> 16) & 0xff;
+	pmu.gp_counter_mask_length = (cpuid_10.a >> 24) & 0xff;
+
+	/* CPUID.0xA.EBX bit is '1' if a counter is NOT available. */
+	pmu.gp_counter_available = ~cpuid_10.b;
+
 	if (this_cpu_has(X86_FEATURE_PDCM))
 		pmu.perf_cap = rdmsr(MSR_IA32_PERF_CAPABILITIES);
 }
diff --git a/lib/x86/pmu.h b/lib/x86/pmu.h
index 4780237c..c7e9d3ae 100644
--- a/lib/x86/pmu.h
+++ b/lib/x86/pmu.h
@@ -34,6 +34,13 @@
 #define EVNTSEL_INV	(1 << EVNTSEL_INV_SHIF)
 
 struct pmu_caps {
+	u8 version;
+	u8 nr_fixed_counters;
+	u8 fixed_counter_width;
+	u8 nr_gp_counters;
+	u8 gp_counter_width;
+	u8 gp_counter_mask_length;
+	u32 gp_counter_available;
 	u64 perf_cap;
 };
 
@@ -43,7 +50,7 @@ void pmu_init(void);
 
 static inline u8 pmu_version(void)
 {
-	return cpuid(10).a & 0xff;
+	return pmu.version;
 }
 
 static inline bool this_cpu_has_pmu(void)
@@ -58,43 +65,32 @@ static inline bool this_cpu_has_perf_global_ctrl(void)
 
 static inline u8 pmu_nr_gp_counters(void)
 {
-	return (cpuid(10).a >> 8) & 0xff;
+	return pmu.nr_gp_counters;
 }
 
 static inline u8 pmu_gp_counter_width(void)
 {
-	return (cpuid(10).a >> 16) & 0xff;
+	return pmu.gp_counter_width;
 }
 
 static inline u8 pmu_gp_counter_mask_length(void)
 {
-	return (cpuid(10).a >> 24) & 0xff;
+	return pmu.gp_counter_mask_length;
 }
 
 static inline u8 pmu_nr_fixed_counters(void)
 {
-	struct cpuid id = cpuid(10);
-
-	if ((id.a & 0xff) > 1)
-		return id.d & 0x1f;
-	else
-		return 0;
+	return pmu.nr_fixed_counters;
 }
 
 static inline u8 pmu_fixed_counter_width(void)
 {
-	struct cpuid id = cpuid(10);
-
-	if ((id.a & 0xff) > 1)
-		return (id.d >> 5) & 0xff;
-	else
-		return 0;
+	return pmu.fixed_counter_width;
 }
 
 static inline bool pmu_gp_counter_is_available(int i)
 {
-	/* CPUID.0xA.EBX bit is '1 if they counter is NOT available. */
-	return !(cpuid(10).b & BIT(i));
+	return pmu.gp_counter_available & BIT(i);
 }
 
 static inline u64 this_cpu_perf_capabilities(void)
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 17/27] x86/pmu: Drop wrappers that just passthrough pmu_caps fields
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (15 preceding siblings ...)
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 16/27] x86/pmu: Snapshot CPUID.0xA PMU capabilities " Sean Christopherson
@ 2022-11-02 22:51 ` Sean Christopherson
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 18/27] x86/pmu: Track GP counter and event select base MSRs in pmu_caps Sean Christopherson
                   ` (11 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:51 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

Drop wrappers that are and always will be pure passthroughs of pmu_caps
fields, e.g. the number of fixed/general-purpose counters can always be
determined during PMU initialization and doesn't need runtime logic.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 lib/x86/pmu.h | 43 ++++--------------------------------
 x86/pmu.c     | 60 +++++++++++++++++++++------------------------------
 x86/pmu_lbr.c |  2 +-
 3 files changed, 30 insertions(+), 75 deletions(-)

diff --git a/lib/x86/pmu.h b/lib/x86/pmu.h
index c7e9d3ae..f6abe1a6 100644
--- a/lib/x86/pmu.h
+++ b/lib/x86/pmu.h
@@ -48,44 +48,14 @@ extern struct pmu_caps pmu;
 
 void pmu_init(void);
 
-static inline u8 pmu_version(void)
-{
-	return pmu.version;
-}
-
 static inline bool this_cpu_has_pmu(void)
 {
-	return !!pmu_version();
+	return !!pmu.version;
 }
 
 static inline bool this_cpu_has_perf_global_ctrl(void)
 {
-	return pmu_version() > 1;
-}
-
-static inline u8 pmu_nr_gp_counters(void)
-{
-	return pmu.nr_gp_counters;
-}
-
-static inline u8 pmu_gp_counter_width(void)
-{
-	return pmu.gp_counter_width;
-}
-
-static inline u8 pmu_gp_counter_mask_length(void)
-{
-	return pmu.gp_counter_mask_length;
-}
-
-static inline u8 pmu_nr_fixed_counters(void)
-{
-	return pmu.nr_fixed_counters;
-}
-
-static inline u8 pmu_fixed_counter_width(void)
-{
-	return pmu.fixed_counter_width;
+	return pmu.version > 1;
 }
 
 static inline bool pmu_gp_counter_is_available(int i)
@@ -93,19 +63,14 @@ static inline bool pmu_gp_counter_is_available(int i)
 	return pmu.gp_counter_available & BIT(i);
 }
 
-static inline u64 this_cpu_perf_capabilities(void)
-{
-	return pmu.perf_cap;
-}
-
 static inline u64 pmu_lbr_version(void)
 {
-	return this_cpu_perf_capabilities() & PMU_CAP_LBR_FMT;
+	return pmu.perf_cap & PMU_CAP_LBR_FMT;
 }
 
 static inline bool pmu_has_full_writes(void)
 {
-	return this_cpu_perf_capabilities() & PMU_CAP_FW_WRITES;
+	return pmu.perf_cap & PMU_CAP_FW_WRITES;
 }
 
 #endif /* _X86_PMU_H_ */
diff --git a/x86/pmu.c b/x86/pmu.c
index 627fd394..d13291fe 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -196,14 +196,13 @@ static bool verify_counter(pmu_counter_t *cnt)
 
 static void check_gp_counter(struct pmu_event *evt)
 {
-	int nr_gp_counters = pmu_nr_gp_counters();
 	pmu_counter_t cnt = {
 		.ctr = gp_counter_base,
 		.config = EVNTSEL_OS | EVNTSEL_USR | evt->unit_sel,
 	};
 	int i;
 
-	for (i = 0; i < nr_gp_counters; i++, cnt.ctr++) {
+	for (i = 0; i < pmu.nr_gp_counters; i++, cnt.ctr++) {
 		measure_one(&cnt);
 		report(verify_event(cnt.count, evt), "%s-%d", evt->name, i);
 	}
@@ -223,13 +222,12 @@ static void check_gp_counters(void)
 
 static void check_fixed_counters(void)
 {
-	int nr_fixed_counters = pmu_nr_fixed_counters();
 	pmu_counter_t cnt = {
 		.config = EVNTSEL_OS | EVNTSEL_USR,
 	};
 	int i;
 
-	for (i = 0; i < nr_fixed_counters; i++) {
+	for (i = 0; i < pmu.nr_fixed_counters; i++) {
 		cnt.ctr = fixed_events[i].unit_sel;
 		measure_one(&cnt);
 		report(verify_event(cnt.count, &fixed_events[i]), "fixed-%d", i);
@@ -238,12 +236,10 @@ static void check_fixed_counters(void)
 
 static void check_counters_many(void)
 {
-	int nr_fixed_counters = pmu_nr_fixed_counters();
-	int nr_gp_counters = pmu_nr_gp_counters();
 	pmu_counter_t cnt[10];
 	int i, n;
 
-	for (i = 0, n = 0; n < nr_gp_counters; i++) {
+	for (i = 0, n = 0; n < pmu.nr_gp_counters; i++) {
 		if (!pmu_gp_counter_is_available(i))
 			continue;
 
@@ -252,7 +248,7 @@ static void check_counters_many(void)
 			gp_events[i % ARRAY_SIZE(gp_events)].unit_sel;
 		n++;
 	}
-	for (i = 0; i < nr_fixed_counters; i++) {
+	for (i = 0; i < pmu.nr_fixed_counters; i++) {
 		cnt[n].ctr = fixed_events[i].unit_sel;
 		cnt[n].config = EVNTSEL_OS | EVNTSEL_USR;
 		n++;
@@ -283,7 +279,6 @@ static uint64_t measure_for_overflow(pmu_counter_t *cnt)
 
 static void check_counter_overflow(void)
 {
-	int nr_gp_counters = pmu_nr_gp_counters();
 	uint64_t overflow_preset;
 	int i;
 	pmu_counter_t cnt = {
@@ -297,18 +292,18 @@ static void check_counter_overflow(void)
 
 	report_prefix_push("overflow");
 
-	for (i = 0; i < nr_gp_counters + 1; i++, cnt.ctr++) {
+	for (i = 0; i < pmu.nr_gp_counters + 1; i++, cnt.ctr++) {
 		uint64_t status;
 		int idx;
 
 		cnt.count = overflow_preset;
 		if (gp_counter_base == MSR_IA32_PMC0)
-			cnt.count &= (1ull << pmu_gp_counter_width()) - 1;
+			cnt.count &= (1ull << pmu.gp_counter_width) - 1;
 
-		if (i == nr_gp_counters) {
+		if (i == pmu.nr_gp_counters) {
 			cnt.ctr = fixed_events[0].unit_sel;
 			cnt.count = measure_for_overflow(&cnt);
-			cnt.count &= (1ull << pmu_fixed_counter_width()) - 1;
+			cnt.count &= (1ull << pmu.fixed_counter_width) - 1;
 		}
 
 		if (i % 2)
@@ -354,17 +349,13 @@ static void do_rdpmc_fast(void *ptr)
 
 static void check_rdpmc(void)
 {
-	int fixed_counter_width = pmu_fixed_counter_width();
-	int nr_fixed_counters = pmu_nr_fixed_counters();
-	u8 gp_counter_width = pmu_gp_counter_width();
-	int nr_gp_counters = pmu_nr_gp_counters();
 	uint64_t val = 0xff0123456789ull;
 	bool exc;
 	int i;
 
 	report_prefix_push("rdpmc");
 
-	for (i = 0; i < nr_gp_counters; i++) {
+	for (i = 0; i < pmu.nr_gp_counters; i++) {
 		uint64_t x;
 		pmu_counter_t cnt = {
 			.ctr = gp_counter_base + i,
@@ -381,7 +372,7 @@ static void check_rdpmc(void)
 			x = (uint64_t)(int64_t)val;
 
 		/* Mask according to the number of supported bits */
-		x &= (1ull << gp_counter_width) - 1;
+		x &= (1ull << pmu.gp_counter_width) - 1;
 
 		wrmsr(gp_counter_base + i, val);
 		report(rdpmc(i) == x, "cntr-%d", i);
@@ -392,8 +383,8 @@ static void check_rdpmc(void)
 		else
 			report(cnt.count == (u32)val, "fast-%d", i);
 	}
-	for (i = 0; i < nr_fixed_counters; i++) {
-		uint64_t x = val & ((1ull << fixed_counter_width) - 1);
+	for (i = 0; i < pmu.nr_fixed_counters; i++) {
+		uint64_t x = val & ((1ull << pmu.fixed_counter_width) - 1);
 		pmu_counter_t cnt = {
 			.ctr = MSR_CORE_PERF_FIXED_CTR0 + i,
 			.idx = i
@@ -437,7 +428,7 @@ static void check_running_counter_wrmsr(void)
 
 	count = -1;
 	if (gp_counter_base == MSR_IA32_PMC0)
-		count &= (1ull << pmu_gp_counter_width()) - 1;
+		count &= (1ull << pmu.gp_counter_width) - 1;
 
 	wrmsr(gp_counter_base, count);
 
@@ -541,15 +532,14 @@ static void check_gp_counters_write_width(void)
 {
 	u64 val_64 = 0xffffff0123456789ull;
 	u64 val_32 = val_64 & ((1ull << 32) - 1);
-	u64 val_max_width = val_64 & ((1ull << pmu_gp_counter_width()) - 1);
-	int nr_gp_counters = pmu_nr_gp_counters();
+	u64 val_max_width = val_64 & ((1ull << pmu.gp_counter_width) - 1);
 	int i;
 
 	/*
 	 * MSR_IA32_PERFCTRn supports 64-bit writes,
 	 * but only the lowest 32 bits are valid.
 	 */
-	for (i = 0; i < nr_gp_counters; i++) {
+	for (i = 0; i < pmu.nr_gp_counters; i++) {
 		wrmsr(MSR_IA32_PERFCTR0 + i, val_32);
 		assert(rdmsr(MSR_IA32_PERFCTR0 + i) == val_32);
 		assert(rdmsr(MSR_IA32_PMC0 + i) == val_32);
@@ -567,7 +557,7 @@ static void check_gp_counters_write_width(void)
 	 * MSR_IA32_PMCn supports writing values up to GP counter width,
 	 * and only the lowest bits of GP counter width are valid.
 	 */
-	for (i = 0; i < nr_gp_counters; i++) {
+	for (i = 0; i < pmu.nr_gp_counters; i++) {
 		wrmsr(MSR_IA32_PMC0 + i, val_32);
 		assert(rdmsr(MSR_IA32_PMC0 + i) == val_32);
 		assert(rdmsr(MSR_IA32_PERFCTR0 + i) == val_32);
@@ -597,7 +587,7 @@ static void set_ref_cycle_expectations(void)
 	uint64_t t0, t1, t2, t3;
 
 	/* Bit 2 enumerates the availability of reference cycles events. */
-	if (!pmu_nr_gp_counters() || !pmu_gp_counter_is_available(2))
+	if (!pmu.nr_gp_counters || !pmu_gp_counter_is_available(2))
 		return;
 
 	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
@@ -646,24 +636,24 @@ int main(int ac, char **av)
 
 	check_invalid_rdpmc_gp();
 
-	if (!pmu_version()) {
+	if (!pmu.version) {
 		report_skip("No Intel Arch PMU is detected!");
 		return report_summary();
 	}
 
-	if (pmu_version() == 1) {
+	if (pmu.version == 1) {
 		report_skip("PMU version 1 is not supported.");
 		return report_summary();
 	}
 
 	set_ref_cycle_expectations();
 
-	printf("PMU version:         %d\n", pmu_version());
-	printf("GP counters:         %d\n", pmu_nr_gp_counters());
-	printf("GP counter width:    %d\n", pmu_gp_counter_width());
-	printf("Mask length:         %d\n", pmu_gp_counter_mask_length());
-	printf("Fixed counters:      %d\n", pmu_nr_fixed_counters());
-	printf("Fixed counter width: %d\n", pmu_fixed_counter_width());
+	printf("PMU version:         %d\n", pmu.version);
+	printf("GP counters:         %d\n", pmu.nr_gp_counters);
+	printf("GP counter width:    %d\n", pmu.gp_counter_width);
+	printf("Mask length:         %d\n", pmu.gp_counter_mask_length);
+	printf("Fixed counters:      %d\n", pmu.nr_fixed_counters);
+	printf("Fixed counter width: %d\n", pmu.fixed_counter_width);
 
 	apic_write(APIC_LVTPC, PMI_VECTOR);
 
diff --git a/x86/pmu_lbr.c b/x86/pmu_lbr.c
index d0135520..36c9a8fa 100644
--- a/x86/pmu_lbr.c
+++ b/x86/pmu_lbr.c
@@ -67,7 +67,7 @@ int main(int ac, char **av)
 		return report_summary();
 	}
 
-	printf("PMU version:		 %d\n", pmu_version());
+	printf("PMU version:		 %d\n", pmu.version);
 	printf("LBR version:		 %ld\n", pmu_lbr_version());
 
 	/* Look for LBR from and to MSRs */
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 18/27] x86/pmu: Track GP counter and event select base MSRs in pmu_caps
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (16 preceding siblings ...)
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 17/27] x86/pmu: Drop wrappers that just passthrough pmu_caps fields Sean Christopherson
@ 2022-11-02 22:51 ` Sean Christopherson
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 19/27] x86/pmu: Add helper to get fixed counter MSR index Sean Christopherson
                   ` (10 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:51 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

Snapshot the base MSRs for GP counters and event selects during pmu_init()
so that tests don't need to manually compute the bases.
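
As a rough illustration (not taken verbatim from the diff), a test can
program a GP counter through the new helpers and later retarget the base
once to exercise full-width writes; "i" and "event_sel" below are
placeholders:

	/* Program and enable GP counter 'i' without hard-coding the MSR base. */
	wrmsr(MSR_GP_EVENT_SELECTx(i), EVNTSEL_EN | EVNTSEL_OS | EVNTSEL_USR | event_sel);
	wrmsr(MSR_GP_COUNTERx(i), 0);

	/* Re-run the same code using the full-width counter MSRs. */
	pmu.msr_gp_counter_base = MSR_IA32_PMC0;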

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
[sean: rename helpers to look more like macros, drop wrmsr wrappers]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 lib/x86/pmu.c |  2 ++
 lib/x86/pmu.h | 18 +++++++++++++++
 x86/pmu.c     | 63 ++++++++++++++++++++++++++-------------------------
 3 files changed, 52 insertions(+), 31 deletions(-)

diff --git a/lib/x86/pmu.c b/lib/x86/pmu.c
index 9c1034aa..c73f802a 100644
--- a/lib/x86/pmu.c
+++ b/lib/x86/pmu.c
@@ -22,4 +22,6 @@ void pmu_init(void)
 
 	if (this_cpu_has(X86_FEATURE_PDCM))
 		pmu.perf_cap = rdmsr(MSR_IA32_PERF_CAPABILITIES);
+	pmu.msr_gp_counter_base = MSR_IA32_PERFCTR0;
+	pmu.msr_gp_event_select_base = MSR_P6_EVNTSEL0;
 }
diff --git a/lib/x86/pmu.h b/lib/x86/pmu.h
index f6abe1a6..c98c583c 100644
--- a/lib/x86/pmu.h
+++ b/lib/x86/pmu.h
@@ -41,6 +41,9 @@ struct pmu_caps {
 	u8 gp_counter_width;
 	u8 gp_counter_mask_length;
 	u32 gp_counter_available;
+	u32 msr_gp_counter_base;
+	u32 msr_gp_event_select_base;
+
 	u64 perf_cap;
 };
 
@@ -48,6 +51,16 @@ extern struct pmu_caps pmu;
 
 void pmu_init(void);
 
+static inline u32 MSR_GP_COUNTERx(unsigned int i)
+{
+	return pmu.msr_gp_counter_base + i;
+}
+
+static inline u32 MSR_GP_EVENT_SELECTx(unsigned int i)
+{
+	return pmu.msr_gp_event_select_base + i;
+}
+
 static inline bool this_cpu_has_pmu(void)
 {
 	return !!pmu.version;
@@ -73,4 +86,9 @@ static inline bool pmu_has_full_writes(void)
 	return pmu.perf_cap & PMU_CAP_FW_WRITES;
 }
 
+static inline bool pmu_use_full_writes(void)
+{
+	return pmu.msr_gp_counter_base == MSR_IA32_PMC0;
+}
+
 #endif /* _X86_PMU_H_ */
diff --git a/x86/pmu.c b/x86/pmu.c
index d13291fe..d66786be 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -44,8 +44,6 @@ struct pmu_event {
 	{"fixed 3", MSR_CORE_PERF_FIXED_CTR0 + 2, 0.1*N, 30*N}
 };
 
-static u64 gp_counter_base = MSR_IA32_PERFCTR0;
-
 char *buf;
 
 static inline void loop(void)
@@ -84,7 +82,7 @@ static bool is_gp(pmu_counter_t *evt)
 
 static int event_to_global_idx(pmu_counter_t *cnt)
 {
-	return cnt->ctr - (is_gp(cnt) ? gp_counter_base :
+	return cnt->ctr - (is_gp(cnt) ? pmu.msr_gp_counter_base :
 		(MSR_CORE_PERF_FIXED_CTR0 - FIXED_CNT_INDEX));
 }
 
@@ -120,10 +118,10 @@ static void __start_event(pmu_counter_t *evt, uint64_t count)
 {
     evt->count = count;
     wrmsr(evt->ctr, evt->count);
-    if (is_gp(evt))
-	    wrmsr(MSR_P6_EVNTSEL0 + event_to_global_idx(evt),
-			    evt->config | EVNTSEL_EN);
-    else {
+    if (is_gp(evt)) {
+	    wrmsr(MSR_GP_EVENT_SELECTx(event_to_global_idx(evt)),
+		  evt->config | EVNTSEL_EN);
+    } else {
 	    uint32_t ctrl = rdmsr(MSR_CORE_PERF_FIXED_CTR_CTRL);
 	    int shift = (evt->ctr - MSR_CORE_PERF_FIXED_CTR0) * 4;
 	    uint32_t usrospmi = 0;
@@ -149,10 +147,10 @@ static void start_event(pmu_counter_t *evt)
 static void stop_event(pmu_counter_t *evt)
 {
 	global_disable(evt);
-	if (is_gp(evt))
-		wrmsr(MSR_P6_EVNTSEL0 + event_to_global_idx(evt),
-				evt->config & ~EVNTSEL_EN);
-	else {
+	if (is_gp(evt)) {
+		wrmsr(MSR_GP_EVENT_SELECTx(event_to_global_idx(evt)),
+		      evt->config & ~EVNTSEL_EN);
+	} else {
 		uint32_t ctrl = rdmsr(MSR_CORE_PERF_FIXED_CTR_CTRL);
 		int shift = (evt->ctr - MSR_CORE_PERF_FIXED_CTR0) * 4;
 		wrmsr(MSR_CORE_PERF_FIXED_CTR_CTRL, ctrl & ~(0xf << shift));
@@ -197,12 +195,12 @@ static bool verify_counter(pmu_counter_t *cnt)
 static void check_gp_counter(struct pmu_event *evt)
 {
 	pmu_counter_t cnt = {
-		.ctr = gp_counter_base,
 		.config = EVNTSEL_OS | EVNTSEL_USR | evt->unit_sel,
 	};
 	int i;
 
-	for (i = 0; i < pmu.nr_gp_counters; i++, cnt.ctr++) {
+	for (i = 0; i < pmu.nr_gp_counters; i++) {
+		cnt.ctr = MSR_GP_COUNTERx(i);
 		measure_one(&cnt);
 		report(verify_event(cnt.count, evt), "%s-%d", evt->name, i);
 	}
@@ -243,7 +241,7 @@ static void check_counters_many(void)
 		if (!pmu_gp_counter_is_available(i))
 			continue;
 
-		cnt[n].ctr = gp_counter_base + n;
+		cnt[n].ctr = MSR_GP_COUNTERx(n);
 		cnt[n].config = EVNTSEL_OS | EVNTSEL_USR |
 			gp_events[i % ARRAY_SIZE(gp_events)].unit_sel;
 		n++;
@@ -282,7 +280,7 @@ static void check_counter_overflow(void)
 	uint64_t overflow_preset;
 	int i;
 	pmu_counter_t cnt = {
-		.ctr = gp_counter_base,
+		.ctr = MSR_GP_COUNTERx(0),
 		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel /* instructions */,
 	};
 	overflow_preset = measure_for_overflow(&cnt);
@@ -292,18 +290,20 @@ static void check_counter_overflow(void)
 
 	report_prefix_push("overflow");
 
-	for (i = 0; i < pmu.nr_gp_counters + 1; i++, cnt.ctr++) {
+	for (i = 0; i < pmu.nr_gp_counters + 1; i++) {
 		uint64_t status;
 		int idx;
 
 		cnt.count = overflow_preset;
-		if (gp_counter_base == MSR_IA32_PMC0)
+		if (pmu_use_full_writes())
 			cnt.count &= (1ull << pmu.gp_counter_width) - 1;
 
 		if (i == pmu.nr_gp_counters) {
 			cnt.ctr = fixed_events[0].unit_sel;
 			cnt.count = measure_for_overflow(&cnt);
-			cnt.count &= (1ull << pmu.fixed_counter_width) - 1;
+			cnt.count &= (1ull << pmu.gp_counter_width) - 1;
+		} else {
+			cnt.ctr = MSR_GP_COUNTERx(i);
 		}
 
 		if (i % 2)
@@ -327,7 +327,7 @@ static void check_counter_overflow(void)
 static void check_gp_counter_cmask(void)
 {
 	pmu_counter_t cnt = {
-		.ctr = gp_counter_base,
+		.ctr = MSR_GP_COUNTERx(0),
 		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel /* instructions */,
 	};
 	cnt.config |= (0x2 << EVNTSEL_CMASK_SHIFT);
@@ -358,7 +358,7 @@ static void check_rdpmc(void)
 	for (i = 0; i < pmu.nr_gp_counters; i++) {
 		uint64_t x;
 		pmu_counter_t cnt = {
-			.ctr = gp_counter_base + i,
+			.ctr = MSR_GP_COUNTERx(i),
 			.idx = i
 		};
 
@@ -366,7 +366,7 @@ static void check_rdpmc(void)
 	         * Without full-width writes, only the low 32 bits are writable,
 	         * and the value is sign-extended.
 	         */
-		if (gp_counter_base == MSR_IA32_PERFCTR0)
+		if (pmu.msr_gp_counter_base == MSR_IA32_PERFCTR0)
 			x = (uint64_t)(int64_t)(int32_t)val;
 		else
 			x = (uint64_t)(int64_t)val;
@@ -374,7 +374,7 @@ static void check_rdpmc(void)
 		/* Mask according to the number of supported bits */
 		x &= (1ull << pmu.gp_counter_width) - 1;
 
-		wrmsr(gp_counter_base + i, val);
+		wrmsr(MSR_GP_COUNTERx(i), val);
 		report(rdpmc(i) == x, "cntr-%d", i);
 
 		exc = test_for_exception(GP_VECTOR, do_rdpmc_fast, &cnt);
@@ -408,7 +408,7 @@ static void check_running_counter_wrmsr(void)
 	uint64_t status;
 	uint64_t count;
 	pmu_counter_t evt = {
-		.ctr = gp_counter_base,
+		.ctr = MSR_GP_COUNTERx(0),
 		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel,
 	};
 
@@ -416,7 +416,7 @@ static void check_running_counter_wrmsr(void)
 
 	start_event(&evt);
 	loop();
-	wrmsr(gp_counter_base, 0);
+	wrmsr(MSR_GP_COUNTERx(0), 0);
 	stop_event(&evt);
 	report(evt.count < gp_events[1].min, "cntr");
 
@@ -427,10 +427,10 @@ static void check_running_counter_wrmsr(void)
 	start_event(&evt);
 
 	count = -1;
-	if (gp_counter_base == MSR_IA32_PMC0)
+	if (pmu_use_full_writes())
 		count &= (1ull << pmu.gp_counter_width) - 1;
 
-	wrmsr(gp_counter_base, count);
+	wrmsr(MSR_GP_COUNTERx(0), count);
 
 	loop();
 	stop_event(&evt);
@@ -444,12 +444,12 @@ static void check_emulated_instr(void)
 {
 	uint64_t status, instr_start, brnch_start;
 	pmu_counter_t brnch_cnt = {
-		.ctr = MSR_IA32_PERFCTR0,
+		.ctr = MSR_GP_COUNTERx(0),
 		/* branch instructions */
 		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[5].unit_sel,
 	};
 	pmu_counter_t instr_cnt = {
-		.ctr = MSR_IA32_PERFCTR0 + 1,
+		.ctr = MSR_GP_COUNTERx(1),
 		/* instructions */
 		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel,
 	};
@@ -463,8 +463,8 @@ static void check_emulated_instr(void)
 
 	brnch_start = -EXPECTED_BRNCH;
 	instr_start = -EXPECTED_INSTR;
-	wrmsr(MSR_IA32_PERFCTR0, brnch_start);
-	wrmsr(MSR_IA32_PERFCTR0 + 1, instr_start);
+	wrmsr(MSR_GP_COUNTERx(0), brnch_start);
+	wrmsr(MSR_GP_COUNTERx(1), instr_start);
 	// KVM_FEP is a magic prefix that forces emulation so
 	// 'KVM_FEP "jne label\n"' just counts as a single instruction.
 	asm volatile(
@@ -660,7 +660,8 @@ int main(int ac, char **av)
 	check_counters();
 
 	if (pmu_has_full_writes()) {
-		gp_counter_base = MSR_IA32_PMC0;
+		pmu.msr_gp_counter_base = MSR_IA32_PMC0;
+
 		report_prefix_push("full-width writes");
 		check_counters();
 		check_gp_counters_write_width();
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 19/27] x86/pmu: Add helper to get fixed counter MSR index
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (17 preceding siblings ...)
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 18/27] x86/pmu: Track GP counter and event select base MSRs in pmu_caps Sean Christopherson
@ 2022-11-02 22:51 ` Sean Christopherson
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 20/27] x86/pmu: Reset GP and Fixed counters during pmu_init() Sean Christopherson
                   ` (9 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:51 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

Add a helper to get the MSR index of a fixed counter instead of manually
calculating the index; a future patch will add more users of the fixed
counter MSRs.

No functional change intended.

Signed-off-by: Like Xu <likexu@tencent.com>
[sean: move to separate patch, write changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 lib/x86/pmu.h | 5 +++++
 x86/pmu.c     | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/x86/pmu.h b/lib/x86/pmu.h
index c98c583c..091e61b3 100644
--- a/lib/x86/pmu.h
+++ b/lib/x86/pmu.h
@@ -91,4 +91,9 @@ static inline bool pmu_use_full_writes(void)
 	return pmu.msr_gp_counter_base == MSR_IA32_PMC0;
 }
 
+static inline u32 MSR_PERF_FIXED_CTRx(unsigned int i)
+{
+	return MSR_CORE_PERF_FIXED_CTR0 + i;
+}
+
 #endif /* _X86_PMU_H_ */
diff --git a/x86/pmu.c b/x86/pmu.c
index d66786be..eb83c407 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -390,7 +390,7 @@ static void check_rdpmc(void)
 			.idx = i
 		};
 
-		wrmsr(MSR_CORE_PERF_FIXED_CTR0 + i, x);
+		wrmsr(MSR_PERF_FIXED_CTRx(i), x);
 		report(rdpmc(i | (1 << 30)) == x, "fixed cntr-%d", i);
 
 		exc = test_for_exception(GP_VECTOR, do_rdpmc_fast, &cnt);
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 20/27] x86/pmu: Reset GP and Fixed counters during pmu_init().
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (18 preceding siblings ...)
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 19/27] x86/pmu: Add helper to get fixed counter MSR index Sean Christopherson
@ 2022-11-02 22:51 ` Sean Christopherson
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 21/27] x86/pmu: Track global status/control/clear MSRs in pmu_caps Sean Christopherson
                   ` (8 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:51 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

In generic PMU testing, it is very common to initialize the test
environment by resetting counter registers. Add helpers to reset all PMU counters for
code reusability, and reset all counters during PMU initialization for
good measure.

Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 lib/x86/pmu.c |  2 ++
 lib/x86/pmu.h | 28 ++++++++++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/lib/x86/pmu.c b/lib/x86/pmu.c
index c73f802a..fb9a121e 100644
--- a/lib/x86/pmu.c
+++ b/lib/x86/pmu.c
@@ -24,4 +24,6 @@ void pmu_init(void)
 		pmu.perf_cap = rdmsr(MSR_IA32_PERF_CAPABILITIES);
 	pmu.msr_gp_counter_base = MSR_IA32_PERFCTR0;
 	pmu.msr_gp_event_select_base = MSR_P6_EVNTSEL0;
+
+	pmu_reset_all_counters();
 }
diff --git a/lib/x86/pmu.h b/lib/x86/pmu.h
index 091e61b3..cd81f557 100644
--- a/lib/x86/pmu.h
+++ b/lib/x86/pmu.h
@@ -96,4 +96,32 @@ static inline u32 MSR_PERF_FIXED_CTRx(unsigned int i)
 	return MSR_CORE_PERF_FIXED_CTR0 + i;
 }
 
+static inline void pmu_reset_all_gp_counters(void)
+{
+	unsigned int idx;
+
+	for (idx = 0; idx < pmu.nr_gp_counters; idx++) {
+		wrmsr(MSR_GP_EVENT_SELECTx(idx), 0);
+		wrmsr(MSR_GP_COUNTERx(idx), 0);
+	}
+}
+
+static inline void pmu_reset_all_fixed_counters(void)
+{
+	unsigned int idx;
+
+	if (!pmu.nr_fixed_counters)
+		return;
+
+	wrmsr(MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
+	for (idx = 0; idx < pmu.nr_fixed_counters; idx++)
+		wrmsr(MSR_PERF_FIXED_CTRx(idx), 0);
+}
+
+static inline void pmu_reset_all_counters(void)
+{
+	pmu_reset_all_gp_counters();
+	pmu_reset_all_fixed_counters();
+}
+
 #endif /* _X86_PMU_H_ */
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 21/27] x86/pmu: Track global status/control/clear MSRs in pmu_caps
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (19 preceding siblings ...)
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 20/27] x86/pmu: Reset GP and Fixed counters during pmu_init() Sean Christopherson
@ 2022-11-02 22:51 ` Sean Christopherson
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 22/27] x86: Add tests for Guest Processor Event Based Sampling (PEBS) Sean Christopherson
                   ` (7 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:51 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

Track the global PMU MSRs in pmu_caps so that tests don't need to manually
differentiate between AMD and Intel.  Although AMD and Intel PMUs have
the same semantics in terms of global control features (including ctl
and status), their MSR indexes are not the same.
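
A later patch in this series points these fields at the AMD global MSRs
when PerfMonV2 is supported, so helpers such as pmu_clear_global_status()
work unchanged on both vendors.  Roughly (the feature flag and MSR names
below are illustrative; the real definitions land with the PerfMonV2 patch):

	if (this_cpu_has(X86_FEATURE_PERFMON_V2)) {
		pmu.msr_global_status = MSR_PERF_CNTR_GLOBAL_STATUS;
		pmu.msr_global_ctl = MSR_PERF_CNTR_GLOBAL_CTL;
		pmu.msr_global_status_clr = MSR_PERF_CNTR_GLOBAL_STATUS_CLR;
	}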

Signed-off-by: Like Xu <likexu@tencent.com>
[sean: drop most getters/setters]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 lib/x86/pmu.c |  3 +++
 lib/x86/pmu.h |  9 +++++++++
 x86/pmu.c     | 31 +++++++++++++------------------
 3 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/lib/x86/pmu.c b/lib/x86/pmu.c
index fb9a121e..0a69a3c6 100644
--- a/lib/x86/pmu.c
+++ b/lib/x86/pmu.c
@@ -24,6 +24,9 @@ void pmu_init(void)
 		pmu.perf_cap = rdmsr(MSR_IA32_PERF_CAPABILITIES);
 	pmu.msr_gp_counter_base = MSR_IA32_PERFCTR0;
 	pmu.msr_gp_event_select_base = MSR_P6_EVNTSEL0;
+	pmu.msr_global_status = MSR_CORE_PERF_GLOBAL_STATUS;
+	pmu.msr_global_ctl = MSR_CORE_PERF_GLOBAL_CTRL;
+	pmu.msr_global_status_clr = MSR_CORE_PERF_GLOBAL_OVF_CTRL;
 
 	pmu_reset_all_counters();
 }
diff --git a/lib/x86/pmu.h b/lib/x86/pmu.h
index cd81f557..cc643a7f 100644
--- a/lib/x86/pmu.h
+++ b/lib/x86/pmu.h
@@ -44,6 +44,10 @@ struct pmu_caps {
 	u32 msr_gp_counter_base;
 	u32 msr_gp_event_select_base;
 
+	u32 msr_global_status;
+	u32 msr_global_ctl;
+	u32 msr_global_status_clr;
+
 	u64 perf_cap;
 };
 
@@ -124,4 +128,9 @@ static inline void pmu_reset_all_counters(void)
 	pmu_reset_all_fixed_counters();
 }
 
+static inline void pmu_clear_global_status(void)
+{
+	wrmsr(pmu.msr_global_status_clr, rdmsr(pmu.msr_global_status));
+}
+
 #endif /* _X86_PMU_H_ */
diff --git a/x86/pmu.c b/x86/pmu.c
index eb83c407..3cca5b9c 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -103,15 +103,12 @@ static struct pmu_event* get_counter_event(pmu_counter_t *cnt)
 static void global_enable(pmu_counter_t *cnt)
 {
 	cnt->idx = event_to_global_idx(cnt);
-
-	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, rdmsr(MSR_CORE_PERF_GLOBAL_CTRL) |
-			(1ull << cnt->idx));
+	wrmsr(pmu.msr_global_ctl, rdmsr(pmu.msr_global_ctl) | BIT_ULL(cnt->idx));
 }
 
 static void global_disable(pmu_counter_t *cnt)
 {
-	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, rdmsr(MSR_CORE_PERF_GLOBAL_CTRL) &
-			~(1ull << cnt->idx));
+	wrmsr(pmu.msr_global_ctl, rdmsr(pmu.msr_global_ctl) & ~BIT_ULL(cnt->idx));
 }
 
 static void __start_event(pmu_counter_t *evt, uint64_t count)
@@ -286,7 +283,7 @@ static void check_counter_overflow(void)
 	overflow_preset = measure_for_overflow(&cnt);
 
 	/* clear status before test */
-	wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL, rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
+	pmu_clear_global_status();
 
 	report_prefix_push("overflow");
 
@@ -313,10 +310,10 @@ static void check_counter_overflow(void)
 		idx = event_to_global_idx(&cnt);
 		__measure(&cnt, cnt.count);
 		report(cnt.count == 1, "cntr-%d", i);
-		status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
+		status = rdmsr(pmu.msr_global_status);
 		report(status & (1ull << idx), "status-%d", i);
-		wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL, status);
-		status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
+		wrmsr(pmu.msr_global_status_clr, status);
+		status = rdmsr(pmu.msr_global_status);
 		report(!(status & (1ull << idx)), "status clear-%d", i);
 		report(check_irq() == (i % 2), "irq-%d", i);
 	}
@@ -421,8 +418,7 @@ static void check_running_counter_wrmsr(void)
 	report(evt.count < gp_events[1].min, "cntr");
 
 	/* clear status before overflow test */
-	wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL,
-	      rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
+	pmu_clear_global_status();
 
 	start_event(&evt);
 
@@ -434,8 +430,8 @@ static void check_running_counter_wrmsr(void)
 
 	loop();
 	stop_event(&evt);
-	status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
-	report(status & 1, "status");
+	status = rdmsr(pmu.msr_global_status);
+	report(status & 1, "status msr bit");
 
 	report_prefix_pop();
 }
@@ -455,8 +451,7 @@ static void check_emulated_instr(void)
 	};
 	report_prefix_push("emulated instruction");
 
-	wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL,
-	      rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
+	pmu_clear_global_status();
 
 	start_event(&brnch_cnt);
 	start_event(&instr_cnt);
@@ -490,7 +485,7 @@ static void check_emulated_instr(void)
 		:
 		: "eax", "ebx", "ecx", "edx");
 
-	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
+	wrmsr(pmu.msr_global_ctl, 0);
 
 	stop_event(&brnch_cnt);
 	stop_event(&instr_cnt);
@@ -502,7 +497,7 @@ static void check_emulated_instr(void)
 	report(brnch_cnt.count - brnch_start >= EXPECTED_BRNCH,
 	       "branch count");
 	// Additionally check that those counters overflowed properly.
-	status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
+	status = rdmsr(pmu.msr_global_status);
 	report(status & 1, "branch counter overflow");
 	report(status & 2, "instruction counter overflow");
 
@@ -590,7 +585,7 @@ static void set_ref_cycle_expectations(void)
 	if (!pmu.nr_gp_counters || !pmu_gp_counter_is_available(2))
 		return;
 
-	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
+	wrmsr(pmu.msr_global_ctl, 0);
 
 	t0 = fenced_rdtsc();
 	start_event(&cnt);
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 22/27] x86: Add tests for Guest Processor Event Based Sampling (PEBS)
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (20 preceding siblings ...)
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 21/27] x86/pmu: Track global status/control/clear MSRs in pmu_caps Sean Christopherson
@ 2022-11-02 22:51 ` Sean Christopherson
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 23/27] x86/pmu: Add global helpers to cover Intel Arch PMU Version 1 Sean Christopherson
                   ` (6 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:51 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

This unit-test is intended to test KVM's support for Processor Event
Based Sampling (PEBS), another PMU feature on Intel processors
(supported for guests starting with Ice Lake Server).

If a bit in PEBS_ENABLE is set to 1, its corresponding counter will
write at least one PEBS record (including a partial snapshot of the vCPU
state at the time of the hardware event) to guest memory on counter
overflow, and trigger an interrupt once the DS area's interrupt
threshold is reached.  The layout of an adaptive PEBS record can be
configured via another register (MSR_PEBS_DATA_CFG).

These tests cover the most common usage scenarios, plus a few specially
constructed ones that are not typical of the Linux PEBS driver. They
lower the barrier for others to understand this feature and open up
further exploration of the KVM implementation and of the hardware
feature itself.
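
At its core, the programming sequence exercised by the test boils down to
the sketch below (condensed from pebs_enable() in the diff; buffer
allocation, adaptive-PEBS extras and the fixed-counter path are omitted,
and "ds", "pebs_buf", "record_size", "start_val" and "event_sel" are
placeholders):

	/* Point the DS management area at a page-sized PEBS buffer. */
	ds->pebs_buffer_base = (u64)pebs_buf;
	ds->pebs_index = ds->pebs_buffer_base;
	ds->pebs_absolute_maximum = (u64)pebs_buf + PAGE_SIZE;
	ds->pebs_interrupt_threshold = ds->pebs_buffer_base + record_size;
	wrmsr(MSR_IA32_DS_AREA, (u64)ds);

	/* Arm GP counter 0 close to overflow and enable PEBS on it. */
	wrmsr(MSR_GP_COUNTERx(0), start_val);
	wrmsr(MSR_GP_EVENT_SELECTx(0), EVNTSEL_EN | EVNTSEL_OS | EVNTSEL_USR | event_sel);
	wrmsr(MSR_IA32_PEBS_ENABLE, BIT_ULL(0));
	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, BIT_ULL(0));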

Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 lib/x86/msr.h       |   1 +
 lib/x86/pmu.h       |  34 ++++
 x86/Makefile.x86_64 |   1 +
 x86/pmu_pebs.c      | 433 ++++++++++++++++++++++++++++++++++++++++++++
 x86/unittests.cfg   |   8 +
 5 files changed, 477 insertions(+)
 create mode 100644 x86/pmu_pebs.c

diff --git a/lib/x86/msr.h b/lib/x86/msr.h
index bbe29fd9..68d88371 100644
--- a/lib/x86/msr.h
+++ b/lib/x86/msr.h
@@ -52,6 +52,7 @@
 #define MSR_IA32_MCG_CTL		0x0000017b
 
 #define MSR_IA32_PEBS_ENABLE		0x000003f1
+#define MSR_PEBS_DATA_CFG		0x000003f2
 #define MSR_IA32_DS_AREA		0x00000600
 #define MSR_IA32_PERF_CAPABILITIES	0x00000345
 
diff --git a/lib/x86/pmu.h b/lib/x86/pmu.h
index cc643a7f..885b53f1 100644
--- a/lib/x86/pmu.h
+++ b/lib/x86/pmu.h
@@ -14,6 +14,8 @@
 
 #define PMU_CAP_LBR_FMT	  0x3f
 #define PMU_CAP_FW_WRITES	(1ULL << 13)
+#define PMU_CAP_PEBS_BASELINE	(1ULL << 14)
+#define PERF_CAP_PEBS_FORMAT	0xf00
 
 #define EVNSEL_EVENT_SHIFT	0
 #define EVNTSEL_UMASK_SHIFT	8
@@ -33,6 +35,18 @@
 #define EVNTSEL_INT	(1 << EVNTSEL_INT_SHIFT)
 #define EVNTSEL_INV	(1 << EVNTSEL_INV_SHIF)
 
+#define GLOBAL_STATUS_BUFFER_OVF_BIT		62
+#define GLOBAL_STATUS_BUFFER_OVF	BIT_ULL(GLOBAL_STATUS_BUFFER_OVF_BIT)
+
+#define PEBS_DATACFG_MEMINFO	BIT_ULL(0)
+#define PEBS_DATACFG_GP	BIT_ULL(1)
+#define PEBS_DATACFG_XMMS	BIT_ULL(2)
+#define PEBS_DATACFG_LBRS	BIT_ULL(3)
+
+#define ICL_EVENTSEL_ADAPTIVE				(1ULL << 34)
+#define PEBS_DATACFG_LBR_SHIFT	24
+#define MAX_NUM_LBR_ENTRY	32
+
 struct pmu_caps {
 	u8 version;
 	u8 nr_fixed_counters;
@@ -90,6 +104,11 @@ static inline bool pmu_has_full_writes(void)
 	return pmu.perf_cap & PMU_CAP_FW_WRITES;
 }
 
+static inline void pmu_activate_full_writes(void)
+{
+	pmu.msr_gp_counter_base = MSR_IA32_PMC0;
+}
+
 static inline bool pmu_use_full_writes(void)
 {
 	return pmu.msr_gp_counter_base == MSR_IA32_PMC0;
@@ -133,4 +152,19 @@ static inline void pmu_clear_global_status(void)
 	wrmsr(pmu.msr_global_status_clr, rdmsr(pmu.msr_global_status));
 }
 
+static inline bool pmu_has_pebs(void)
+{
+	return pmu.version > 1;
+}
+
+static inline u8 pmu_pebs_format(void)
+{
+	return (pmu.perf_cap & PERF_CAP_PEBS_FORMAT) >> 8;
+}
+
+static inline bool pmu_has_pebs_baseline(void)
+{
+	return pmu.perf_cap & PMU_CAP_PEBS_BASELINE;
+}
+
 #endif /* _X86_PMU_H_ */
diff --git a/x86/Makefile.x86_64 b/x86/Makefile.x86_64
index 8f9463cd..bd827fe9 100644
--- a/x86/Makefile.x86_64
+++ b/x86/Makefile.x86_64
@@ -33,6 +33,7 @@ tests += $(TEST_DIR)/vmware_backdoors.$(exe)
 tests += $(TEST_DIR)/rdpru.$(exe)
 tests += $(TEST_DIR)/pks.$(exe)
 tests += $(TEST_DIR)/pmu_lbr.$(exe)
+tests += $(TEST_DIR)/pmu_pebs.$(exe)
 
 ifeq ($(CONFIG_EFI),y)
 tests += $(TEST_DIR)/amd_sev.$(exe)
diff --git a/x86/pmu_pebs.c b/x86/pmu_pebs.c
new file mode 100644
index 00000000..3b6bcb2c
--- /dev/null
+++ b/x86/pmu_pebs.c
@@ -0,0 +1,433 @@
+#include "x86/msr.h"
+#include "x86/processor.h"
+#include "x86/pmu.h"
+#include "x86/isr.h"
+#include "x86/apic.h"
+#include "x86/apic-defs.h"
+#include "x86/desc.h"
+#include "alloc.h"
+
+#include "vm.h"
+#include "types.h"
+#include "processor.h"
+#include "vmalloc.h"
+#include "alloc_page.h"
+
+/* bits [63:48] provides the size of the current record in bytes */
+#define	RECORD_SIZE_OFFSET	48
+
+static unsigned int max_nr_gp_events;
+static unsigned long *ds_bufer;
+static unsigned long *pebs_buffer;
+static u64 ctr_start_val;
+static bool has_baseline;
+
+struct debug_store {
+	u64	bts_buffer_base;
+	u64	bts_index;
+	u64	bts_absolute_maximum;
+	u64	bts_interrupt_threshold;
+	u64	pebs_buffer_base;
+	u64	pebs_index;
+	u64	pebs_absolute_maximum;
+	u64	pebs_interrupt_threshold;
+	u64	pebs_event_reset[64];
+};
+
+struct pebs_basic {
+	u64 format_size;
+	u64 ip;
+	u64 applicable_counters;
+	u64 tsc;
+};
+
+struct pebs_meminfo {
+	u64 address;
+	u64 aux;
+	u64 latency;
+	u64 tsx_tuning;
+};
+
+struct pebs_gprs {
+	u64 flags, ip, ax, cx, dx, bx, sp, bp, si, di;
+	u64 r8, r9, r10, r11, r12, r13, r14, r15;
+};
+
+struct pebs_xmm {
+	u64 xmm[16*2];	/* two entries for each register */
+};
+
+struct lbr_entry {
+	u64 from;
+	u64 to;
+	u64 info;
+};
+
+enum pmc_type {
+	GP = 0,
+	FIXED,
+};
+
+static uint32_t intel_arch_events[] = {
+	0x00c4, /* PERF_COUNT_HW_BRANCH_INSTRUCTIONS */
+	0x00c5, /* PERF_COUNT_HW_BRANCH_MISSES */
+	0x0300, /* PERF_COUNT_HW_REF_CPU_CYCLES */
+	0x003c, /* PERF_COUNT_HW_CPU_CYCLES */
+	0x00c0, /* PERF_COUNT_HW_INSTRUCTIONS */
+	0x013c, /* PERF_COUNT_HW_BUS_CYCLES */
+	0x4f2e, /* PERF_COUNT_HW_CACHE_REFERENCES */
+	0x412e, /* PERF_COUNT_HW_CACHE_MISSES */
+};
+
+static u64 pebs_data_cfgs[] = {
+	PEBS_DATACFG_MEMINFO,
+	PEBS_DATACFG_GP,
+	PEBS_DATACFG_XMMS,
+	PEBS_DATACFG_LBRS | ((MAX_NUM_LBR_ENTRY -1) << PEBS_DATACFG_LBR_SHIFT),
+};
+
+/* Iterating over every counter value would waste time, so pick a few typical values. */
+static u64 counter_start_values[] = {
+	/* if PEBS counter doesn't overflow at all */
+	0,
+	0xfffffffffff0,
+	/* normal counter overflow to have PEBS records */
+	0xfffffffffffe,
+	/* test whether emulated instructions should trigger PEBS */
+	0xffffffffffff,
+};
+
+static unsigned int get_adaptive_pebs_record_size(u64 pebs_data_cfg)
+{
+	unsigned int sz = sizeof(struct pebs_basic);
+
+	if (!has_baseline)
+		return sz;
+
+	if (pebs_data_cfg & PEBS_DATACFG_MEMINFO)
+		sz += sizeof(struct pebs_meminfo);
+	if (pebs_data_cfg & PEBS_DATACFG_GP)
+		sz += sizeof(struct pebs_gprs);
+	if (pebs_data_cfg & PEBS_DATACFG_XMMS)
+		sz += sizeof(struct pebs_xmm);
+	if (pebs_data_cfg & PEBS_DATACFG_LBRS)
+		sz += MAX_NUM_LBR_ENTRY * sizeof(struct lbr_entry);
+
+	return sz;
+}
+
+static void cnt_overflow(isr_regs_t *regs)
+{
+	apic_write(APIC_EOI, 0);
+}
+
+static inline void workload(void)
+{
+	asm volatile(
+		"mov $0x0, %%eax\n"
+		"cmp $0x0, %%eax\n"
+		"jne label2\n"
+		"jne label2\n"
+		"jne label2\n"
+		"jne label2\n"
+		"mov $0x0, %%eax\n"
+		"cmp $0x0, %%eax\n"
+		"jne label2\n"
+		"jne label2\n"
+		"jne label2\n"
+		"jne label2\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"label2:\n"
+		:
+		:
+		: "eax", "ebx", "ecx", "edx");
+}
+
+static inline void workload2(void)
+{
+	asm volatile(
+		"mov $0x0, %%eax\n"
+		"cmp $0x0, %%eax\n"
+		"jne label3\n"
+		"jne label3\n"
+		"jne label3\n"
+		"jne label3\n"
+		"mov $0x0, %%eax\n"
+		"cmp $0x0, %%eax\n"
+		"jne label3\n"
+		"jne label3\n"
+		"jne label3\n"
+		"jne label3\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"label3:\n"
+		:
+		:
+		: "eax", "ebx", "ecx", "edx");
+}
+
+static void alloc_buffers(void)
+{
+	ds_bufer = alloc_page();
+	force_4k_page(ds_bufer);
+	memset(ds_bufer, 0x0, PAGE_SIZE);
+
+	pebs_buffer = alloc_page();
+	force_4k_page(pebs_buffer);
+	memset(pebs_buffer, 0x0, PAGE_SIZE);
+}
+
+static void free_buffers(void)
+{
+	if (ds_bufer)
+		free_page(ds_bufer);
+
+	if (pebs_buffer)
+		free_page(pebs_buffer);
+}
+
+static void pebs_enable(u64 bitmask, u64 pebs_data_cfg)
+{
+	static struct debug_store *ds;
+	u64 baseline_extra_ctrl = 0, fixed_ctr_ctrl = 0;
+	unsigned int idx;
+
+	if (has_baseline)
+		wrmsr(MSR_PEBS_DATA_CFG, pebs_data_cfg);
+
+	ds = (struct debug_store *)ds_bufer;
+	ds->pebs_index = ds->pebs_buffer_base = (unsigned long)pebs_buffer;
+	ds->pebs_absolute_maximum = (unsigned long)pebs_buffer + PAGE_SIZE;
+	ds->pebs_interrupt_threshold = ds->pebs_buffer_base +
+		get_adaptive_pebs_record_size(pebs_data_cfg);
+
+	for (idx = 0; idx < pmu.nr_fixed_counters; idx++) {
+		if (!(BIT_ULL(FIXED_CNT_INDEX + idx) & bitmask))
+			continue;
+		if (has_baseline)
+			baseline_extra_ctrl = BIT(FIXED_CNT_INDEX + idx * 4);
+		wrmsr(MSR_PERF_FIXED_CTRx(idx), ctr_start_val);
+		fixed_ctr_ctrl |= (0xbULL << (idx * 4) | baseline_extra_ctrl);
+	}
+	if (fixed_ctr_ctrl)
+		wrmsr(MSR_CORE_PERF_FIXED_CTR_CTRL, fixed_ctr_ctrl);
+
+	for (idx = 0; idx < max_nr_gp_events; idx++) {
+		if (!(BIT_ULL(idx) & bitmask))
+			continue;
+		if (has_baseline)
+			baseline_extra_ctrl = ICL_EVENTSEL_ADAPTIVE;
+		wrmsr(MSR_GP_EVENT_SELECTx(idx), EVNTSEL_EN | EVNTSEL_OS | EVNTSEL_USR |
+						 intel_arch_events[idx] | baseline_extra_ctrl);
+		wrmsr(MSR_GP_COUNTERx(idx), ctr_start_val);
+	}
+
+	wrmsr(MSR_IA32_DS_AREA,  (unsigned long)ds_bufer);
+	wrmsr(MSR_IA32_PEBS_ENABLE, bitmask);
+	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, bitmask);
+}
+
+static void reset_pebs(void)
+{
+	memset(ds_bufer, 0x0, PAGE_SIZE);
+	memset(pebs_buffer, 0x0, PAGE_SIZE);
+	wrmsr(MSR_IA32_PEBS_ENABLE, 0);
+	wrmsr(MSR_IA32_DS_AREA,  0);
+	if (has_baseline)
+		wrmsr(MSR_PEBS_DATA_CFG, 0);
+
+	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
+	wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL, rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
+
+	pmu_reset_all_counters();
+}
+
+static void pebs_disable(unsigned int idx)
+{
+	/*
+	 * If we only clear the PEBS_ENABLE bit, the counter continues to count.
+	 * If it overflows in that tiny window, no PEBS record is generated, but a
+	 * normal counter interrupt is. Exercise both variants.
+	 */
+	if (idx % 2)
+		wrmsr(MSR_IA32_PEBS_ENABLE, 0);
+
+	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
+}
+
+static void check_pebs_records(u64 bitmask, u64 pebs_data_cfg)
+{
+	struct pebs_basic *pebs_rec = (struct pebs_basic *)pebs_buffer;
+	struct debug_store *ds = (struct debug_store *)ds_bufer;
+	unsigned int pebs_record_size = get_adaptive_pebs_record_size(pebs_data_cfg);
+	unsigned int count = 0;
+	bool expected, pebs_idx_match, pebs_size_match, data_cfg_match;
+	void *cur_record;
+
+	expected = (ds->pebs_index == ds->pebs_buffer_base) && !pebs_rec->format_size;
+	if (!(rdmsr(MSR_CORE_PERF_GLOBAL_STATUS) & GLOBAL_STATUS_BUFFER_OVF)) {
+		report(expected, "No OVF irq, none PEBS records.");
+		return;
+	}
+
+	if (expected) {
+		report(!expected, "A OVF irq, but none PEBS records.");
+		return;
+	}
+
+	expected = ds->pebs_index >= ds->pebs_interrupt_threshold;
+	cur_record = (void *)pebs_buffer;
+	do {
+		pebs_rec = (struct pebs_basic *)cur_record;
+		pebs_record_size = pebs_rec->format_size >> RECORD_SIZE_OFFSET;
+		pebs_idx_match =
+			pebs_rec->applicable_counters & bitmask;
+		pebs_size_match =
+			pebs_record_size == get_adaptive_pebs_record_size(pebs_data_cfg);
+		data_cfg_match =
+			(pebs_rec->format_size & GENMASK_ULL(47, 0)) == pebs_data_cfg;
+		expected = pebs_idx_match && pebs_size_match && data_cfg_match;
+		report(expected,
+		       "PEBS record (written seq %d) is verified (inclduing size, counters and cfg).", count);
+		cur_record = cur_record + pebs_record_size;
+		count++;
+	} while (expected && (void *)cur_record < (void *)ds->pebs_index);
+
+	if (!expected) {
+		if (!pebs_idx_match)
+			printf("FAIL: The applicable_counters (0x%lx) doesn't match with pmc_bitmask (0x%lx).\n",
+			       pebs_rec->applicable_counters, bitmask);
+		if (!pebs_size_match)
+			printf("FAIL: The pebs_record_size (%d) doesn't match with MSR_PEBS_DATA_CFG (%d).\n",
+			       pebs_record_size, get_adaptive_pebs_record_size(pebs_data_cfg));
+		if (!data_cfg_match)
+			printf("FAIL: The pebs_data_cfg (0x%lx) doesn't match with MSR_PEBS_DATA_CFG (0x%lx).\n",
+			       pebs_rec->format_size & 0xffffffffffff, pebs_data_cfg);
+	}
+}
+
+static void check_one_counter(enum pmc_type type,
+			      unsigned int idx, u64 pebs_data_cfg)
+{
+	int pebs_bit = BIT_ULL(type == FIXED ? FIXED_CNT_INDEX + idx : idx);
+
+	report_prefix_pushf("%s counter %d (0x%lx)",
+			    type == FIXED ? "Extended Fixed" : "GP", idx, ctr_start_val);
+	reset_pebs();
+	pebs_enable(pebs_bit, pebs_data_cfg);
+	workload();
+	pebs_disable(idx);
+	check_pebs_records(pebs_bit, pebs_data_cfg);
+	report_prefix_pop();
+}
+
+/* more than one PEBS records will be generated. */
+static void check_multiple_counters(u64 bitmask, u64 pebs_data_cfg)
+{
+	reset_pebs();
+	pebs_enable(bitmask, pebs_data_cfg);
+	workload2();
+	pebs_disable(0);
+	check_pebs_records(bitmask, pebs_data_cfg);
+}
+
+static void check_pebs_counters(u64 pebs_data_cfg)
+{
+	unsigned int idx;
+	u64 bitmask = 0;
+
+	for (idx = 0; idx < pmu.nr_fixed_counters; idx++)
+		check_one_counter(FIXED, idx, pebs_data_cfg);
+
+	for (idx = 0; idx < max_nr_gp_events; idx++)
+		check_one_counter(GP, idx, pebs_data_cfg);
+
+	for (idx = 0; idx < pmu.nr_fixed_counters; idx++)
+		bitmask |= BIT_ULL(FIXED_CNT_INDEX + idx);
+	for (idx = 0; idx < max_nr_gp_events; idx += 2)
+		bitmask |= BIT_ULL(idx);
+	report_prefix_pushf("Multiple (0x%lx)", bitmask);
+	check_multiple_counters(bitmask, pebs_data_cfg);
+	report_prefix_pop();
+}
+
+/*
+ * Known reasons for missing PEBS records:
+ *	1. The selected event does not support PEBS;
+ *	2. From a core PMU perspective, the vCPU and pCPU models are not the same;
+ *	3. The guest counter has not yet overflowed, or has been cross-mapped by the host;
+ */
+int main(int ac, char **av)
+{
+	unsigned int i, j;
+
+	setup_vm();
+
+	max_nr_gp_events = MIN(pmu.nr_gp_counters, ARRAY_SIZE(intel_arch_events));
+
+	printf("PMU version: %d\n", pmu.version);
+
+	has_baseline = pmu_has_pebs_baseline();
+	if (pmu_has_full_writes())
+		pmu_activate_full_writes();
+
+	if (!is_intel()) {
+		report_skip("PEBS requires Intel ICX or later, non-Intel detected");
+		return report_summary();
+	} else if (!pmu_has_pebs()) {
+		report_skip("PEBS required PMU version 2, reported version is %d", pmu.version);
+		return report_summary();
+	} else if (!pmu_pebs_format()) {
+		report_skip("PEBS not enumerated in PERF_CAPABILITIES");
+		return report_summary();
+	} else if (rdmsr(MSR_IA32_MISC_ENABLE) & MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL) {
+		report_skip("PEBS unavailable according to MISC_ENABLE");
+		return report_summary();
+	}
+
+	printf("PEBS format: %d\n", pmu_pebs_format());
+	printf("PEBS GP counters: %d\n", pmu.nr_gp_counters);
+	printf("PEBS Fixed counters: %d\n", pmu.nr_fixed_counters);
+	printf("PEBS baseline (Adaptive PEBS): %d\n", has_baseline);
+
+	handle_irq(PMI_VECTOR, cnt_overflow);
+	alloc_buffers();
+
+	for (i = 0; i < ARRAY_SIZE(counter_start_values); i++) {
+		ctr_start_val = counter_start_values[i];
+		check_pebs_counters(0);
+		if (!has_baseline)
+			continue;
+
+		for (j = 0; j < ARRAY_SIZE(pebs_data_cfgs); j++) {
+			report_prefix_pushf("Adaptive (0x%lx)", pebs_data_cfgs[j]);
+			check_pebs_counters(pebs_data_cfgs[j]);
+			report_prefix_pop();
+		}
+	}
+
+	free_buffers();
+
+	return report_summary();
+}
diff --git a/x86/unittests.cfg b/x86/unittests.cfg
index 07d05070..54f04375 100644
--- a/x86/unittests.cfg
+++ b/x86/unittests.cfg
@@ -200,6 +200,14 @@ check = /proc/sys/kernel/nmi_watchdog=0
 accel = kvm
 groups = pmu
 
+[pmu_pebs]
+arch = x86_64
+file = pmu_pebs.flat
+extra_params = -cpu host,migratable=no
+check = /proc/sys/kernel/nmi_watchdog=0
+accel = kvm
+groups = pmu
+
 [vmware_backdoors]
 file = vmware_backdoors.flat
 extra_params = -machine vmport=on -cpu max
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 23/27] x86/pmu: Add global helpers to cover Intel Arch PMU Version 1
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (21 preceding siblings ...)
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 22/27] x86: Add tests for Guest Processor Event Based Sampling (PEBS) Sean Christopherson
@ 2022-11-02 22:51 ` Sean Christopherson
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 24/27] x86/pmu: Add gp_events pointer to route different event tables Sean Christopherson
                   ` (5 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:51 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

To test Intel arch PMU version 1, most of the basic framework and the
use cases that exercise PMU counters do not require any changes; they
just must not access the registers introduced only in PMU version 2.

Adding a few guard checks seamlessly supports version 1, while opening
the door for tests of vanilla AMD PMUs.

Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 lib/x86/pmu.c |  9 ++++++---
 lib/x86/pmu.h |  5 +++++
 x86/pmu.c     | 47 +++++++++++++++++++++++++++++++----------------
 3 files changed, 42 insertions(+), 19 deletions(-)

diff --git a/lib/x86/pmu.c b/lib/x86/pmu.c
index 0a69a3c6..ea4859df 100644
--- a/lib/x86/pmu.c
+++ b/lib/x86/pmu.c
@@ -24,9 +24,12 @@ void pmu_init(void)
 		pmu.perf_cap = rdmsr(MSR_IA32_PERF_CAPABILITIES);
 	pmu.msr_gp_counter_base = MSR_IA32_PERFCTR0;
 	pmu.msr_gp_event_select_base = MSR_P6_EVNTSEL0;
-	pmu.msr_global_status = MSR_CORE_PERF_GLOBAL_STATUS;
-	pmu.msr_global_ctl = MSR_CORE_PERF_GLOBAL_CTRL;
-	pmu.msr_global_status_clr = MSR_CORE_PERF_GLOBAL_OVF_CTRL;
+
+	if (this_cpu_has_perf_global_status()) {
+		pmu.msr_global_status = MSR_CORE_PERF_GLOBAL_STATUS;
+		pmu.msr_global_ctl = MSR_CORE_PERF_GLOBAL_CTRL;
+		pmu.msr_global_status_clr = MSR_CORE_PERF_GLOBAL_OVF_CTRL;
+	}
 
 	pmu_reset_all_counters();
 }
diff --git a/lib/x86/pmu.h b/lib/x86/pmu.h
index 885b53f1..e2c0bdf4 100644
--- a/lib/x86/pmu.h
+++ b/lib/x86/pmu.h
@@ -89,6 +89,11 @@ static inline bool this_cpu_has_perf_global_ctrl(void)
 	return pmu.version > 1;
 }
 
+static inline bool this_cpu_has_perf_global_status(void)
+{
+	return pmu.version > 1;
+}
+
 static inline bool pmu_gp_counter_is_available(int i)
 {
 	return pmu.gp_counter_available & BIT(i);
diff --git a/x86/pmu.c b/x86/pmu.c
index 3cca5b9c..7f200658 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -102,12 +102,18 @@ static struct pmu_event* get_counter_event(pmu_counter_t *cnt)
 
 static void global_enable(pmu_counter_t *cnt)
 {
+	if (!this_cpu_has_perf_global_ctrl())
+		return;
+
 	cnt->idx = event_to_global_idx(cnt);
 	wrmsr(pmu.msr_global_ctl, rdmsr(pmu.msr_global_ctl) | BIT_ULL(cnt->idx));
 }
 
 static void global_disable(pmu_counter_t *cnt)
 {
+	if (!this_cpu_has_perf_global_ctrl())
+		return;
+
 	wrmsr(pmu.msr_global_ctl, rdmsr(pmu.msr_global_ctl) & ~BIT_ULL(cnt->idx));
 }
 
@@ -283,7 +289,8 @@ static void check_counter_overflow(void)
 	overflow_preset = measure_for_overflow(&cnt);
 
 	/* clear status before test */
-	pmu_clear_global_status();
+	if (this_cpu_has_perf_global_status())
+		pmu_clear_global_status();
 
 	report_prefix_push("overflow");
 
@@ -310,6 +317,10 @@ static void check_counter_overflow(void)
 		idx = event_to_global_idx(&cnt);
 		__measure(&cnt, cnt.count);
 		report(cnt.count == 1, "cntr-%d", i);
+
+		if (!this_cpu_has_perf_global_status())
+			continue;
+
 		status = rdmsr(pmu.msr_global_status);
 		report(status & (1ull << idx), "status-%d", i);
 		wrmsr(pmu.msr_global_status_clr, status);
@@ -418,7 +429,8 @@ static void check_running_counter_wrmsr(void)
 	report(evt.count < gp_events[1].min, "cntr");
 
 	/* clear status before overflow test */
-	pmu_clear_global_status();
+	if (this_cpu_has_perf_global_status())
+		pmu_clear_global_status();
 
 	start_event(&evt);
 
@@ -430,8 +442,11 @@ static void check_running_counter_wrmsr(void)
 
 	loop();
 	stop_event(&evt);
-	status = rdmsr(pmu.msr_global_status);
-	report(status & 1, "status msr bit");
+
+	if (this_cpu_has_perf_global_status()) {
+		status = rdmsr(pmu.msr_global_status);
+		report(status & 1, "status msr bit");
+	}
 
 	report_prefix_pop();
 }
@@ -451,7 +466,8 @@ static void check_emulated_instr(void)
 	};
 	report_prefix_push("emulated instruction");
 
-	pmu_clear_global_status();
+	if (this_cpu_has_perf_global_status())
+		pmu_clear_global_status();
 
 	start_event(&brnch_cnt);
 	start_event(&instr_cnt);
@@ -485,7 +501,8 @@ static void check_emulated_instr(void)
 		:
 		: "eax", "ebx", "ecx", "edx");
 
-	wrmsr(pmu.msr_global_ctl, 0);
+	if (this_cpu_has_perf_global_ctrl())
+		wrmsr(pmu.msr_global_ctl, 0);
 
 	stop_event(&brnch_cnt);
 	stop_event(&instr_cnt);
@@ -496,10 +513,12 @@ static void check_emulated_instr(void)
 	       "instruction count");
 	report(brnch_cnt.count - brnch_start >= EXPECTED_BRNCH,
 	       "branch count");
-	// Additionally check that those counters overflowed properly.
-	status = rdmsr(pmu.msr_global_status);
-	report(status & 1, "branch counter overflow");
-	report(status & 2, "instruction counter overflow");
+	if (this_cpu_has_perf_global_status()) {
+		// Additionally check that those counters overflowed properly.
+		status = rdmsr(pmu.msr_global_status);
+		report(status & 1, "branch counter overflow");
+		report(status & 2, "instruction counter overflow");
+	}
 
 	report_prefix_pop();
 }
@@ -585,7 +604,8 @@ static void set_ref_cycle_expectations(void)
 	if (!pmu.nr_gp_counters || !pmu_gp_counter_is_available(2))
 		return;
 
-	wrmsr(pmu.msr_global_ctl, 0);
+	if (this_cpu_has_perf_global_ctrl())
+		wrmsr(pmu.msr_global_ctl, 0);
 
 	t0 = fenced_rdtsc();
 	start_event(&cnt);
@@ -636,11 +656,6 @@ int main(int ac, char **av)
 		return report_summary();
 	}
 
-	if (pmu.version == 1) {
-		report_skip("PMU version 1 is not supported.");
-		return report_summary();
-	}
-
 	set_ref_cycle_expectations();
 
 	printf("PMU version:         %d\n", pmu.version);
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 24/27] x86/pmu: Add gp_events pointer to route different event tables
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (22 preceding siblings ...)
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 23/27] x86/pmu: Add global helpers to cover Intel Arch PMU Version 1 Sean Christopherson
@ 2022-11-02 22:51 ` Sean Christopherson
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 25/27] x86/pmu: Add pmu_caps flag to track if CPU is Intel (versus AMD) Sean Christopherson
                   ` (4 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:51 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

AMD and Intel do not share the same encodings for performance events.
Code that tests a given performance event can be reused by pointing
gp_events at a different encoding table; note that the table size also
needs to be updated.
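
For illustration, the follow-up AMD patch can then route the pointer at a
vendor-specific table along these lines (the event encodings and count
bounds shown here are only a sketch; the real table arrives with the AMD
enabling patch later in the series):

	static struct pmu_event amd_gp_events[] = {
		{"core cycles",   0x0076, 1*N,  50*N},
		{"instructions",  0x00c0, 10*N, 10.2*N},
		{"branches",      0x00c2, 1*N,  1.1*N},
		{"branch misses", 0x00c3, 0,    0.1*N},
	};

	gp_events = (struct pmu_event *)amd_gp_events;
	gp_events_size = sizeof(amd_gp_events)/sizeof(amd_gp_events[0]);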

Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 x86/pmu.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index 7f200658..c40e2a96 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -30,7 +30,7 @@ struct pmu_event {
 	uint32_t unit_sel;
 	int min;
 	int max;
-} gp_events[] = {
+} intel_gp_events[] = {
 	{"core cycles", 0x003c, 1*N, 50*N},
 	{"instructions", 0x00c0, 10*N, 10.2*N},
 	{"ref cycles", 0x013c, 1*N, 30*N},
@@ -46,6 +46,9 @@ struct pmu_event {
 
 char *buf;
 
+static struct pmu_event *gp_events;
+static unsigned int gp_events_size;
+
 static inline void loop(void)
 {
 	unsigned long tmp, tmp2, tmp3;
@@ -91,7 +94,7 @@ static struct pmu_event* get_counter_event(pmu_counter_t *cnt)
 	if (is_gp(cnt)) {
 		int i;
 
-		for (i = 0; i < sizeof(gp_events)/sizeof(gp_events[0]); i++)
+		for (i = 0; i < gp_events_size; i++)
 			if (gp_events[i].unit_sel == (cnt->config & 0xffff))
 				return &gp_events[i];
 	} else
@@ -213,7 +216,7 @@ static void check_gp_counters(void)
 {
 	int i;
 
-	for (i = 0; i < sizeof(gp_events)/sizeof(gp_events[0]); i++)
+	for (i = 0; i < gp_events_size; i++)
 		if (pmu_gp_counter_is_available(i))
 			check_gp_counter(&gp_events[i]);
 		else
@@ -246,7 +249,7 @@ static void check_counters_many(void)
 
 		cnt[n].ctr = MSR_GP_COUNTERx(n);
 		cnt[n].config = EVNTSEL_OS | EVNTSEL_USR |
-			gp_events[i % ARRAY_SIZE(gp_events)].unit_sel;
+			gp_events[i % gp_events_size].unit_sel;
 		n++;
 	}
 	for (i = 0; i < pmu.nr_fixed_counters; i++) {
@@ -595,7 +598,7 @@ static void set_ref_cycle_expectations(void)
 {
 	pmu_counter_t cnt = {
 		.ctr = MSR_IA32_PERFCTR0,
-		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[2].unit_sel,
+		.config = EVNTSEL_OS | EVNTSEL_USR | intel_gp_events[2].unit_sel,
 	};
 	uint64_t tsc_delta;
 	uint64_t t0, t1, t2, t3;
@@ -631,8 +634,8 @@ static void set_ref_cycle_expectations(void)
 	if (!tsc_delta)
 		return;
 
-	gp_events[2].min = (gp_events[2].min * cnt.count) / tsc_delta;
-	gp_events[2].max = (gp_events[2].max * cnt.count) / tsc_delta;
+	intel_gp_events[2].min = (intel_gp_events[2].min * cnt.count) / tsc_delta;
+	intel_gp_events[2].max = (intel_gp_events[2].max * cnt.count) / tsc_delta;
 }
 
 static void check_invalid_rdpmc_gp(void)
@@ -656,6 +659,8 @@ int main(int ac, char **av)
 		return report_summary();
 	}
 
+	gp_events = (struct pmu_event *)intel_gp_events;
+	gp_events_size = sizeof(intel_gp_events)/sizeof(intel_gp_events[0]);
 	set_ref_cycle_expectations();
 
 	printf("PMU version:         %d\n", pmu.version);
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 25/27] x86/pmu: Add pmu_caps flag to track if CPU is Intel (versus AMD)
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (23 preceding siblings ...)
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 24/27] x86/pmu: Add gp_events pointer to route different event tables Sean Christopherson
@ 2022-11-02 22:51 ` Sean Christopherson
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 26/27] x86/pmu: Update testcases to cover AMD PMU Sean Christopherson
                   ` (3 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:51 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

Add a flag to track whether the PMU is backed by an Intel CPU.  Future
support for AMD will sadly need to constantly check whether the PMU is
Intel or AMD, and invoking is_intel() every time is rather expensive,
as it requires executing CPUID (a VM-Exit) and a string comparison.
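
For reference, the difference boils down to something like the sketch
below (the is_intel() body is paraphrased here and may not match
lib/x86/processor.h verbatim):

	/* Every call re-executes CPUID (a VM-Exit) plus a strcmp(). */
	static inline bool is_intel(void)
	{
		struct cpuid c = cpuid(0);
		u32 name[4] = {c.b, c.d, c.c, 0};

		return strcmp((char *)name, "GenuineIntel") == 0;
	}

	/* With the cached flag, hot paths reduce to a single load: */
	if (pmu.is_intel)
		intel_only_setup();	/* hypothetical Intel-only path */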

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 lib/x86/pmu.c  | 5 +++++
 lib/x86/pmu.h  | 1 +
 x86/pmu_lbr.c  | 2 +-
 x86/pmu_pebs.c | 2 +-
 4 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/lib/x86/pmu.c b/lib/x86/pmu.c
index ea4859df..837d2a6c 100644
--- a/lib/x86/pmu.c
+++ b/lib/x86/pmu.c
@@ -6,6 +6,11 @@ void pmu_init(void)
 {
 	struct cpuid cpuid_10 = cpuid(10);
 
+	pmu.is_intel = is_intel();
+
+	if (!pmu.is_intel)
+		return;
+
 	pmu.version = cpuid_10.a & 0xff;
 
 	if (pmu.version > 1) {
diff --git a/lib/x86/pmu.h b/lib/x86/pmu.h
index e2c0bdf4..460e2a19 100644
--- a/lib/x86/pmu.h
+++ b/lib/x86/pmu.h
@@ -48,6 +48,7 @@
 #define MAX_NUM_LBR_ENTRY	32
 
 struct pmu_caps {
+	bool is_intel;
 	u8 version;
 	u8 nr_fixed_counters;
 	u8 fixed_counter_width;
diff --git a/x86/pmu_lbr.c b/x86/pmu_lbr.c
index 36c9a8fa..40b63fa3 100644
--- a/x86/pmu_lbr.c
+++ b/x86/pmu_lbr.c
@@ -47,7 +47,7 @@ int main(int ac, char **av)
 
 	setup_vm();
 
-	if (!is_intel()) {
+	if (!pmu.is_intel) {
 		report_skip("PMU_LBR test is for intel CPU's only");
 		return report_summary();
 	}
diff --git a/x86/pmu_pebs.c b/x86/pmu_pebs.c
index 3b6bcb2c..894ae6c7 100644
--- a/x86/pmu_pebs.c
+++ b/x86/pmu_pebs.c
@@ -392,7 +392,7 @@ int main(int ac, char **av)
 	if (pmu_has_full_writes())
 		pmu_activate_full_writes();
 
-	if (!is_intel()) {
+	if (!pmu.is_intel) {
 		report_skip("PEBS requires Intel ICX or later, non-Intel detected");
 		return report_summary();
 	} else if (!pmu_has_pebs()) {
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 26/27] x86/pmu: Update testcases to cover AMD PMU
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (24 preceding siblings ...)
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 25/27] x86/pmu: Add pmu_caps flag to track if CPU is Intel (versus AMD) Sean Christopherson
@ 2022-11-02 22:51 ` Sean Christopherson
  2022-11-08  9:53   ` Paolo Bonzini
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 27/27] x86/pmu: Add AMD Guest PerfMonV2 testcases Sean Christopherson
                   ` (2 subsequent siblings)
  28 siblings, 1 reply; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:51 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

The AMD core PMU before Zen4 has no version number and no fixed
counters; the number of general-purpose counters and their bit-width
are hard-coded, and only hardware events that are common across AMD
generations (starting with K7) are added to the amd_gp_events[] table.

All of the above differences are handled at the detection step, which
also covers the legacy K7 PMU registers so that behavior stays
consistent with bare metal.
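
Note that the Fam15h "core" PMU MSRs come in interleaved select/counter
pairs, which is why MSR_GP_COUNTERx() and MSR_GP_EVENT_SELECTx() gain a
stride of 2 for the MSR_F15H_* bases.  A rough sketch for a counter
index i, using the addresses from the new msr.h defines below:

	/*
	 * 0xc0010200 PERF_CTL0, 0xc0010201 PERF_CTR0,
	 * 0xc0010202 PERF_CTL1, 0xc0010203 PERF_CTR1, ...
	 */
	u32 sel = MSR_F15H_PERF_CTL0 + 2 * i;	/* event select for counter i */
	u32 ctr = MSR_F15H_PERF_CTR0 + 2 * i;	/* counter value for counter i */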

Cc: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 lib/x86/msr.h       | 17 +++++++++++++
 lib/x86/pmu.c       | 59 ++++++++++++++++++++++++++++-----------------
 lib/x86/pmu.h       | 13 +++++++++-
 lib/x86/processor.h |  1 +
 x86/pmu.c           | 58 +++++++++++++++++++++++++++++++++++---------
 5 files changed, 114 insertions(+), 34 deletions(-)

diff --git a/lib/x86/msr.h b/lib/x86/msr.h
index 68d88371..6cf8f336 100644
--- a/lib/x86/msr.h
+++ b/lib/x86/msr.h
@@ -146,6 +146,23 @@
 #define FAM10H_MMIO_CONF_BASE_SHIFT	20
 #define MSR_FAM10H_NODE_ID		0xc001100c
 
+/* Fam 15h MSRs */
+#define MSR_F15H_PERF_CTL              0xc0010200
+#define MSR_F15H_PERF_CTL0             MSR_F15H_PERF_CTL
+#define MSR_F15H_PERF_CTL1             (MSR_F15H_PERF_CTL + 2)
+#define MSR_F15H_PERF_CTL2             (MSR_F15H_PERF_CTL + 4)
+#define MSR_F15H_PERF_CTL3             (MSR_F15H_PERF_CTL + 6)
+#define MSR_F15H_PERF_CTL4             (MSR_F15H_PERF_CTL + 8)
+#define MSR_F15H_PERF_CTL5             (MSR_F15H_PERF_CTL + 10)
+
+#define MSR_F15H_PERF_CTR              0xc0010201
+#define MSR_F15H_PERF_CTR0             MSR_F15H_PERF_CTR
+#define MSR_F15H_PERF_CTR1             (MSR_F15H_PERF_CTR + 2)
+#define MSR_F15H_PERF_CTR2             (MSR_F15H_PERF_CTR + 4)
+#define MSR_F15H_PERF_CTR3             (MSR_F15H_PERF_CTR + 6)
+#define MSR_F15H_PERF_CTR4             (MSR_F15H_PERF_CTR + 8)
+#define MSR_F15H_PERF_CTR5             (MSR_F15H_PERF_CTR + 10)
+
 /* K8 MSRs */
 #define MSR_K8_TOP_MEM1			0xc001001a
 #define MSR_K8_TOP_MEM2			0xc001001d
diff --git a/lib/x86/pmu.c b/lib/x86/pmu.c
index 837d2a6c..090e1115 100644
--- a/lib/x86/pmu.c
+++ b/lib/x86/pmu.c
@@ -4,36 +4,51 @@ struct pmu_caps pmu;
 
 void pmu_init(void)
 {
-	struct cpuid cpuid_10 = cpuid(10);
-
 	pmu.is_intel = is_intel();
 
-	if (!pmu.is_intel)
-		return;
+	if (pmu.is_intel) {
+		struct cpuid cpuid_10 = cpuid(10);
 
-	pmu.version = cpuid_10.a & 0xff;
+		pmu.version = cpuid_10.a & 0xff;
 
-	if (pmu.version > 1) {
-		pmu.nr_fixed_counters = cpuid_10.d & 0x1f;
-		pmu.fixed_counter_width = (cpuid_10.d >> 5) & 0xff;
-	}
+		if (pmu.version > 1) {
+			pmu.nr_fixed_counters = cpuid_10.d & 0x1f;
+			pmu.fixed_counter_width = (cpuid_10.d >> 5) & 0xff;
+		}
 
-	pmu.nr_gp_counters = (cpuid_10.a >> 8) & 0xff;
-	pmu.gp_counter_width = (cpuid_10.a >> 16) & 0xff;
-	pmu.gp_counter_mask_length = (cpuid_10.a >> 24) & 0xff;
-	/* CPUID.0xA.EBX bit is '1' if a counter is NOT available. */
-	pmu.gp_counter_available = ~cpuid_10.b;
+		pmu.nr_gp_counters = (cpuid_10.a >> 8) & 0xff;
+		pmu.gp_counter_width = (cpuid_10.a >> 16) & 0xff;
+		pmu.gp_counter_mask_length = (cpuid_10.a >> 24) & 0xff;
 
-	if (this_cpu_has(X86_FEATURE_PDCM))
-		pmu.perf_cap = rdmsr(MSR_IA32_PERF_CAPABILITIES);
-	pmu.msr_gp_counter_base = MSR_IA32_PERFCTR0;
-	pmu.msr_gp_event_select_base = MSR_P6_EVNTSEL0;
+		/* CPUID.0xA.EBX bit is '1' if a counter is NOT available. */
+		pmu.gp_counter_available = ~cpuid_10.b;
 
-	if (this_cpu_has_perf_global_status()) {
-		pmu.msr_global_status = MSR_CORE_PERF_GLOBAL_STATUS;
-		pmu.msr_global_ctl = MSR_CORE_PERF_GLOBAL_CTRL;
-		pmu.msr_global_status_clr = MSR_CORE_PERF_GLOBAL_OVF_CTRL;
+		if (this_cpu_has(X86_FEATURE_PDCM))
+			pmu.perf_cap = rdmsr(MSR_IA32_PERF_CAPABILITIES);
+		pmu.msr_gp_counter_base = MSR_IA32_PERFCTR0;
+		pmu.msr_gp_event_select_base = MSR_P6_EVNTSEL0;
+
+		if (this_cpu_has_perf_global_status()) {
+			pmu.msr_global_status = MSR_CORE_PERF_GLOBAL_STATUS;
+			pmu.msr_global_ctl = MSR_CORE_PERF_GLOBAL_CTRL;
+			pmu.msr_global_status_clr = MSR_CORE_PERF_GLOBAL_OVF_CTRL;
+		}
+	} else {
+		pmu.msr_gp_counter_base = MSR_F15H_PERF_CTR0;
+		pmu.msr_gp_event_select_base = MSR_F15H_PERF_CTL0;
+		if (!this_cpu_has(X86_FEATURE_PERFCTR_CORE))
+			pmu.nr_gp_counters = AMD64_NUM_COUNTERS;
+		else
+			pmu.nr_gp_counters = AMD64_NUM_COUNTERS_CORE;
+
+		pmu.gp_counter_width = PMC_DEFAULT_WIDTH;
+		pmu.gp_counter_mask_length = pmu.nr_gp_counters;
+		pmu.gp_counter_available = (1u << pmu.nr_gp_counters) - 1;
 	}
 
 	pmu_reset_all_counters();
diff --git a/lib/x86/pmu.h b/lib/x86/pmu.h
index 460e2a19..8465e3c9 100644
--- a/lib/x86/pmu.h
+++ b/lib/x86/pmu.h
@@ -10,6 +10,11 @@
 /* Performance Counter Vector for the LVT PC Register */
 #define PMI_VECTOR	32
 
+#define AMD64_NUM_COUNTERS	4
+#define AMD64_NUM_COUNTERS_CORE	6
+
+#define PMC_DEFAULT_WIDTH	48
+
 #define DEBUGCTLMSR_LBR	  (1UL <<  0)
 
 #define PMU_CAP_LBR_FMT	  0x3f
@@ -72,17 +77,23 @@ void pmu_init(void);
 
 static inline u32 MSR_GP_COUNTERx(unsigned int i)
 {
+	if (pmu.msr_gp_counter_base == MSR_F15H_PERF_CTR0)
+		return pmu.msr_gp_counter_base + 2 * i;
+
 	return pmu.msr_gp_counter_base + i;
 }
 
 static inline u32 MSR_GP_EVENT_SELECTx(unsigned int i)
 {
+	if (pmu.msr_gp_event_select_base == MSR_F15H_PERF_CTL0)
+		return pmu.msr_gp_event_select_base + 2 * i;
+
 	return pmu.msr_gp_event_select_base + i;
 }
 
 static inline bool this_cpu_has_pmu(void)
 {
-	return !!pmu.version;
+	return !pmu.is_intel || !!pmu.version;
 }
 
 static inline bool this_cpu_has_perf_global_ctrl(void)
diff --git a/lib/x86/processor.h b/lib/x86/processor.h
index c0716663..681e1675 100644
--- a/lib/x86/processor.h
+++ b/lib/x86/processor.h
@@ -252,6 +252,7 @@ static inline bool is_intel(void)
  * Extended Leafs, a.k.a. AMD defined
  */
 #define	X86_FEATURE_SVM			(CPUID(0x80000001, 0, ECX, 2))
+#define	X86_FEATURE_PERFCTR_CORE	(CPUID(0x80000001, 0, ECX, 23))
 #define	X86_FEATURE_NX			(CPUID(0x80000001, 0, EDX, 20))
 #define	X86_FEATURE_GBPAGES		(CPUID(0x80000001, 0, EDX, 26))
 #define	X86_FEATURE_RDTSCP		(CPUID(0x80000001, 0, EDX, 27))
diff --git a/x86/pmu.c b/x86/pmu.c
index c40e2a96..72c2c9cf 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -38,6 +38,11 @@ struct pmu_event {
 	{"llc misses", 0x412e, 1, 1*N},
 	{"branches", 0x00c4, 1*N, 1.1*N},
 	{"branch misses", 0x00c5, 0, 0.1*N},
+}, amd_gp_events[] = {
+	{"core cycles", 0x0076, 1*N, 50*N},
+	{"instructions", 0x00c0, 10*N, 10.2*N},
+	{"branches", 0x00c2, 1*N, 1.1*N},
+	{"branch misses", 0x00c3, 0, 0.1*N},
 }, fixed_events[] = {
 	{"fixed 1", MSR_CORE_PERF_FIXED_CTR0, 10*N, 10.2*N},
 	{"fixed 2", MSR_CORE_PERF_FIXED_CTR0 + 1, 1*N, 30*N},
@@ -79,14 +84,23 @@ static bool check_irq(void)
 
 static bool is_gp(pmu_counter_t *evt)
 {
+	if (!pmu.is_intel)
+		return true;
+
 	return evt->ctr < MSR_CORE_PERF_FIXED_CTR0 ||
 		evt->ctr >= MSR_IA32_PMC0;
 }
 
 static int event_to_global_idx(pmu_counter_t *cnt)
 {
-	return cnt->ctr - (is_gp(cnt) ? pmu.msr_gp_counter_base :
-		(MSR_CORE_PERF_FIXED_CTR0 - FIXED_CNT_INDEX));
+	if (pmu.is_intel)
+		return cnt->ctr - (is_gp(cnt) ? pmu.msr_gp_counter_base :
+			(MSR_CORE_PERF_FIXED_CTR0 - FIXED_CNT_INDEX));
+
+	if (pmu.msr_gp_counter_base == MSR_F15H_PERF_CTR0)
+		return (cnt->ctr - pmu.msr_gp_counter_base) / 2;
+	else
+		return cnt->ctr - pmu.msr_gp_counter_base;
 }
 
 static struct pmu_event* get_counter_event(pmu_counter_t *cnt)
@@ -306,6 +320,9 @@ static void check_counter_overflow(void)
 			cnt.count &= (1ull << pmu.gp_counter_width) - 1;
 
 		if (i == pmu.nr_gp_counters) {
+			if (!pmu.is_intel)
+				break;
+
 			cnt.ctr = fixed_events[0].unit_sel;
 			cnt.count = measure_for_overflow(&cnt);
 			cnt.count &= (1ull << pmu.gp_counter_width) - 1;
@@ -319,7 +336,10 @@ static void check_counter_overflow(void)
 			cnt.config &= ~EVNTSEL_INT;
 		idx = event_to_global_idx(&cnt);
 		__measure(&cnt, cnt.count);
-		report(cnt.count == 1, "cntr-%d", i);
+		if (pmu.is_intel)
+			report(cnt.count == 1, "cntr-%d", i);
+		else
+			report(cnt.count == 0xffffffffffff || cnt.count < 7, "cntr-%d", i);
 
 		if (!this_cpu_has_perf_global_status())
 			continue;
@@ -457,10 +477,11 @@ static void check_running_counter_wrmsr(void)
 static void check_emulated_instr(void)
 {
 	uint64_t status, instr_start, brnch_start;
+	unsigned int branch_idx = pmu.is_intel ? 5 : 2;
 	pmu_counter_t brnch_cnt = {
 		.ctr = MSR_GP_COUNTERx(0),
 		/* branch instructions */
-		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[5].unit_sel,
+		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[branch_idx].unit_sel,
 	};
 	pmu_counter_t instr_cnt = {
 		.ctr = MSR_GP_COUNTERx(1),
@@ -654,15 +675,21 @@ int main(int ac, char **av)
 
 	check_invalid_rdpmc_gp();
 
-	if (!pmu.version) {
-		report_skip("No Intel Arch PMU is detected!");
-		return report_summary();
+	if (pmu.is_intel) {
+		if (!pmu.version) {
+			report_skip("No Intel Arch PMU is detected!");
+			return report_summary();
+		}
+		gp_events = (struct pmu_event *)intel_gp_events;
+		gp_events_size = sizeof(intel_gp_events)/sizeof(intel_gp_events[0]);
+		report_prefix_push("Intel");
+		set_ref_cycle_expectations();
+	} else {
+		gp_events_size = sizeof(amd_gp_events)/sizeof(amd_gp_events[0]);
+		gp_events = (struct pmu_event *)amd_gp_events;
+		report_prefix_push("AMD");
 	}
 
-	gp_events = (struct pmu_event *)intel_gp_events;
-	gp_events_size = sizeof(intel_gp_events)/sizeof(intel_gp_events[0]);
-	set_ref_cycle_expectations();
-
 	printf("PMU version:         %d\n", pmu.version);
 	printf("GP counters:         %d\n", pmu.nr_gp_counters);
 	printf("GP counter width:    %d\n", pmu.gp_counter_width);
@@ -683,5 +710,14 @@ int main(int ac, char **av)
 		report_prefix_pop();
 	}
 
+	if (!pmu.is_intel) {
+		report_prefix_push("K7");
+		pmu.nr_gp_counters = AMD64_NUM_COUNTERS;
+		pmu.msr_gp_counter_base = MSR_K7_PERFCTR0;
+		pmu.msr_gp_event_select_base = MSR_K7_EVNTSEL0;
+		check_counters();
+		report_prefix_pop();
+	}
+
 	return report_summary();
 }
-- 
2.38.1.431.g37b22c650d-goog



* [kvm-unit-tests PATCH v5 27/27] x86/pmu: Add AMD Guest PerfMonV2 testcases
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (25 preceding siblings ...)
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 26/27] x86/pmu: Update testcases to cover AMD PMU Sean Christopherson
@ 2022-11-02 22:51 ` Sean Christopherson
  2022-11-07  7:02 ` [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Like Xu
  2022-11-07 17:50 ` Paolo Bonzini
  28 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-02 22:51 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Sean Christopherson, Like Xu, Sandipan Das

From: Like Xu <likexu@tencent.com>

Update the test cases to cover KVM's enabling code for AMD Guest
PerfMonV2.

The previously Intel-only PMU helpers now also check the AMD CPUID
leaves, and the AMD MSRs that carry the same semantics are assigned
during the initialization phase, so the vast majority of pmu test cases
are reused seamlessly.

On some AMD machines, repeatedly measuring the same workload with a
retired event still produces erratic counts.  This reflects details of
the hardware implementation; from a software perspective the event is
imprecise, which is why the counter overflow testcases apply a
tolerance rather than requiring an exact count.
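
Concretely, PerfMonV2 detection keys off CPUID leaf 0x80000022: EAX[0]
advertises the feature, EBX[3:0] reports the number of core counters,
and the global control/status MSRs live at 0xc0000300..0xc0000302.  A
rough sketch of the flow the existing global-ctrl testcases end up
exercising for a counter index i, assuming the same bit-per-counter
layout as the Intel global MSRs:

	u64 status;

	wrmsr(MSR_AMD64_PERF_CNTR_GLOBAL_CTL, 1ull << i);	/* enable counter i */
	/* ... run the measured workload ... */
	status = rdmsr(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS);	/* overflow bits */
	wrmsr(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR, status);	/* acknowledge/clear */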

Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 lib/x86/msr.h       |  5 +++++
 lib/x86/pmu.c       | 14 +++++++++++++-
 lib/x86/processor.h |  2 +-
 3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/lib/x86/msr.h b/lib/x86/msr.h
index 6cf8f336..c9869be5 100644
--- a/lib/x86/msr.h
+++ b/lib/x86/msr.h
@@ -426,6 +426,11 @@
 #define MSR_CORE_PERF_GLOBAL_CTRL	0x0000038f
 #define MSR_CORE_PERF_GLOBAL_OVF_CTRL	0x00000390
 
+/* AMD Performance Counter Global Status and Control MSRs */
+#define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS	0xc0000300
+#define MSR_AMD64_PERF_CNTR_GLOBAL_CTL		0xc0000301
+#define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR	0xc0000302
+
 /* Geode defined MSRs */
 #define MSR_GEODE_BUSCONT_CONF0		0x00001900
 
diff --git a/lib/x86/pmu.c b/lib/x86/pmu.c
index 090e1115..af68f3a8 100644
--- a/lib/x86/pmu.c
+++ b/lib/x86/pmu.c
@@ -39,9 +39,15 @@ void pmu_init(void)
 			pmu.msr_global_status_clr = MSR_CORE_PERF_GLOBAL_OVF_CTRL;
 		}
 	} else {
+		/* Performance Monitoring Version 2 Supported */
+		if (this_cpu_has(X86_FEATURE_AMD_PMU_V2))
+			pmu.version = 2;
+
 		pmu.msr_gp_counter_base = MSR_F15H_PERF_CTR0;
 		pmu.msr_gp_event_select_base = MSR_F15H_PERF_CTL0;
-		if (!this_cpu_has(X86_FEATURE_PERFCTR_CORE))
+		if (this_cpu_has(X86_FEATURE_AMD_PMU_V2))
+			pmu.nr_gp_counters = cpuid(0x80000022).b & 0xf;
+		else if (!this_cpu_has(X86_FEATURE_PERFCTR_CORE))
 			pmu.nr_gp_counters = AMD64_NUM_COUNTERS;
 		else
 			pmu.nr_gp_counters = AMD64_NUM_COUNTERS_CORE;
@@ -49,6 +55,12 @@ void pmu_init(void)
 		pmu.gp_counter_width = PMC_DEFAULT_WIDTH;
 		pmu.gp_counter_mask_length = pmu.nr_gp_counters;
 		pmu.gp_counter_available = (1u << pmu.nr_gp_counters) - 1;
+
+		if (this_cpu_has_perf_global_status()) {
+			pmu.msr_global_status = MSR_AMD64_PERF_CNTR_GLOBAL_STATUS;
+			pmu.msr_global_ctl = MSR_AMD64_PERF_CNTR_GLOBAL_CTL;
+			pmu.msr_global_status_clr = MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR;
+		}
 	}
 
 	pmu_reset_all_counters();
diff --git a/lib/x86/processor.h b/lib/x86/processor.h
index 681e1675..72bdc833 100644
--- a/lib/x86/processor.h
+++ b/lib/x86/processor.h
@@ -266,7 +266,7 @@ static inline bool is_intel(void)
 #define X86_FEATURE_PAUSEFILTER		(CPUID(0x8000000A, 0, EDX, 10))
 #define X86_FEATURE_PFTHRESHOLD		(CPUID(0x8000000A, 0, EDX, 12))
 #define	X86_FEATURE_VGIF		(CPUID(0x8000000A, 0, EDX, 16))
-
+#define	X86_FEATURE_AMD_PMU_V2		(CPUID(0x80000022, 0, EAX, 0))
 
 static inline bool this_cpu_has(u64 feature)
 {
-- 
2.38.1.431.g37b22c650d-goog



* Re: [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (26 preceding siblings ...)
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 27/27] x86/pmu: Add AMD Guest PerfMonV2 testcases Sean Christopherson
@ 2022-11-07  7:02 ` Like Xu
  2022-11-07 17:50 ` Paolo Bonzini
  28 siblings, 0 replies; 34+ messages in thread
From: Like Xu @ 2022-11-07  7:02 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm, Sandipan Das, Paolo Bonzini

On 3/11/2022 6:50 am, Sean Christopherson wrote:
> There are no major changes in the test logic. The big cleanups are to add
> lib/x86/pmu.[c,h] and a global PMU capabilities struct to improve
> readability of the code and to hide some AMD vs. Intel details.

The extra patch helps. Thank you.

> 
> Like's v4 was tested on AMD Zen3/4 and Intel ICX/SPR machines, but this
> version has only been tested on AMD Zen3 (Milan) and Intel ICX and HSW,
> i.e. I haven't tested AMD PMU v2 or anything new in SPR (if there is
> anything in SPR?).

V5 tests passed on AMD Zen 4 (AMD PMU v2) and Intel SPR (for PEBS).
Please move forward.


* Re: [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions
  2022-11-02 22:50 [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Sean Christopherson
                   ` (27 preceding siblings ...)
  2022-11-07  7:02 ` [kvm-unit-tests PATCH v5 00/27] x86/pmu: Test case optimization, fixes and additions Like Xu
@ 2022-11-07 17:50 ` Paolo Bonzini
  28 siblings, 0 replies; 34+ messages in thread
From: Paolo Bonzini @ 2022-11-07 17:50 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm, Like Xu, Sandipan Das

On 11/2/22 23:50, Sean Christopherson wrote:
> This series is a big pile of PMU cleanups and enhancements from Like.
> 
> The changes are roughly divided into three parts: (1) fixes (2) cleanups,
> and (3) new test cases.  The changes are bundled in a mega-series as the
> original, separate series was difficult to review/manage due to a number
> of dependencies.
> 
> There are no major changes in the test logic. The big cleanups are to add
> lib/x86/pmu.[c,h] and a global PMU capabilities struct to improve
> readability of the code and to hide some AMD vs. Intel details.
> 
> Like's v4 was tested on AMD Zen3/4 and Intel ICX/SPR machines, but this
> version has only been tested on AMD Zen3 (Milan) and Intel ICX and HSW,
> i.e. I haven't tested AMD PMU v2 or anything new in SPR (if there is
> anything in SPR?).
> [...]

Applied, thanks.

Paolo



* Re: [kvm-unit-tests PATCH v5 26/27] x86/pmu: Update testcases to cover AMD PMU
  2022-11-02 22:51 ` [kvm-unit-tests PATCH v5 26/27] x86/pmu: Update testcases to cover AMD PMU Sean Christopherson
@ 2022-11-08  9:53   ` Paolo Bonzini
  2022-11-09  0:52     ` Sean Christopherson
  0 siblings, 1 reply; 34+ messages in thread
From: Paolo Bonzini @ 2022-11-08  9:53 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm, Like Xu, Sandipan Das

On 11/2/22 23:51, Sean Christopherson wrote:
> +		pmu.msr_gp_counter_base = MSR_F15H_PERF_CTR0;
> +		pmu.msr_gp_event_select_base = MSR_F15H_PERF_CTL0;
> +		if (!this_cpu_has(X86_FEATURE_PERFCTR_CORE))
> +			pmu.nr_gp_counters = AMD64_NUM_COUNTERS;
> +		else
> +			pmu.nr_gp_counters = AMD64_NUM_COUNTERS_CORE;
> +

If X86_FEATURE_PERFCTR_CORE is not set, pmu.msr_gp_*_base should point 
to MSR_K7_PERFCTR0/MSR_K7_EVNTSEL0:

diff --git a/lib/x86/pmu.c b/lib/x86/pmu.c
index af68f3a..8d5f69f 100644
--- a/lib/x86/pmu.c
+++ b/lib/x86/pmu.c
@@ -47,10 +47,13 @@ void pmu_init(void)
  		pmu.msr_gp_event_select_base = MSR_F15H_PERF_CTL0;
  		if (this_cpu_has(X86_FEATURE_AMD_PMU_V2))
  			pmu.nr_gp_counters = cpuid(0x80000022).b & 0xf;
-		else if (!this_cpu_has(X86_FEATURE_PERFCTR_CORE))
-			pmu.nr_gp_counters = AMD64_NUM_COUNTERS;
-		else
+		else if (this_cpu_has(X86_FEATURE_PERFCTR_CORE))
  			pmu.nr_gp_counters = AMD64_NUM_COUNTERS_CORE;
+		else {
+			pmu.nr_gp_counters = AMD64_NUM_COUNTERS;
+			pmu.msr_gp_counter_base = MSR_K7_PERFCTR0;
+			pmu.msr_gp_event_select_base = MSR_K7_EVNTSEL0;
+		}

  		pmu.gp_counter_width = PMC_DEFAULT_WIDTH;
  		pmu.gp_counter_mask_length = pmu.nr_gp_counters;
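
This keeps the existing stride-1 MSR_GP_COUNTERx()/MSR_GP_EVENT_SELECTx()
arithmetic working, because the legacy K7 MSRs are contiguous, unlike the
interleaved Fam15h pairs; roughly:

	MSR_K7_EVNTSEL0..3: 0xc0010000 + i
	MSR_K7_PERFCTR0..3: 0xc0010004 + i

so only the MSR_F15H_* bases need the "2 * i" special case.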

Paolo



* Re: [kvm-unit-tests PATCH v5 26/27] x86/pmu: Update testcases to cover AMD PMU
  2022-11-08  9:53   ` Paolo Bonzini
@ 2022-11-09  0:52     ` Sean Christopherson
  0 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-11-09  0:52 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, Like Xu, Sandipan Das

On Tue, Nov 08, 2022, Paolo Bonzini wrote:
> On 11/2/22 23:51, Sean Christopherson wrote:
> > +		pmu.msr_gp_counter_base = MSR_F15H_PERF_CTR0;
> > +		pmu.msr_gp_event_select_base = MSR_F15H_PERF_CTL0;
> > +		if (!this_cpu_has(X86_FEATURE_PERFCTR_CORE))
> > +			pmu.nr_gp_counters = AMD64_NUM_COUNTERS;
> > +		else
> > +			pmu.nr_gp_counters = AMD64_NUM_COUNTERS_CORE;
> > +
> 
> If X86_FEATURE_PERFCTR_CORE is not set, pmu.msr_gp_*_base should point to
> MSR_K7_PERFCTR0/MSR_K7_EVNTSEL0:

/facepalm

I only ran the PMU tests, which all pass through the relevant host CPUID.

Glad you debugged this; all tests were failing on Milan because of it.  I
shudder to think about how long it would have taken me to figure this out.

Thanks!

> diff --git a/lib/x86/pmu.c b/lib/x86/pmu.c
> index af68f3a..8d5f69f 100644
> --- a/lib/x86/pmu.c
> +++ b/lib/x86/pmu.c
> @@ -47,10 +47,13 @@ void pmu_init(void)
>  		pmu.msr_gp_event_select_base = MSR_F15H_PERF_CTL0;
>  		if (this_cpu_has(X86_FEATURE_AMD_PMU_V2))
>  			pmu.nr_gp_counters = cpuid(0x80000022).b & 0xf;
> -		else if (!this_cpu_has(X86_FEATURE_PERFCTR_CORE))
> -			pmu.nr_gp_counters = AMD64_NUM_COUNTERS;
> -		else
> +		else if (this_cpu_has(X86_FEATURE_PERFCTR_CORE))

Nit, the if and else-if statements should also have braces.

>  			pmu.nr_gp_counters = AMD64_NUM_COUNTERS_CORE;
> +		else {
> +			pmu.nr_gp_counters = AMD64_NUM_COUNTERS;
> +			pmu.msr_gp_counter_base = MSR_K7_PERFCTR0;
> +			pmu.msr_gp_event_select_base = MSR_K7_EVNTSEL0;
> +		}
> 
>  		pmu.gp_counter_width = PMC_DEFAULT_WIDTH;
>  		pmu.gp_counter_mask_length = pmu.nr_gp_counters;
> 
> Paolo
> 


* Re: [kvm-unit-tests PATCH v5 11/27] x86/pmu: Update rdpmc testcase to cover #GP path
  2022-11-02 22:50 ` [kvm-unit-tests PATCH v5 11/27] x86/pmu: Update rdpmc testcase to cover #GP path Sean Christopherson
@ 2022-11-24 11:33   ` Thomas Huth
  2022-11-24 11:52     ` Like Xu
  0 siblings, 1 reply; 34+ messages in thread
From: Thomas Huth @ 2022-11-24 11:33 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, Like Xu, Sandipan Das

On 02/11/2022 23.50, Sean Christopherson wrote:
> From: Like Xu <likexu@tencent.com>
> 
> Specifying an unsupported PMC encoding will cause a #GP(0).
> 
> There are multiple reasons RDPMC can #GP, the one that is being relied
> on to guarantee #GP is specifically that the PMC is invalid. The most
> extensible solution is to provide a safe variant.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Like Xu <likexu@tencent.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>   lib/x86/processor.h | 21 ++++++++++++++++++---
>   x86/pmu.c           | 10 ++++++++++
>   2 files changed, 28 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/x86/processor.h b/lib/x86/processor.h
> index f85abe36..ba14c7a0 100644
> --- a/lib/x86/processor.h
> +++ b/lib/x86/processor.h
> @@ -438,11 +438,26 @@ static inline int wrmsr_safe(u32 index, u64 val)
>   	return exception_vector();
>   }
>   
> +static inline int rdpmc_safe(u32 index, uint64_t *val)
> +{
> +	uint32_t a, d;
> +
> +	asm volatile (ASM_TRY("1f")
> +		      "rdpmc\n\t"
> +		      "1:"
> +		      : "=a"(a), "=d"(d) : "c"(index) : "memory");
> +	*val = (uint64_t)a | ((uint64_t)d << 32);
> +	return exception_vector();
> +}
> +
>   static inline uint64_t rdpmc(uint32_t index)
>   {
> -	uint32_t a, d;
> -	asm volatile ("rdpmc" : "=a"(a), "=d"(d) : "c"(index));
> -	return a | ((uint64_t)d << 32);
> +	uint64_t val;
> +	int vector = rdpmc_safe(index, &val);
> +
> +	assert_msg(!vector, "Unexpected %s on RDPMC(%d)",
> +		   exception_mnemonic(vector), index);
> +	return val;
>   }

Seems like this is causing the CI to fail:

https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/3339274319#L1260

I guess you have to use PRId32 here? Could you please send a patch?
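
The fix would presumably be a one-liner switching to the fixed-width
format macro from <inttypes.h>; a sketch, not an actual tested patch:

	assert_msg(!vector, "Unexpected %s on RDPMC(%" PRId32 ")",
		   exception_mnemonic(vector), index);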

  Thanks,
   Thomas




* Re: [kvm-unit-tests PATCH v5 11/27] x86/pmu: Update rdpmc testcase to cover #GP path
  2022-11-24 11:33   ` Thomas Huth
@ 2022-11-24 11:52     ` Like Xu
  0 siblings, 0 replies; 34+ messages in thread
From: Like Xu @ 2022-11-24 11:52 UTC (permalink / raw)
  To: Thomas Huth; +Cc: kvm, Sandipan Das, Sean Christopherson, Paolo Bonzini

On 24/11/2022 7:33 pm, Thomas Huth wrote:
> On 02/11/2022 23.50, Sean Christopherson wrote:
>> [...]
> 
> Seems like this is causing the CI to fail:
> 
> https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/3339274319#L1260

I only just noticed that KUT can be used to validate the TCG and HVF
accelerators on macOS.  I assume PMU counter functionality under TCG is
essentially a blank slate.

> 
> I guess you have to use PRId32 here? Could you please send a patch?
Sure, let me try it on the macOS.

> 
>   Thanks,
>    Thomas
> 
> 
> 

