kvm.vger.kernel.org archive mirror
* [kvm-unit-tests PATCH v3 00/13] x86/pmu: Test case optimization, fixes and additions
@ 2022-08-19 11:09 Like Xu
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 01/13] x86/pmu: Introduce __start_event() to drop all of the manual zeroing Like Xu
                   ` (12 more replies)
  0 siblings, 13 replies; 39+ messages in thread
From: Like Xu @ 2022-08-19 11:09 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm

v2 -> v3 Changelog:
- Add new testcases to cover Intel Arch PMU Version 1;
- Add new testcases to cover AMD PMU (Before Zen4);
- Fix multiple_many() in patch 2/5;
- Move check_emulated_instr() into check_counters();
- Fix some report_prefix issues;
- Add PDCM check before accessing PERF_CAP register;
- Fix testing Intel LBR on AMD;

v2: https://lore.kernel.org/kvm/20220816080909.90622-1-likexu@tencent.com/
v1 -> v2 Changelog:
- Introduce __start_event() and multiple_many() for readability; (Sean)
- Move PEBS testcases to this patch set for easier tracking;
- Create vPMU testcase group as more related tests are coming;

Like Xu (13):
  x86/pmu: Introduce __start_event() to drop all of the manual zeroing
  x86/pmu: Introduce multiple_{one, many}() to improve readability
  x86/pmu: Reset the expected count of the fixed counter 0 when i386
  x86/pmu: Add tests for Intel Processor Event Based Sampling (PEBS)
  x86: create pmu group for quick pmu-scope testing
  x86/pmu: Test emulation instructions on full-width counters
  x86/pmu: Pop up FW prefix to avoid out-of-context propagation
  x86/pmu: Add PDCM check before accessing PERF_CAP register
  x86/pmu: Report SKIP when testing Intel LBR on AMD platforms
  x86/pmu: Update testcases to cover Intel Arch PMU Version 1
  x86/pmu: Refine message when testing PMU on AMD platforms
  x86/pmu: Add assignment framework for Intel-specific HW resources
  x86/pmu: Update testcases to cover AMD PMU

 lib/x86/msr.h       |  18 ++
 lib/x86/processor.h |  32 ++-
 x86/Makefile.x86_64 |   1 +
 x86/pmu.c           | 304 ++++++++++++++++++---------
 x86/pmu_lbr.c       |   2 +-
 x86/pmu_pebs.c      | 486 ++++++++++++++++++++++++++++++++++++++++++++
 x86/unittests.cfg   |  10 +
 7 files changed, 758 insertions(+), 95 deletions(-)
 create mode 100644 x86/pmu_pebs.c

-- 
2.37.2


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [kvm-unit-tests PATCH v3 01/13] x86/pmu: Introduce __start_event() to drop all of the manual zeroing
  2022-08-19 11:09 [kvm-unit-tests PATCH v3 00/13] x86/pmu: Test case optimization, fixes and additions Like Xu
@ 2022-08-19 11:09 ` Like Xu
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 02/13] x86/pmu: Introduce multiple_{one, many}() to improve readability Like Xu
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 39+ messages in thread
From: Like Xu @ 2022-08-19 11:09 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm

From: Like Xu <likexu@tencent.com>

Most invocations of start_event() and measure() first set evt.count = 0.
Instead of forcing each caller to ensure the count is zeroed, zero the
count in start_event() and drop all of the manual zeroing.

Accumulating counts can be handled by reading the current count before
start_event(), and stuffing a high count to test an edge case can be
handled by an inner helper, __start_event().

For overflow, just open code measure() for that one-off case; requiring
callers to zero out a field in the most common cases isn't exactly
flexible.
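
As an illustration only (a minimal sketch assuming the existing
definitions in x86/pmu.c, not part of this patch), a caller that wants
to accumulate counts across runs can read back the previous total and
hand it to __start_event():

	pmu_counter_t cnt = {
		.ctr = gp_counter_base,
		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel,
	};

	start_event(&cnt);		/* first run starts from zero */
	loop();
	stop_event(&cnt);		/* cnt.count holds the first total */

	__start_event(&cnt, cnt.count);	/* second run resumes from that total */
	loop();
	stop_event(&cnt);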

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 x86/pmu.c | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index d59baf1..817b4d0 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -137,9 +137,9 @@ static void global_disable(pmu_counter_t *cnt)
 			~(1ull << cnt->idx));
 }
 
-
-static void start_event(pmu_counter_t *evt)
+static void __start_event(pmu_counter_t *evt, uint64_t count)
 {
+    evt->count = count;
     wrmsr(evt->ctr, evt->count);
     if (is_gp(evt))
 	    wrmsr(MSR_P6_EVNTSEL0 + event_to_global_idx(evt),
@@ -162,6 +162,11 @@ static void start_event(pmu_counter_t *evt)
     apic_write(APIC_LVTPC, PC_VECTOR);
 }
 
+static void start_event(pmu_counter_t *evt)
+{
+	__start_event(evt, 0);
+}
+
 static void stop_event(pmu_counter_t *evt)
 {
 	global_disable(evt);
@@ -186,6 +191,13 @@ static void measure(pmu_counter_t *evt, int count)
 		stop_event(&evt[i]);
 }
 
+static void __measure(pmu_counter_t *evt, uint64_t count)
+{
+	__start_event(evt, count);
+	loop();
+	stop_event(evt);
+}
+
 static bool verify_event(uint64_t count, struct pmu_event *e)
 {
 	// printf("%d <= %ld <= %d\n", e->min, count, e->max);
@@ -208,7 +220,6 @@ static void check_gp_counter(struct pmu_event *evt)
 	int i;
 
 	for (i = 0; i < nr_gp_counters; i++, cnt.ctr++) {
-		cnt.count = 0;
 		measure(&cnt, 1);
 		report(verify_event(cnt.count, evt), "%s-%d", evt->name, i);
 	}
@@ -235,7 +246,6 @@ static void check_fixed_counters(void)
 	int i;
 
 	for (i = 0; i < nr_fixed_counters; i++) {
-		cnt.count = 0;
 		cnt.ctr = fixed_events[i].unit_sel;
 		measure(&cnt, 1);
 		report(verify_event(cnt.count, &fixed_events[i]), "fixed-%d", i);
@@ -253,14 +263,12 @@ static void check_counters_many(void)
 		if (!pmu_gp_counter_is_available(i))
 			continue;
 
-		cnt[n].count = 0;
 		cnt[n].ctr = gp_counter_base + n;
 		cnt[n].config = EVNTSEL_OS | EVNTSEL_USR |
 			gp_events[i % ARRAY_SIZE(gp_events)].unit_sel;
 		n++;
 	}
 	for (i = 0; i < nr_fixed_counters; i++) {
-		cnt[n].count = 0;
 		cnt[n].ctr = fixed_events[i].unit_sel;
 		cnt[n].config = EVNTSEL_OS | EVNTSEL_USR;
 		n++;
@@ -283,9 +291,8 @@ static void check_counter_overflow(void)
 	pmu_counter_t cnt = {
 		.ctr = gp_counter_base,
 		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel /* instructions */,
-		.count = 0,
 	};
-	measure(&cnt, 1);
+	__measure(&cnt, 0);
 	count = cnt.count;
 
 	/* clear status before test */
@@ -311,7 +318,7 @@ static void check_counter_overflow(void)
 		else
 			cnt.config &= ~EVNTSEL_INT;
 		idx = event_to_global_idx(&cnt);
-		measure(&cnt, 1);
+		__measure(&cnt, cnt.count);
 		report(cnt.count == 1, "cntr-%d", i);
 		status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
 		report(status & (1ull << idx), "status-%d", i);
@@ -329,7 +336,6 @@ static void check_gp_counter_cmask(void)
 	pmu_counter_t cnt = {
 		.ctr = gp_counter_base,
 		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel /* instructions */,
-		.count = 0,
 	};
 	cnt.config |= (0x2 << EVNTSEL_CMASK_SHIFT);
 	measure(&cnt, 1);
@@ -415,7 +421,6 @@ static void check_running_counter_wrmsr(void)
 	pmu_counter_t evt = {
 		.ctr = gp_counter_base,
 		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel,
-		.count = 0,
 	};
 
 	report_prefix_push("running counter wrmsr");
@@ -430,7 +435,6 @@ static void check_running_counter_wrmsr(void)
 	wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL,
 	      rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
 
-	evt.count = 0;
 	start_event(&evt);
 
 	count = -1;
@@ -454,13 +458,11 @@ static void check_emulated_instr(void)
 		.ctr = MSR_IA32_PERFCTR0,
 		/* branch instructions */
 		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[5].unit_sel,
-		.count = 0,
 	};
 	pmu_counter_t instr_cnt = {
 		.ctr = MSR_IA32_PERFCTR0 + 1,
 		/* instructions */
 		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel,
-		.count = 0,
 	};
 	report_prefix_push("emulated instruction");
 
@@ -589,7 +591,6 @@ static void set_ref_cycle_expectations(void)
 	pmu_counter_t cnt = {
 		.ctr = MSR_IA32_PERFCTR0,
 		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[2].unit_sel,
-		.count = 0,
 	};
 	uint64_t tsc_delta;
 	uint64_t t0, t1, t2, t3;
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [kvm-unit-tests PATCH v3 02/13] x86/pmu: Introduce multiple_{one, many}() to improve readability
  2022-08-19 11:09 [kvm-unit-tests PATCH v3 00/13] x86/pmu: Test case optimization, fixes and additions Like Xu
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 01/13] x86/pmu: Introduce __start_event() to drop all of the manual zeroing Like Xu
@ 2022-08-19 11:09 ` Like Xu
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 03/13] x86/pmu: Reset the expected count of the fixed counter 0 when i386 Like Xu
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 39+ messages in thread
From: Like Xu @ 2022-08-19 11:09 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm

From: Like Xu <likexu@tencent.com>

The current measure() forces the common case to pass in unnecessary
information in order to give flexibility to a single use case. The new
helpers are just syntactic sugar, but they really do help readers: it
is not obvious that the "1" specifies the number of events, whereas
measure_many() and measure_one() are relatively self-explanatory.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 x86/pmu.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index 817b4d0..45ca2c6 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -181,7 +181,7 @@ static void stop_event(pmu_counter_t *evt)
 	evt->count = rdmsr(evt->ctr);
 }
 
-static void measure(pmu_counter_t *evt, int count)
+static void measure_many(pmu_counter_t *evt, int count)
 {
 	int i;
 	for (i = 0; i < count; i++)
@@ -191,6 +191,11 @@ static void measure(pmu_counter_t *evt, int count)
 		stop_event(&evt[i]);
 }
 
+static void measure_one(pmu_counter_t *evt)
+{
+	measure_many(evt, 1);
+}
+
 static void __measure(pmu_counter_t *evt, uint64_t count)
 {
 	__start_event(evt, count);
@@ -220,7 +225,7 @@ static void check_gp_counter(struct pmu_event *evt)
 	int i;
 
 	for (i = 0; i < nr_gp_counters; i++, cnt.ctr++) {
-		measure(&cnt, 1);
+		measure_one(&cnt);
 		report(verify_event(cnt.count, evt), "%s-%d", evt->name, i);
 	}
 }
@@ -247,7 +252,7 @@ static void check_fixed_counters(void)
 
 	for (i = 0; i < nr_fixed_counters; i++) {
 		cnt.ctr = fixed_events[i].unit_sel;
-		measure(&cnt, 1);
+		measure_one(&cnt);
 		report(verify_event(cnt.count, &fixed_events[i]), "fixed-%d", i);
 	}
 }
@@ -274,7 +279,7 @@ static void check_counters_many(void)
 		n++;
 	}
 
-	measure(cnt, n);
+	measure_many(cnt, n);
 
 	for (i = 0; i < n; i++)
 		if (!verify_counter(&cnt[i]))
@@ -338,7 +343,7 @@ static void check_gp_counter_cmask(void)
 		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel /* instructions */,
 	};
 	cnt.config |= (0x2 << EVNTSEL_CMASK_SHIFT);
-	measure(&cnt, 1);
+	measure_one(&cnt);
 	report(cnt.count < gp_events[1].min, "cmask");
 }
 
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [kvm-unit-tests PATCH v3 03/13] x86/pmu: Reset the expected count of the fixed counter 0 when i386
  2022-08-19 11:09 [kvm-unit-tests PATCH v3 00/13] x86/pmu: Test case optimization, fixes and additions Like Xu
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 01/13] x86/pmu: Introduce __start_event() to drop all of the manual zeroing Like Xu
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 02/13] x86/pmu: Introduce multiple_{one, many}() to improve readability Like Xu
@ 2022-08-19 11:09 ` Like Xu
  2022-10-05 22:18   ` Sean Christopherson
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 04/13] x86/pmu: Add tests for Intel Processor Event Based Sampling (PEBS) Like Xu
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Like Xu @ 2022-08-19 11:09 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm

From: Like Xu <likexu@tencent.com>

The pmu test check_counter_overflow() always fails with "./configure
--arch=i386". The cnt.count obtained from the later run of measure()
(based on fixed counter 0) is not equal to the expected value (based
on gp counter 0); the measured count is too high by exactly 2.

The two extra instructions come from the inline wrmsr() and rdmsr()
inside global_disable(). Specifically, for each MSR access the i386
code needs two assembly mov instructions before rdmsr/wrmsr (for fixed
counter 0 the manipulated GLOBAL_CTRL bit is bit 32), whereas only one
mov is needed on x86_64, or for gp counter 0 on i386.

Fix this by deriving the expected initial cnt.count for the fixed
counter 0 overflow case from a measurement on fixed counter 0 itself,
instead of always reusing the result from gp counter 0.
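
For reference, a rough sketch (not an exact copy of the lib/x86 code)
of where those movs come from: the MSR wrappers split the 64-bit value
into EDX:EAX, and global_disable() on fixed counter 0 manipulates bit
32 of GLOBAL_CTRL, so i386 has to materialize both 32-bit halves:

	static inline void wrmsr(u32 index, u64 val)
	{
		/* On i386, each 32-bit half needs its own mov. */
		u32 a = val, d = val >> 32;

		asm volatile ("wrmsr" : : "a"(a), "d"(d), "c"(index) : "memory");
	}

	static void global_disable(pmu_counter_t *cnt)
	{
		/* Fixed counter 0 maps to idx 32, i.e. the high dword. */
		wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, rdmsr(MSR_CORE_PERF_GLOBAL_CTRL) &
				~(1ull << cnt->idx));
	}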

Signed-off-by: Like Xu <likexu@tencent.com>
---
 x86/pmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/x86/pmu.c b/x86/pmu.c
index 45ca2c6..057fd4a 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -315,6 +315,9 @@ static void check_counter_overflow(void)
 
 		if (i == nr_gp_counters) {
 			cnt.ctr = fixed_events[0].unit_sel;
+			__measure(&cnt, 0);
+			count = cnt.count;
+			cnt.count = 1 - count;
 			cnt.count &= (1ull << pmu_fixed_counter_width()) - 1;
 		}
 
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [kvm-unit-tests PATCH v3 04/13] x86/pmu: Add tests for Intel Processor Event Based Sampling (PEBS)
  2022-08-19 11:09 [kvm-unit-tests PATCH v3 00/13] x86/pmu: Test case optimization, fixes and additions Like Xu
                   ` (2 preceding siblings ...)
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 03/13] x86/pmu: Reset the expected count of the fixed counter 0 when i386 Like Xu
@ 2022-08-19 11:09 ` Like Xu
  2022-10-05 22:21   ` Sean Christopherson
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 05/13] x86: create pmu group for quick pmu-scope testing Like Xu
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Like Xu @ 2022-08-19 11:09 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm

From: Like Xu <likexu@tencent.com>

This unit test exercises KVM's support for Processor Event Based
Sampling (PEBS), another PMU feature on Intel processors (starting
with Ice Lake Server).

If a bit in PEBS_ENABLE is set to 1, its corresponding counter will
write at least one PEBS record (including partial state of the vCPU
at the time of the triggering hardware event) to the guest memory on
counter overflow, and trigger an interrupt once the DS interrupt
threshold is reached. The format of a PEBS record can be configured
via another register (MSR_PEBS_DATA_CFG).

These tests cover most usage scenarios, including some specially
constructed ones that are not typical of the Linux PEBS driver. They
lower the bar for others to understand this feature and open the door
to more exploration of the KVM implementation and of the hardware
feature itself.
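
For a single GP counter, the enabling sequence exercised by
pebs_enable() below boils down to roughly the following (a condensed
sketch of the test code, not new functionality):

	struct debug_store *ds = (struct debug_store *)ds_bufer;

	/* Point the DS area at the PEBS buffer and set the irq threshold. */
	ds->pebs_index = ds->pebs_buffer_base = (unsigned long)pebs_buffer;
	ds->pebs_absolute_maximum = (unsigned long)pebs_buffer + PAGE_SIZE;
	ds->pebs_interrupt_threshold = ds->pebs_buffer_base +
		get_adaptive_pebs_record_size(pebs_data_cfg);

	/* With PEBS baseline (adaptive PEBS), pick the record layout. */
	if (pebs_has_baseline())
		wrmsr(MSR_PEBS_DATA_CFG, pebs_data_cfg);

	/* Arm the counter, then enable PEBS and the global control bit. */
	wrmsr(MSR_IA32_PERFCTR0, ctr_start_val);
	wrmsr(MSR_P6_EVNTSEL0, EVNTSEL_EN | EVNTSEL_OS | EVNTSEL_USR |
	      intel_arch_events[0] |
	      (pebs_has_baseline() ? ICL_EVENTSEL_ADAPTIVE : 0));
	wrmsr(MSR_IA32_DS_AREA, (unsigned long)ds_bufer);
	wrmsr(MSR_IA32_PEBS_ENABLE, BIT_ULL(0));
	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, BIT_ULL(0));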

Signed-off-by: Like Xu <likexu@tencent.com>
---
 lib/x86/msr.h       |   1 +
 x86/Makefile.x86_64 |   1 +
 x86/pmu_pebs.c      | 486 ++++++++++++++++++++++++++++++++++++++++++++
 x86/unittests.cfg   |   7 +
 4 files changed, 495 insertions(+)
 create mode 100644 x86/pmu_pebs.c

diff --git a/lib/x86/msr.h b/lib/x86/msr.h
index fa1c0c8..252e041 100644
--- a/lib/x86/msr.h
+++ b/lib/x86/msr.h
@@ -52,6 +52,7 @@
 #define MSR_IA32_MCG_CTL		0x0000017b
 
 #define MSR_IA32_PEBS_ENABLE		0x000003f1
+#define MSR_PEBS_DATA_CFG		0x000003f2
 #define MSR_IA32_DS_AREA		0x00000600
 #define MSR_IA32_PERF_CAPABILITIES	0x00000345
 
diff --git a/x86/Makefile.x86_64 b/x86/Makefile.x86_64
index 8f9463c..bd827fe 100644
--- a/x86/Makefile.x86_64
+++ b/x86/Makefile.x86_64
@@ -33,6 +33,7 @@ tests += $(TEST_DIR)/vmware_backdoors.$(exe)
 tests += $(TEST_DIR)/rdpru.$(exe)
 tests += $(TEST_DIR)/pks.$(exe)
 tests += $(TEST_DIR)/pmu_lbr.$(exe)
+tests += $(TEST_DIR)/pmu_pebs.$(exe)
 
 ifeq ($(CONFIG_EFI),y)
 tests += $(TEST_DIR)/amd_sev.$(exe)
diff --git a/x86/pmu_pebs.c b/x86/pmu_pebs.c
new file mode 100644
index 0000000..db4ecbf
--- /dev/null
+++ b/x86/pmu_pebs.c
@@ -0,0 +1,486 @@
+#include "x86/msr.h"
+#include "x86/processor.h"
+#include "x86/isr.h"
+#include "x86/apic.h"
+#include "x86/apic-defs.h"
+#include "x86/desc.h"
+#include "alloc.h"
+
+#include "vm.h"
+#include "types.h"
+#include "processor.h"
+#include "vmalloc.h"
+#include "alloc_page.h"
+
+#define PC_VECTOR	32
+
+#define	X86_FEATURE_PDCM		(CPUID(0x1, 0, ECX, 15))
+
+#define PERF_CAP_PEBS_FORMAT           0xf00
+#define PMU_CAP_FW_WRITES	(1ULL << 13)
+#define PMU_CAP_PEBS_BASELINE	(1ULL << 14)
+
+#define INTEL_PMC_IDX_FIXED				       32
+
+#define GLOBAL_STATUS_BUFFER_OVF_BIT		62
+#define GLOBAL_STATUS_BUFFER_OVF	BIT_ULL(GLOBAL_STATUS_BUFFER_OVF_BIT)
+
+#define EVNTSEL_USR_SHIFT       16
+#define EVNTSEL_OS_SHIFT        17
+#define EVNTSEL_EN_SHIF         22
+
+#define EVNTSEL_EN      (1 << EVNTSEL_EN_SHIF)
+#define EVNTSEL_USR     (1 << EVNTSEL_USR_SHIFT)
+#define EVNTSEL_OS      (1 << EVNTSEL_OS_SHIFT)
+
+#define PEBS_DATACFG_MEMINFO	BIT_ULL(0)
+#define PEBS_DATACFG_GP	BIT_ULL(1)
+#define PEBS_DATACFG_XMMS	BIT_ULL(2)
+#define PEBS_DATACFG_LBRS	BIT_ULL(3)
+
+#define ICL_EVENTSEL_ADAPTIVE				(1ULL << 34)
+#define PEBS_DATACFG_LBR_SHIFT	24
+#define MAX_NUM_LBR_ENTRY	32
+
+static u64 gp_counter_base = MSR_IA32_PERFCTR0;
+static unsigned int max_nr_gp_events;
+static unsigned long *ds_bufer;
+static unsigned long *pebs_buffer;
+static u64 ctr_start_val;
+static u64 perf_cap;
+
+struct debug_store {
+	u64	bts_buffer_base;
+	u64	bts_index;
+	u64	bts_absolute_maximum;
+	u64	bts_interrupt_threshold;
+	u64	pebs_buffer_base;
+	u64	pebs_index;
+	u64	pebs_absolute_maximum;
+	u64	pebs_interrupt_threshold;
+	u64	pebs_event_reset[64];
+};
+
+struct pebs_basic {
+	u64 format_size;
+	u64 ip;
+	u64 applicable_counters;
+	u64 tsc;
+};
+
+struct pebs_meminfo {
+	u64 address;
+	u64 aux;
+	u64 latency;
+	u64 tsx_tuning;
+};
+
+struct pebs_gprs {
+	u64 flags, ip, ax, cx, dx, bx, sp, bp, si, di;
+	u64 r8, r9, r10, r11, r12, r13, r14, r15;
+};
+
+struct pebs_xmm {
+	u64 xmm[16*2];	/* two entries for each register */
+};
+
+struct lbr_entry {
+	u64 from;
+	u64 to;
+	u64 info;
+};
+
+enum pmc_type {
+	GP = 0,
+	FIXED,
+};
+
+static uint32_t intel_arch_events[] = {
+	0x00c4, /* PERF_COUNT_HW_BRANCH_INSTRUCTIONS */
+	0x00c5, /* PERF_COUNT_HW_BRANCH_MISSES */
+	0x0300, /* PERF_COUNT_HW_REF_CPU_CYCLES */
+	0x003c, /* PERF_COUNT_HW_CPU_CYCLES */
+	0x00c0, /* PERF_COUNT_HW_INSTRUCTIONS */
+	0x013c, /* PERF_COUNT_HW_BUS_CYCLES */
+	0x4f2e, /* PERF_COUNT_HW_CACHE_REFERENCES */
+	0x412e, /* PERF_COUNT_HW_CACHE_MISSES */
+};
+
+static u64 pebs_data_cfgs[] = {
+	PEBS_DATACFG_MEMINFO,
+	PEBS_DATACFG_GP,
+	PEBS_DATACFG_XMMS,
+	PEBS_DATACFG_LBRS | ((MAX_NUM_LBR_ENTRY -1) << PEBS_DATACFG_LBR_SHIFT),
+};
+
+/* Iterating each counter value is a waste of time, pick a few typical values. */
+static u64 counter_start_values[] = {
+	/* if PEBS counter doesn't overflow at all */
+	0,
+	0xfffffffffff0,
+	/* normal counter overflow to have PEBS records */
+	0xfffffffffffe,
+	/* test whether emulated instructions should trigger PEBS */
+	0xffffffffffff,
+};
+
+static inline u8 pebs_format(void)
+{
+	return (perf_cap & PERF_CAP_PEBS_FORMAT ) >> 8;
+}
+
+static inline bool pebs_has_baseline(void)
+{
+	return perf_cap & PMU_CAP_PEBS_BASELINE;
+}
+
+static unsigned int get_adaptive_pebs_record_size(u64 pebs_data_cfg)
+{
+	unsigned int sz = sizeof(struct pebs_basic);
+
+	if (!pebs_has_baseline())
+		return sz;
+
+	if (pebs_data_cfg & PEBS_DATACFG_MEMINFO)
+		sz += sizeof(struct pebs_meminfo);
+	if (pebs_data_cfg & PEBS_DATACFG_GP)
+		sz += sizeof(struct pebs_gprs);
+	if (pebs_data_cfg & PEBS_DATACFG_XMMS)
+		sz += sizeof(struct pebs_xmm);
+	if (pebs_data_cfg & PEBS_DATACFG_LBRS)
+		sz += MAX_NUM_LBR_ENTRY * sizeof(struct lbr_entry);
+
+	return sz;
+}
+
+static void cnt_overflow(isr_regs_t *regs)
+{
+	apic_write(APIC_EOI, 0);
+}
+
+static inline void workload(void)
+{
+	asm volatile(
+		"mov $0x0, %%eax\n"
+		"cmp $0x0, %%eax\n"
+		"jne label2\n"
+		"jne label2\n"
+		"jne label2\n"
+		"jne label2\n"
+		"mov $0x0, %%eax\n"
+		"cmp $0x0, %%eax\n"
+		"jne label2\n"
+		"jne label2\n"
+		"jne label2\n"
+		"jne label2\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"label2:\n"
+		:
+		:
+		: "eax", "ebx", "ecx", "edx");
+}
+
+static inline void workload2(void)
+{
+	asm volatile(
+		"mov $0x0, %%eax\n"
+		"cmp $0x0, %%eax\n"
+		"jne label3\n"
+		"jne label3\n"
+		"jne label3\n"
+		"jne label3\n"
+		"mov $0x0, %%eax\n"
+		"cmp $0x0, %%eax\n"
+		"jne label3\n"
+		"jne label3\n"
+		"jne label3\n"
+		"jne label3\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"mov $0xa, %%eax\n"
+		"cpuid\n"
+		"label3:\n"
+		:
+		:
+		: "eax", "ebx", "ecx", "edx");
+}
+
+static void alloc_buffers(void)
+{
+	ds_bufer = alloc_page();
+	force_4k_page(ds_bufer);
+	memset(ds_bufer, 0x0, PAGE_SIZE);
+
+	pebs_buffer = alloc_page();
+	force_4k_page(pebs_buffer);
+	memset(pebs_buffer, 0x0, PAGE_SIZE);
+}
+
+static void free_buffers(void)
+{
+	if (ds_bufer)
+		free_page(ds_bufer);
+
+	if (pebs_buffer)
+		free_page(pebs_buffer);
+}
+
+static void pebs_enable(u64 bitmask, u64 pebs_data_cfg)
+{
+	static struct debug_store *ds;
+	u64 baseline_extra_ctrl, fixed_ctr_ctrl = 0;
+	unsigned int idx;
+
+	if (pebs_has_baseline())
+		wrmsr(MSR_PEBS_DATA_CFG, pebs_data_cfg);
+
+	ds = (struct debug_store *)ds_bufer;
+	ds->pebs_index = ds->pebs_buffer_base = (unsigned long)pebs_buffer;
+	ds->pebs_absolute_maximum = (unsigned long)pebs_buffer + PAGE_SIZE;
+	ds->pebs_interrupt_threshold = ds->pebs_buffer_base +
+		get_adaptive_pebs_record_size(pebs_data_cfg);
+
+	for (idx = 0; idx < pmu_nr_fixed_counters(); idx++) {
+		if (!(BIT_ULL(INTEL_PMC_IDX_FIXED + idx) & bitmask))
+			continue;
+		baseline_extra_ctrl = pebs_has_baseline() ?
+			(1ULL << (INTEL_PMC_IDX_FIXED + idx * 4)) : 0;
+		wrmsr(MSR_CORE_PERF_FIXED_CTR0 + idx, ctr_start_val);
+		fixed_ctr_ctrl |= (0xbULL << (idx * 4) | baseline_extra_ctrl);
+	}
+	if (fixed_ctr_ctrl)
+		wrmsr(MSR_CORE_PERF_FIXED_CTR_CTRL, fixed_ctr_ctrl);
+
+	for (idx = 0; idx < max_nr_gp_events; idx++) {
+		if (!(BIT_ULL(idx) & bitmask))
+			continue;
+		baseline_extra_ctrl = pebs_has_baseline() ?
+			ICL_EVENTSEL_ADAPTIVE : 0;
+		wrmsr(MSR_P6_EVNTSEL0 + idx,
+		      EVNTSEL_EN | EVNTSEL_OS | EVNTSEL_USR |
+		      intel_arch_events[idx] | baseline_extra_ctrl);
+		wrmsr(gp_counter_base + idx, ctr_start_val);
+	}
+
+	wrmsr(MSR_IA32_DS_AREA,  (unsigned long)ds_bufer);
+	wrmsr(MSR_IA32_PEBS_ENABLE, bitmask);
+	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, bitmask);
+}
+
+static void pmu_env_cleanup(void)
+{
+	unsigned int idx;
+
+	memset(ds_bufer, 0x0, PAGE_SIZE);
+	memset(pebs_buffer, 0x0, PAGE_SIZE);
+	wrmsr(MSR_IA32_PEBS_ENABLE, 0);
+	wrmsr(MSR_IA32_DS_AREA,  0);
+	if (pebs_has_baseline())
+		wrmsr(MSR_PEBS_DATA_CFG, 0);
+
+	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
+
+	wrmsr(MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
+	for (idx = 0; idx < pmu_nr_fixed_counters(); idx++) {
+		wrmsr(MSR_CORE_PERF_FIXED_CTR0 + idx, 0);
+	}
+
+	for (idx = 0; idx < pmu_nr_gp_counters(); idx++) {
+		wrmsr(MSR_P6_EVNTSEL0 + idx, 0);
+		wrmsr(MSR_IA32_PERFCTR0 + idx, 0);
+	}
+
+	wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL, rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
+}
+
+static inline void pebs_disable_1(void)
+{
+	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
+}
+
+static inline void pebs_disable_2(void)
+{
+	wrmsr(MSR_IA32_PEBS_ENABLE, 0);
+	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
+}
+
+static void pebs_disable(unsigned int idx)
+{
+	if (idx % 2) {
+		pebs_disable_1();
+	} else {
+		pebs_disable_2();
+	}
+}
+
+static void check_pebs_records(u64 bitmask, u64 pebs_data_cfg)
+{
+	struct pebs_basic *pebs_rec = (struct pebs_basic *)pebs_buffer;
+	struct debug_store *ds = (struct debug_store *)ds_bufer;
+	unsigned int pebs_record_size = get_adaptive_pebs_record_size(pebs_data_cfg);
+	unsigned int count = 0;
+	bool expected, pebs_idx_match, pebs_size_match, data_cfg_match;
+	void *vernier;
+
+	expected = (ds->pebs_index == ds->pebs_buffer_base) && !pebs_rec->format_size;
+	if (!(rdmsr(MSR_CORE_PERF_GLOBAL_STATUS) & GLOBAL_STATUS_BUFFER_OVF)) {
+		report(expected, "No OVF irq, no PEBS records.");
+		return;
+	}
+
+	if (expected) {
+		report(!expected, "An OVF irq, but no PEBS records.");
+		return;
+	}
+
+	expected = ds->pebs_index >= ds->pebs_interrupt_threshold;
+	vernier = (void *)pebs_buffer;
+	do {
+		pebs_rec = (struct pebs_basic *)vernier;
+		pebs_record_size = pebs_rec->format_size >> 48;
+		pebs_idx_match =
+			pebs_rec->applicable_counters & bitmask;
+		pebs_size_match =
+			pebs_record_size == get_adaptive_pebs_record_size(pebs_data_cfg);
+		data_cfg_match =
+			(pebs_rec->format_size & 0xffffffffffff) == pebs_data_cfg;
+		expected = pebs_idx_match && pebs_size_match && data_cfg_match;
+		report(expected,
+		       "PEBS record (written seq %d) is verified (including size, counters and cfg).", count);
+		vernier = vernier + pebs_record_size;
+		count++;
+	} while (expected && (void *)vernier < (void *)ds->pebs_index);
+
+	if (!expected) {
+		if (!pebs_idx_match)
+			printf("FAIL: The applicable_counters (0x%lx) doesn't match with pmc_bitmask (0x%lx).\n",
+			       pebs_rec->applicable_counters, bitmask);
+		if (!pebs_size_match)
+			printf("FAIL: The pebs_record_size (%d) doesn't match with MSR_PEBS_DATA_CFG (%d).\n",
+			       pebs_record_size, get_adaptive_pebs_record_size(pebs_data_cfg));
+		if (!data_cfg_match)
+			printf("FAIL: The pebs_data_cfg (0x%lx) doesn't match with MSR_PEBS_DATA_CFG (0x%lx).\n",
+			       pebs_rec->format_size & 0xffffffffffff, pebs_data_cfg);
+	}
+}
+
+static void check_one_counter(enum pmc_type type,
+			      unsigned int idx, u64 pebs_data_cfg)
+{
+	report_prefix_pushf("%s counter %d (0x%lx)",
+			    type == FIXED ? "Extended Fixed" : "GP", idx, ctr_start_val);
+	pmu_env_cleanup();
+	pebs_enable(BIT_ULL(type == FIXED ? INTEL_PMC_IDX_FIXED + idx : idx), pebs_data_cfg);
+	workload();
+	pebs_disable(idx);
+	check_pebs_records(BIT_ULL(type == FIXED ? INTEL_PMC_IDX_FIXED + idx : idx), pebs_data_cfg);
+	report_prefix_pop();
+}
+
+static void check_multiple_counters(u64 bitmask, u64 pebs_data_cfg)
+{
+	pmu_env_cleanup();
+	pebs_enable(bitmask, pebs_data_cfg);
+	workload2();
+	pebs_disable(0);
+	check_pebs_records(bitmask, pebs_data_cfg);
+}
+
+static void check_pebs_counters(u64 pebs_data_cfg)
+{
+	unsigned int idx;
+	u64 bitmask = 0;
+
+	for (idx = 0; idx < pmu_nr_fixed_counters(); idx++)
+		check_one_counter(FIXED, idx, pebs_data_cfg);
+
+	for (idx = 0; idx < max_nr_gp_events; idx++)
+		check_one_counter(GP, idx, pebs_data_cfg);
+
+	for (idx = 0; idx < pmu_nr_fixed_counters(); idx++)
+		bitmask |= BIT_ULL(INTEL_PMC_IDX_FIXED + idx);
+	for (idx = 0; idx < max_nr_gp_events; idx += 2)
+		bitmask |= BIT_ULL(idx);
+	report_prefix_pushf("Multiple (0x%lx)", bitmask);
+	check_multiple_counters(bitmask, pebs_data_cfg);
+	report_prefix_pop();
+}
+
+int main(int ac, char **av)
+{
+	unsigned int i, j;
+
+	setup_vm();
+
+	max_nr_gp_events = MIN(pmu_nr_gp_counters(), ARRAY_SIZE(intel_arch_events));
+
+	printf("PMU version: %d\n", pmu_version());
+	if (this_cpu_has(X86_FEATURE_PDCM))
+		perf_cap = rdmsr(MSR_IA32_PERF_CAPABILITIES);
+
+	if (perf_cap & PMU_CAP_FW_WRITES)
+		gp_counter_base = MSR_IA32_PMC0;
+
+	if (!is_intel()) {
+		report_skip("PEBS is only supported on Intel CPUs (ICX or later)");
+		return report_summary();
+	} else if (pmu_version() < 2) {
+		report_skip("Architectural PMU version is not high enough");
+		return report_summary();
+	} else if (!pebs_format()) {
+		report_skip("PEBS not enumerated in PERF_CAPABILITIES");
+		return report_summary();
+	} else if (rdmsr(MSR_IA32_MISC_ENABLE) & MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL) {
+		report_skip("PEBS unavailable according to MISC_ENABLE");
+		return report_summary();
+	}
+
+	printf("PEBS format: %d\n", pebs_format());
+	printf("PEBS GP counters: %d\n", pmu_nr_gp_counters());
+	printf("PEBS Fixed counters: %d\n", pmu_nr_fixed_counters());
+	printf("PEBS baseline (Adaptive PEBS): %d\n", pebs_has_baseline());
+
+	printf("Known reasons for missing PEBS records:\n");
+	printf("1. The selected event does not support PEBS;\n");
+	printf("2. From a core pmu perspective, the vCPU and pCPU models are not the same;\n");
+	printf("3. Guest counter has not yet overflowed or been cross-mapped by the host;\n");
+
+	handle_irq(PC_VECTOR, cnt_overflow);
+	alloc_buffers();
+
+	for (i = 0; i < ARRAY_SIZE(counter_start_values); i++) {
+		ctr_start_val = counter_start_values[i];
+		check_pebs_counters(0);
+		if (!pebs_has_baseline())
+			continue;
+
+		for (j = 0; j < ARRAY_SIZE(pebs_data_cfgs); j++) {
+			report_prefix_pushf("Adaptive (0x%lx)", pebs_data_cfgs[j]);
+			check_pebs_counters(pebs_data_cfgs[j]);
+			report_prefix_pop();
+		}
+	}
+
+	free_buffers();
+
+	return report_summary();
+}
diff --git a/x86/unittests.cfg b/x86/unittests.cfg
index ed65185..c5efb25 100644
--- a/x86/unittests.cfg
+++ b/x86/unittests.cfg
@@ -198,6 +198,13 @@ check = /sys/module/kvm/parameters/ignore_msrs=N
 check = /proc/sys/kernel/nmi_watchdog=0
 accel = kvm
 
+[pmu_pebs]
+arch = x86_64
+file = pmu_pebs.flat
+extra_params = -cpu host,migratable=no
+check = /proc/sys/kernel/nmi_watchdog=0
+accel = kvm
+
 [vmware_backdoors]
 file = vmware_backdoors.flat
 extra_params = -machine vmport=on -cpu max
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [kvm-unit-tests PATCH v3 05/13] x86: create pmu group for quick pmu-scope testing
  2022-08-19 11:09 [kvm-unit-tests PATCH v3 00/13] x86/pmu: Test case optimization, fixes and additions Like Xu
                   ` (3 preceding siblings ...)
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 04/13] x86/pmu: Add tests for Intel Processor Event Based Sampling (PEBS) Like Xu
@ 2022-08-19 11:09 ` Like Xu
  2022-10-05 22:23   ` Sean Christopherson
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 06/13] x86/pmu: Test emulation instructions on full-width counters Like Xu
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Like Xu @ 2022-08-19 11:09 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm

From: Like Xu <likexu@tencent.com>

Anyone can run "./run_tests.sh -g pmu" to cover all vPMU-related testcases.

Signed-off-by: Like Xu <likexu@tencent.com>
---
 x86/unittests.cfg | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/x86/unittests.cfg b/x86/unittests.cfg
index c5efb25..54f0437 100644
--- a/x86/unittests.cfg
+++ b/x86/unittests.cfg
@@ -189,6 +189,7 @@ file = pmu.flat
 extra_params = -cpu max
 check = /proc/sys/kernel/nmi_watchdog=0
 accel = kvm
+groups = pmu
 
 [pmu_lbr]
 arch = x86_64
@@ -197,6 +198,7 @@ extra_params = -cpu host,migratable=no
 check = /sys/module/kvm/parameters/ignore_msrs=N
 check = /proc/sys/kernel/nmi_watchdog=0
 accel = kvm
+groups = pmu
 
 [pmu_pebs]
 arch = x86_64
@@ -204,6 +206,7 @@ file = pmu_pebs.flat
 extra_params = -cpu host,migratable=no
 check = /proc/sys/kernel/nmi_watchdog=0
 accel = kvm
+groups = pmu
 
 [vmware_backdoors]
 file = vmware_backdoors.flat
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [kvm-unit-tests PATCH v3 06/13] x86/pmu: Test emulation instructions on full-width counters
  2022-08-19 11:09 [kvm-unit-tests PATCH v3 00/13] x86/pmu: Test case optimization, fixes and additions Like Xu
                   ` (4 preceding siblings ...)
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 05/13] x86: create pmu group for quick pmu-scope testing Like Xu
@ 2022-08-19 11:09 ` Like Xu
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 07/13] x86/pmu: Pop up FW prefix to avoid out-of-context propagation Like Xu
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 39+ messages in thread
From: Like Xu @ 2022-08-19 11:09 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm

From: Like Xu <likexu@tencent.com>

Move check_emulated_instr() into check_counters() so that full-width
counters can also be tested by the same test case.

Signed-off-by: Like Xu <likexu@tencent.com>
---
 x86/pmu.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index 057fd4a..9262f3c 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -530,6 +530,9 @@ static void check_emulated_instr(void)
 
 static void check_counters(void)
 {
+	if (is_fep_available())
+		check_emulated_instr();
+
 	check_gp_counters();
 	check_fixed_counters();
 	check_rdpmc();
@@ -664,9 +667,6 @@ int main(int ac, char **av)
 
 	apic_write(APIC_LVTPC, PC_VECTOR);
 
-	if (is_fep_available())
-		check_emulated_instr();
-
 	check_counters();
 
 	if (rdmsr(MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES) {
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [kvm-unit-tests PATCH v3 07/13] x86/pmu: Pop up FW prefix to avoid out-of-context propagation
  2022-08-19 11:09 [kvm-unit-tests PATCH v3 00/13] x86/pmu: Test case optimization, fixes and additions Like Xu
                   ` (5 preceding siblings ...)
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 06/13] x86/pmu: Test emulation instructions on full-width counters Like Xu
@ 2022-08-19 11:09 ` Like Xu
  2022-10-05 22:25   ` Sean Christopherson
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 08/13] x86/pmu: Add PDCM check before accessing PERF_CAP register Like Xu
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Like Xu @ 2022-08-19 11:09 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm

From: Like Xu <likexu@tencent.com>

Pop the "full-width writes" prefix once its checks complete; otherwise
the stale prefix would be propagated to any later test cases.

Signed-off-by: Like Xu <likexu@tencent.com>
---
 x86/pmu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/x86/pmu.c b/x86/pmu.c
index 9262f3c..4eb92d8 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -674,6 +674,7 @@ int main(int ac, char **av)
 		report_prefix_push("full-width writes");
 		check_counters();
 		check_gp_counters_write_width();
+		report_prefix_pop();
 	}
 
 	return report_summary();
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [kvm-unit-tests PATCH v3 08/13] x86/pmu: Add PDCM check before accessing PERF_CAP register
  2022-08-19 11:09 [kvm-unit-tests PATCH v3 00/13] x86/pmu: Test case optimization, fixes and additions Like Xu
                   ` (6 preceding siblings ...)
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 07/13] x86/pmu: Pop up FW prefix to avoid out-of-context propagation Like Xu
@ 2022-08-19 11:09 ` Like Xu
  2022-10-05 22:28   ` Sean Christopherson
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 09/13] x86/pmu: Report SKIP when testing Intel LBR on AMD platforms Like Xu
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Like Xu @ 2022-08-19 11:09 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm

From: Like Xu <likexu@tencent.com>

On virtual platforms without PDCM support (e.g. AMD), reading
MSR_IA32_PERF_CAPABILITIES would #GP; checking the PDCM bit first
makes that failure completely avoidable.

Signed-off-by: Like Xu <likexu@tencent.com>
---
 x86/pmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index 4eb92d8..25fafbe 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -669,7 +669,8 @@ int main(int ac, char **av)
 
 	check_counters();
 
-	if (rdmsr(MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES) {
+	if (this_cpu_has(X86_FEATURE_PDCM) &&
+	    (rdmsr(MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES)) {
 		gp_counter_base = MSR_IA32_PMC0;
 		report_prefix_push("full-width writes");
 		check_counters();
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [kvm-unit-tests PATCH v3 09/13] x86/pmu: Report SKIP when testing Intel LBR on AMD platforms
  2022-08-19 11:09 [kvm-unit-tests PATCH v3 00/13] x86/pmu: Test case optimization, fixes and additions Like Xu
                   ` (7 preceding siblings ...)
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 08/13] x86/pmu: Add PDCM check before accessing PERF_CAP register Like Xu
@ 2022-08-19 11:09 ` Like Xu
  2022-10-05 22:29   ` Sean Christopherson
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 10/13] x86/pmu: Update testcases to cover Intel Arch PMU Version 1 Like Xu
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Like Xu @ 2022-08-19 11:09 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm

From: Like Xu <likexu@tencent.com>

Running the Intel LBR test on AMD platforms should conclude with SKIP
rather than PASS; fix it by returning report_summary() after the skip
message.

Signed-off-by: Like Xu <likexu@tencent.com>
---
 x86/pmu_lbr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/x86/pmu_lbr.c b/x86/pmu_lbr.c
index 8dad1f1..4a9e24d 100644
--- a/x86/pmu_lbr.c
+++ b/x86/pmu_lbr.c
@@ -59,7 +59,7 @@ int main(int ac, char **av)
 
 	if (!is_intel()) {
 		report_skip("PMU_LBR test is for intel CPU's only");
-		return 0;
+		return report_summary();
 	}
 
 	if (!this_cpu_has_pmu()) {
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [kvm-unit-tests PATCH v3 10/13] x86/pmu: Update testcases to cover Intel Arch PMU Version 1
  2022-08-19 11:09 [kvm-unit-tests PATCH v3 00/13] x86/pmu: Test case optimization, fixes and additions Like Xu
                   ` (8 preceding siblings ...)
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 09/13] x86/pmu: Report SKIP when testing Intel LBR on AMD platforms Like Xu
@ 2022-08-19 11:09 ` Like Xu
  2022-09-06  7:15   ` Sandipan Das
                     ` (2 more replies)
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 11/13] x86/pmu: Refine message when testing PMU on AMD platforms Like Xu
                   ` (2 subsequent siblings)
  12 siblings, 3 replies; 39+ messages in thread
From: Like Xu @ 2022-08-19 11:09 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm

From: Like Xu <likexu@tencent.com>

For most unit tests, the basic framework and the use cases that test
any PMU counter do not require any changes, except for two things:

- No access to registers introduced only in PMU version 2 and above;
- Expanded tolerance when testing counter overflows,
  due to the loss of uniform control over the global_ctrl register.

Adding a few pmu_version() return value checks seamlessly supports
Intel Arch PMU Version 1, while opening the door for AMD PMU tests.

Signed-off-by: Like Xu <likexu@tencent.com>
---
 x86/pmu.c | 64 +++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 43 insertions(+), 21 deletions(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index 25fafbe..826472c 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -125,14 +125,19 @@ static struct pmu_event* get_counter_event(pmu_counter_t *cnt)
 
 static void global_enable(pmu_counter_t *cnt)
 {
-	cnt->idx = event_to_global_idx(cnt);
+	if (pmu_version() < 2)
+		return;
 
+	cnt->idx = event_to_global_idx(cnt);
 	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, rdmsr(MSR_CORE_PERF_GLOBAL_CTRL) |
 			(1ull << cnt->idx));
 }
 
 static void global_disable(pmu_counter_t *cnt)
 {
+	if (pmu_version() < 2)
+		return;
+
 	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, rdmsr(MSR_CORE_PERF_GLOBAL_CTRL) &
 			~(1ull << cnt->idx));
 }
@@ -301,7 +306,10 @@ static void check_counter_overflow(void)
 	count = cnt.count;
 
 	/* clear status before test */
-	wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL, rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
+	if (pmu_version() > 1) {
+		wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL,
+		      rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
+	}
 
 	report_prefix_push("overflow");
 
@@ -327,13 +335,21 @@ static void check_counter_overflow(void)
 			cnt.config &= ~EVNTSEL_INT;
 		idx = event_to_global_idx(&cnt);
 		__measure(&cnt, cnt.count);
-		report(cnt.count == 1, "cntr-%d", i);
+
+		report(check_irq() == (i % 2), "irq-%d", i);
+		if (pmu_version() > 1)
+			report(cnt.count == 1, "cntr-%d", i);
+		else
+			report(cnt.count < 4, "cntr-%d", i);
+
+		if (pmu_version() < 2)
+			continue;
+
 		status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
 		report(status & (1ull << idx), "status-%d", i);
 		wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL, status);
 		status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
 		report(!(status & (1ull << idx)), "status clear-%d", i);
-		report(check_irq() == (i % 2), "irq-%d", i);
 	}
 
 	report_prefix_pop();
@@ -440,8 +456,10 @@ static void check_running_counter_wrmsr(void)
 	report(evt.count < gp_events[1].min, "cntr");
 
 	/* clear status before overflow test */
-	wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL,
-	      rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
+	if (pmu_version() > 1) {
+		wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL,
+			rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
+	}
 
 	start_event(&evt);
 
@@ -453,8 +471,11 @@ static void check_running_counter_wrmsr(void)
 
 	loop();
 	stop_event(&evt);
-	status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
-	report(status & 1, "status");
+
+	if (pmu_version() > 1) {
+		status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
+		report(status & 1, "status");
+	}
 
 	report_prefix_pop();
 }
@@ -474,8 +495,10 @@ static void check_emulated_instr(void)
 	};
 	report_prefix_push("emulated instruction");
 
-	wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL,
-	      rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
+	if (pmu_version() > 1) {
+		wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL,
+			rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
+	}
 
 	start_event(&brnch_cnt);
 	start_event(&instr_cnt);
@@ -509,7 +532,8 @@ static void check_emulated_instr(void)
 		:
 		: "eax", "ebx", "ecx", "edx");
 
-	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
+	if (pmu_version() > 1)
+		wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
 
 	stop_event(&brnch_cnt);
 	stop_event(&instr_cnt);
@@ -520,10 +544,13 @@ static void check_emulated_instr(void)
 	       "instruction count");
 	report(brnch_cnt.count - brnch_start >= EXPECTED_BRNCH,
 	       "branch count");
-	// Additionally check that those counters overflowed properly.
-	status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
-	report(status & 1, "instruction counter overflow");
-	report(status & 2, "branch counter overflow");
+
+	if (pmu_version() > 1) {
+		// Additionally check that those counters overflowed properly.
+		status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
+		report(status & 1, "instruction counter overflow");
+		report(status & 2, "branch counter overflow");
+	}
 
 	report_prefix_pop();
 }
@@ -647,12 +674,7 @@ int main(int ac, char **av)
 	buf = malloc(N*64);
 
 	if (!pmu_version()) {
-		report_skip("No pmu is detected!");
-		return report_summary();
-	}
-
-	if (pmu_version() == 1) {
-		report_skip("PMU version 1 is not supported.");
+		report_skip("No Intel Arch PMU is detected!");
 		return report_summary();
 	}
 
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [kvm-unit-tests PATCH v3 11/13] x86/pmu: Refine message when testing PMU on AMD platforms
  2022-08-19 11:09 [kvm-unit-tests PATCH v3 00/13] x86/pmu: Test case optimization, fixes and additions Like Xu
                   ` (9 preceding siblings ...)
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 10/13] x86/pmu: Update testcases to cover Intel Arch PMU Version 1 Like Xu
@ 2022-08-19 11:09 ` Like Xu
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 12/13] x86/pmu: Add assignment framework for Intel-specific HW resources Like Xu
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 13/13] x86/pmu: Update testcases to cover AMD PMU Like Xu
  12 siblings, 0 replies; 39+ messages in thread
From: Like Xu @ 2022-08-19 11:09 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm

From: Like Xu <likexu@tencent.com>

Add an Intel PMU detection step with a vendor prefix,
and report SKIP naturally on unsupported AMD platforms.

Signed-off-by: Like Xu <likexu@tencent.com>
---
 x86/pmu.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index 826472c..b22f255 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -667,16 +667,35 @@ static void set_ref_cycle_expectations(void)
 	gp_events[2].max = (gp_events[2].max * cnt.count) / tsc_delta;
 }
 
+static bool detect_intel_pmu(void)
+{
+	if (!pmu_version()) {
+		report_skip("No Intel Arch PMU is detected!");
+		return false;
+	}
+
+	report_prefix_push("Intel");
+	return true;
+}
+
+static bool pmu_is_detected(void)
+{
+	if (!is_intel()) {
+		report_skip("AMD PMU is not supported.");
+		return false;
+	}
+
+	return detect_intel_pmu();
+}
+
 int main(int ac, char **av)
 {
 	setup_vm();
 	handle_irq(PC_VECTOR, cnt_overflow);
 	buf = malloc(N*64);
 
-	if (!pmu_version()) {
-		report_skip("No Intel Arch PMU is detected!");
+	if (!pmu_is_detected())
 		return report_summary();
-	}
 
 	set_ref_cycle_expectations();
 
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [kvm-unit-tests PATCH v3 12/13] x86/pmu: Add assignment framework for Intel-specific HW resources
  2022-08-19 11:09 [kvm-unit-tests PATCH v3 00/13] x86/pmu: Test case optimization, fixes and additions Like Xu
                   ` (10 preceding siblings ...)
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 11/13] x86/pmu: Refine message when testing PMU on AMD platforms Like Xu
@ 2022-08-19 11:09 ` Like Xu
  2022-09-06  7:19   ` Sandipan Das
  2022-10-05 22:44   ` Sean Christopherson
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 13/13] x86/pmu: Update testcases to cover AMD PMU Like Xu
  12 siblings, 2 replies; 39+ messages in thread
From: Like Xu @ 2022-08-19 11:09 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm

From: Like Xu <likexu@tencent.com>

This is a prerequisite for adding AMD PMU tests.

The AMD and Intel PMUs differ in their counter registers, event
selection registers, number of generic counters and generic hardware
events. Introduce a set of global variables so that different values
can be assigned on different platforms.
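
As a sketch of where this is heading (the AMD wiring only lands in the
next patch, and detect_amd_pmu() below is illustrative), an AMD
detection step would simply assign the same set of globals:

	static bool detect_amd_pmu(void)
	{
		nr_gp_counters = pmu_nr_gp_counters();
		gp_events_size = sizeof(amd_gp_events)/sizeof(amd_gp_events[0]);
		gp_events = (PMU_EVENTS_ARRAY_t *)amd_gp_events;
		gp_counter_base = MSR_F15H_PERF_CTR0;
		gp_select_base = MSR_F15H_PERF_CTL0;

		report_prefix_push("AMD");
		return true;
	}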

Signed-off-by: Like Xu <likexu@tencent.com>
---
 x86/pmu.c | 99 ++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 58 insertions(+), 41 deletions(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index b22f255..0706cb1 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -50,7 +50,7 @@ struct pmu_event {
 	uint32_t unit_sel;
 	int min;
 	int max;
-} gp_events[] = {
+} intel_gp_events[] = {
 	{"core cycles", 0x003c, 1*N, 50*N},
 	{"instructions", 0x00c0, 10*N, 10.2*N},
 	{"ref cycles", 0x013c, 1*N, 30*N},
@@ -65,7 +65,13 @@ struct pmu_event {
 };
 
 #define PMU_CAP_FW_WRITES	(1ULL << 13)
-static u64 gp_counter_base = MSR_IA32_PERFCTR0;
+static u32 gp_counter_base;
+static u32 gp_select_base;
+static unsigned int gp_events_size;
+static unsigned int nr_gp_counters;
+
+typedef struct pmu_event PMU_EVENTS_ARRAY_t[];
+static PMU_EVENTS_ARRAY_t *gp_events = NULL;
 
 char *buf;
 
@@ -114,9 +120,9 @@ static struct pmu_event* get_counter_event(pmu_counter_t *cnt)
 	if (is_gp(cnt)) {
 		int i;
 
-		for (i = 0; i < sizeof(gp_events)/sizeof(gp_events[0]); i++)
-			if (gp_events[i].unit_sel == (cnt->config & 0xffff))
-				return &gp_events[i];
+		for (i = 0; i < gp_events_size; i++)
+			if ((*gp_events)[i].unit_sel == (cnt->config & 0xffff))
+				return &(*gp_events)[i];
 	} else
 		return &fixed_events[cnt->ctr - MSR_CORE_PERF_FIXED_CTR0];
 
@@ -142,12 +148,22 @@ static void global_disable(pmu_counter_t *cnt)
 			~(1ull << cnt->idx));
 }
 
+static inline uint32_t get_gp_counter_msr(unsigned int i)
+{
+	return gp_counter_base + i;
+}
+
+static inline uint32_t get_gp_select_msr(unsigned int i)
+{
+	return gp_select_base + i;
+}
+
 static void __start_event(pmu_counter_t *evt, uint64_t count)
 {
     evt->count = count;
     wrmsr(evt->ctr, evt->count);
     if (is_gp(evt))
-	    wrmsr(MSR_P6_EVNTSEL0 + event_to_global_idx(evt),
+	    wrmsr(get_gp_select_msr(event_to_global_idx(evt)),
 			    evt->config | EVNTSEL_EN);
     else {
 	    uint32_t ctrl = rdmsr(MSR_CORE_PERF_FIXED_CTR_CTRL);
@@ -176,7 +192,7 @@ static void stop_event(pmu_counter_t *evt)
 {
 	global_disable(evt);
 	if (is_gp(evt))
-		wrmsr(MSR_P6_EVNTSEL0 + event_to_global_idx(evt),
+		wrmsr(get_gp_select_msr(event_to_global_idx(evt)),
 				evt->config & ~EVNTSEL_EN);
 	else {
 		uint32_t ctrl = rdmsr(MSR_CORE_PERF_FIXED_CTR_CTRL);
@@ -222,14 +238,13 @@ static bool verify_counter(pmu_counter_t *cnt)
 
 static void check_gp_counter(struct pmu_event *evt)
 {
-	int nr_gp_counters = pmu_nr_gp_counters();
 	pmu_counter_t cnt = {
-		.ctr = gp_counter_base,
 		.config = EVNTSEL_OS | EVNTSEL_USR | evt->unit_sel,
 	};
 	int i;
 
-	for (i = 0; i < nr_gp_counters; i++, cnt.ctr++) {
+	for (i = 0; i < nr_gp_counters; i++) {
+		cnt.ctr = get_gp_counter_msr(i);
 		measure_one(&cnt);
 		report(verify_event(cnt.count, evt), "%s-%d", evt->name, i);
 	}
@@ -239,12 +254,11 @@ static void check_gp_counters(void)
 {
 	int i;
 
-	for (i = 0; i < sizeof(gp_events)/sizeof(gp_events[0]); i++)
+	for (i = 0; i < gp_events_size; i++)
 		if (pmu_gp_counter_is_available(i))
-			check_gp_counter(&gp_events[i]);
+			check_gp_counter(&(*gp_events)[i]);
 		else
-			printf("GP event '%s' is disabled\n",
-					gp_events[i].name);
+			printf("GP event '%s' is disabled\n", (*gp_events)[i].name);
 }
 
 static void check_fixed_counters(void)
@@ -265,7 +279,6 @@ static void check_fixed_counters(void)
 static void check_counters_many(void)
 {
 	int nr_fixed_counters = pmu_nr_fixed_counters();
-	int nr_gp_counters = pmu_nr_gp_counters();
 	pmu_counter_t cnt[10];
 	int i, n;
 
@@ -273,9 +286,8 @@ static void check_counters_many(void)
 		if (!pmu_gp_counter_is_available(i))
 			continue;
 
-		cnt[n].ctr = gp_counter_base + n;
-		cnt[n].config = EVNTSEL_OS | EVNTSEL_USR |
-			gp_events[i % ARRAY_SIZE(gp_events)].unit_sel;
+		cnt[n].ctr = get_gp_counter_msr(n);
+		cnt[n].config = EVNTSEL_OS | EVNTSEL_USR | (*gp_events)[i % gp_events_size].unit_sel;
 		n++;
 	}
 	for (i = 0; i < nr_fixed_counters; i++) {
@@ -295,12 +307,11 @@ static void check_counters_many(void)
 
 static void check_counter_overflow(void)
 {
-	int nr_gp_counters = pmu_nr_gp_counters();
 	uint64_t count;
 	int i;
 	pmu_counter_t cnt = {
 		.ctr = gp_counter_base,
-		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel /* instructions */,
+		.config = EVNTSEL_OS | EVNTSEL_USR | (*gp_events)[1].unit_sel /* instructions */,
 	};
 	__measure(&cnt, 0);
 	count = cnt.count;
@@ -313,10 +324,11 @@ static void check_counter_overflow(void)
 
 	report_prefix_push("overflow");
 
-	for (i = 0; i < nr_gp_counters + 1; i++, cnt.ctr++) {
+	for (i = 0; i < nr_gp_counters + 1; i++) {
 		uint64_t status;
 		int idx;
 
+		cnt.ctr = get_gp_counter_msr(i);
 		cnt.count = 1 - count;
 		if (gp_counter_base == MSR_IA32_PMC0)
 			cnt.count &= (1ull << pmu_gp_counter_width()) - 1;
@@ -359,11 +371,11 @@ static void check_gp_counter_cmask(void)
 {
 	pmu_counter_t cnt = {
 		.ctr = gp_counter_base,
-		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel /* instructions */,
+		.config = EVNTSEL_OS | EVNTSEL_USR | (*gp_events)[1].unit_sel /* instructions */,
 	};
 	cnt.config |= (0x2 << EVNTSEL_CMASK_SHIFT);
 	measure_one(&cnt);
-	report(cnt.count < gp_events[1].min, "cmask");
+	report(cnt.count < (*gp_events)[1].min, "cmask");
 }
 
 static void do_rdpmc_fast(void *ptr)
@@ -383,7 +395,6 @@ static void check_rdpmc(void)
 	int fixed_counter_width = pmu_fixed_counter_width();
 	int nr_fixed_counters = pmu_nr_fixed_counters();
 	u8 gp_counter_width = pmu_gp_counter_width();
-	int nr_gp_counters = pmu_nr_gp_counters();
 	uint64_t val = 0xff0123456789ull;
 	bool exc;
 	int i;
@@ -393,7 +404,7 @@ static void check_rdpmc(void)
 	for (i = 0; i < nr_gp_counters; i++) {
 		uint64_t x;
 		pmu_counter_t cnt = {
-			.ctr = gp_counter_base + i,
+			.ctr = get_gp_counter_msr(i),
 			.idx = i
 		};
 
@@ -409,7 +420,7 @@ static void check_rdpmc(void)
 		/* Mask according to the number of supported bits */
 		x &= (1ull << gp_counter_width) - 1;
 
-		wrmsr(gp_counter_base + i, val);
+		wrmsr(get_gp_counter_msr(i), val);
 		report(rdpmc(i) == x, "cntr-%d", i);
 
 		exc = test_for_exception(GP_VECTOR, do_rdpmc_fast, &cnt);
@@ -444,7 +455,7 @@ static void check_running_counter_wrmsr(void)
 	uint64_t count;
 	pmu_counter_t evt = {
 		.ctr = gp_counter_base,
-		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel,
+		.config = EVNTSEL_OS | EVNTSEL_USR | (*gp_events)[1].unit_sel,
 	};
 
 	report_prefix_push("running counter wrmsr");
@@ -453,7 +464,7 @@ static void check_running_counter_wrmsr(void)
 	loop();
 	wrmsr(gp_counter_base, 0);
 	stop_event(&evt);
-	report(evt.count < gp_events[1].min, "cntr");
+	report(evt.count < (*gp_events)[1].min, "cntr");
 
 	/* clear status before overflow test */
 	if (pmu_version() > 1) {
@@ -483,15 +494,16 @@ static void check_running_counter_wrmsr(void)
 static void check_emulated_instr(void)
 {
 	uint64_t status, instr_start, brnch_start;
+	unsigned int branch_idx = 5;
 	pmu_counter_t brnch_cnt = {
-		.ctr = MSR_IA32_PERFCTR0,
+		.ctr = get_gp_counter_msr(0),
 		/* branch instructions */
-		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[5].unit_sel,
+		.config = EVNTSEL_OS | EVNTSEL_USR | (*gp_events)[branch_idx].unit_sel,
 	};
 	pmu_counter_t instr_cnt = {
-		.ctr = MSR_IA32_PERFCTR0 + 1,
+		.ctr = get_gp_counter_msr(1),
 		/* instructions */
-		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel,
+		.config = EVNTSEL_OS | EVNTSEL_USR | intel_gp_events[1].unit_sel,
 	};
 	report_prefix_push("emulated instruction");
 
@@ -505,8 +517,8 @@ static void check_emulated_instr(void)
 
 	brnch_start = -EXPECTED_BRNCH;
 	instr_start = -EXPECTED_INSTR;
-	wrmsr(MSR_IA32_PERFCTR0, brnch_start);
-	wrmsr(MSR_IA32_PERFCTR0 + 1, instr_start);
+	wrmsr(get_gp_counter_msr(0), brnch_start);
+	wrmsr(get_gp_counter_msr(1), instr_start);
 	// KVM_FEP is a magic prefix that forces emulation so
 	// 'KVM_FEP "jne label\n"' just counts as a single instruction.
 	asm volatile(
@@ -579,7 +591,6 @@ static void check_gp_counters_write_width(void)
 	u64 val_64 = 0xffffff0123456789ull;
 	u64 val_32 = val_64 & ((1ull << 32) - 1);
 	u64 val_max_width = val_64 & ((1ull << pmu_gp_counter_width()) - 1);
-	int nr_gp_counters = pmu_nr_gp_counters();
 	int i;
 
 	/*
@@ -627,14 +638,14 @@ static void check_gp_counters_write_width(void)
 static void set_ref_cycle_expectations(void)
 {
 	pmu_counter_t cnt = {
-		.ctr = MSR_IA32_PERFCTR0,
-		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[2].unit_sel,
+		.ctr = get_gp_counter_msr(0),
+		.config = EVNTSEL_OS | EVNTSEL_USR | intel_gp_events[2].unit_sel,
 	};
 	uint64_t tsc_delta;
 	uint64_t t0, t1, t2, t3;
 
 	/* Bit 2 enumerates the availability of reference cycles events. */
-	if (!pmu_nr_gp_counters() || !pmu_gp_counter_is_available(2))
+	if (!nr_gp_counters || !pmu_gp_counter_is_available(2))
 		return;
 
 	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
@@ -663,8 +674,8 @@ static void set_ref_cycle_expectations(void)
 	if (!tsc_delta)
 		return;
 
-	gp_events[2].min = (gp_events[2].min * cnt.count) / tsc_delta;
-	gp_events[2].max = (gp_events[2].max * cnt.count) / tsc_delta;
+	intel_gp_events[2].min = (intel_gp_events[2].min * cnt.count) / tsc_delta;
+	intel_gp_events[2].max = (intel_gp_events[2].max * cnt.count) / tsc_delta;
 }
 
 static bool detect_intel_pmu(void)
@@ -674,6 +685,12 @@ static bool detect_intel_pmu(void)
 		return false;
 	}
 
+	nr_gp_counters = pmu_nr_gp_counters();
+	gp_events_size = sizeof(intel_gp_events)/sizeof(intel_gp_events[0]);
+	gp_events = (PMU_EVENTS_ARRAY_t *)intel_gp_events;
+	gp_counter_base = MSR_IA32_PERFCTR0;
+	gp_select_base = MSR_P6_EVNTSEL0;
+
 	report_prefix_push("Intel");
 	return true;
 }
@@ -700,7 +717,7 @@ int main(int ac, char **av)
 	set_ref_cycle_expectations();
 
 	printf("PMU version:         %d\n", pmu_version());
-	printf("GP counters:         %d\n", pmu_nr_gp_counters());
+	printf("GP counters:         %d\n", nr_gp_counters);
 	printf("GP counter width:    %d\n", pmu_gp_counter_width());
 	printf("Mask length:         %d\n", pmu_gp_counter_mask_length());
 	printf("Fixed counters:      %d\n", pmu_nr_fixed_counters());
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [kvm-unit-tests PATCH v3 13/13] x86/pmu: Update testcases to cover AMD PMU
  2022-08-19 11:09 [kvm-unit-tests PATCH v3 00/13] x86/pmu: Test case optimization, fixes and additions Like Xu
                   ` (11 preceding siblings ...)
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 12/13] x86/pmu: Add assignment framework for Intel-specific HW resources Like Xu
@ 2022-08-19 11:09 ` Like Xu
  2022-09-06  7:32   ` Sandipan Das
  2022-10-05 22:48   ` Sean Christopherson
  12 siblings, 2 replies; 39+ messages in thread
From: Like Xu @ 2022-08-19 11:09 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm

From: Like Xu <likexu@tencent.com>

The AMD core PMU before Zen4 has no version number and no fixed
counters; it has a hard-coded number of generic counters and a fixed
counter width, and only hardware events common across AMD generations
(starting with K7) are added to the amd_gp_events[] table.

All of the above differences are instantiated at the detection step,
which also covers the K7 PMU registers, consistent with bare metal.

Signed-off-by: Like Xu <likexu@tencent.com>
---
 lib/x86/msr.h       | 17 ++++++++++++
 lib/x86/processor.h | 32 ++++++++++++++++++++--
 x86/pmu.c           | 67 ++++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 106 insertions(+), 10 deletions(-)

diff --git a/lib/x86/msr.h b/lib/x86/msr.h
index 252e041..5f16a58 100644
--- a/lib/x86/msr.h
+++ b/lib/x86/msr.h
@@ -130,6 +130,23 @@
 #define MSR_AMD64_IBSDCPHYSAD		0xc0011039
 #define MSR_AMD64_IBSCTL		0xc001103a
 
+/* Fam 15h MSRs */
+#define MSR_F15H_PERF_CTL              0xc0010200
+#define MSR_F15H_PERF_CTL0             MSR_F15H_PERF_CTL
+#define MSR_F15H_PERF_CTL1             (MSR_F15H_PERF_CTL + 2)
+#define MSR_F15H_PERF_CTL2             (MSR_F15H_PERF_CTL + 4)
+#define MSR_F15H_PERF_CTL3             (MSR_F15H_PERF_CTL + 6)
+#define MSR_F15H_PERF_CTL4             (MSR_F15H_PERF_CTL + 8)
+#define MSR_F15H_PERF_CTL5             (MSR_F15H_PERF_CTL + 10)
+
+#define MSR_F15H_PERF_CTR              0xc0010201
+#define MSR_F15H_PERF_CTR0             MSR_F15H_PERF_CTR
+#define MSR_F15H_PERF_CTR1             (MSR_F15H_PERF_CTR + 2)
+#define MSR_F15H_PERF_CTR2             (MSR_F15H_PERF_CTR + 4)
+#define MSR_F15H_PERF_CTR3             (MSR_F15H_PERF_CTR + 6)
+#define MSR_F15H_PERF_CTR4             (MSR_F15H_PERF_CTR + 8)
+#define MSR_F15H_PERF_CTR5             (MSR_F15H_PERF_CTR + 10)
+
 /* Fam 10h MSRs */
 #define MSR_FAM10H_MMIO_CONF_BASE	0xc0010058
 #define FAM10H_MMIO_CONF_ENABLE		(1<<0)
diff --git a/lib/x86/processor.h b/lib/x86/processor.h
index 0324220..10bca27 100644
--- a/lib/x86/processor.h
+++ b/lib/x86/processor.h
@@ -793,6 +793,9 @@ static inline void flush_tlb(void)
 
 static inline u8 pmu_version(void)
 {
+	if (!is_intel())
+		return 0;
+
 	return cpuid(10).a & 0xff;
 }
 
@@ -806,19 +809,39 @@ static inline bool this_cpu_has_perf_global_ctrl(void)
 	return pmu_version() > 1;
 }
 
+#define AMD64_NUM_COUNTERS                             4
+#define AMD64_NUM_COUNTERS_CORE                                6
+
+static inline bool has_amd_perfctr_core(void)
+{
+	return cpuid(0x80000001).c & BIT_ULL(23);
+}
+
 static inline u8 pmu_nr_gp_counters(void)
 {
-	return (cpuid(10).a >> 8) & 0xff;
+	if (is_intel()) {
+		return (cpuid(10).a >> 8) & 0xff;
+	} else if (!has_amd_perfctr_core()) {
+		return AMD64_NUM_COUNTERS;
+	}
+
+	return AMD64_NUM_COUNTERS_CORE;
 }
 
 static inline u8 pmu_gp_counter_width(void)
 {
-	return (cpuid(10).a >> 16) & 0xff;
+	if (is_intel())
+		return (cpuid(10).a >> 16) & 0xff;
+	else
+		return 48;
 }
 
 static inline u8 pmu_gp_counter_mask_length(void)
 {
-	return (cpuid(10).a >> 24) & 0xff;
+	if (is_intel())
+		return (cpuid(10).a >> 24) & 0xff;
+	else
+		return pmu_nr_gp_counters();
 }
 
 static inline u8 pmu_nr_fixed_counters(void)
@@ -843,6 +866,9 @@ static inline u8 pmu_fixed_counter_width(void)
 
 static inline bool pmu_gp_counter_is_available(int i)
 {
+	if (!is_intel())
+		return i < pmu_nr_gp_counters();
+
 	/* CPUID.0xA.EBX bit is '1 if they counter is NOT available. */
 	return !(cpuid(10).b & BIT(i));
 }
diff --git a/x86/pmu.c b/x86/pmu.c
index 0706cb1..b6ab10c 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -62,6 +62,11 @@ struct pmu_event {
 	{"fixed 1", MSR_CORE_PERF_FIXED_CTR0, 10*N, 10.2*N},
 	{"fixed 2", MSR_CORE_PERF_FIXED_CTR0 + 1, 1*N, 30*N},
 	{"fixed 3", MSR_CORE_PERF_FIXED_CTR0 + 2, 0.1*N, 30*N}
+}, amd_gp_events[] = {
+	{"core cycles", 0x0076, 1*N, 50*N},
+	{"instructions", 0x00c0, 10*N, 10.2*N},
+	{"branches", 0x00c2, 1*N, 1.1*N},
+	{"branch misses", 0x00c3, 0, 0.1*N},
 };
 
 #define PMU_CAP_FW_WRITES	(1ULL << 13)
@@ -105,14 +110,24 @@ static bool check_irq(void)
 
 static bool is_gp(pmu_counter_t *evt)
 {
+	if (!is_intel())
+		return true;
+
 	return evt->ctr < MSR_CORE_PERF_FIXED_CTR0 ||
 		evt->ctr >= MSR_IA32_PMC0;
 }
 
 static int event_to_global_idx(pmu_counter_t *cnt)
 {
-	return cnt->ctr - (is_gp(cnt) ? gp_counter_base :
-		(MSR_CORE_PERF_FIXED_CTR0 - FIXED_CNT_INDEX));
+	if (is_intel())
+		return cnt->ctr - (is_gp(cnt) ? gp_counter_base :
+			(MSR_CORE_PERF_FIXED_CTR0 - FIXED_CNT_INDEX));
+
+	if (gp_counter_base == MSR_F15H_PERF_CTR0) {
+		return (cnt->ctr - gp_counter_base) / 2;
+	} else {
+		return cnt->ctr - gp_counter_base;
+	}
 }
 
 static struct pmu_event* get_counter_event(pmu_counter_t *cnt)
@@ -150,11 +165,17 @@ static void global_disable(pmu_counter_t *cnt)
 
 static inline uint32_t get_gp_counter_msr(unsigned int i)
 {
+	if (gp_counter_base == MSR_F15H_PERF_CTR0)
+		return gp_counter_base + 2 * i;
+
 	return gp_counter_base + i;
 }
 
 static inline uint32_t get_gp_select_msr(unsigned int i)
 {
+	if (gp_select_base == MSR_F15H_PERF_CTL0)
+		return gp_select_base + 2 * i;
+
 	return gp_select_base + i;
 }
 
@@ -334,6 +355,9 @@ static void check_counter_overflow(void)
 			cnt.count &= (1ull << pmu_gp_counter_width()) - 1;
 
 		if (i == nr_gp_counters) {
+			if (!is_intel())
+				break;
+
 			cnt.ctr = fixed_events[0].unit_sel;
 			__measure(&cnt, 0);
 			count = cnt.count;
@@ -494,7 +518,7 @@ static void check_running_counter_wrmsr(void)
 static void check_emulated_instr(void)
 {
 	uint64_t status, instr_start, brnch_start;
-	unsigned int branch_idx = 5;
+	unsigned int branch_idx = is_intel() ? 5 : 2;
 	pmu_counter_t brnch_cnt = {
 		.ctr = get_gp_counter_msr(0),
 		/* branch instructions */
@@ -695,13 +719,35 @@ static bool detect_intel_pmu(void)
 	return true;
 }
 
-static bool pmu_is_detected(void)
+static void amd_switch_to_non_perfctr_core(void)
 {
-	if (!is_intel()) {
-		report_skip("AMD PMU is not supported.");
+	gp_counter_base = MSR_K7_PERFCTR0;
+	gp_select_base = MSR_K7_EVNTSEL0;
+	nr_gp_counters = AMD64_NUM_COUNTERS;
+}
+
+static bool detect_amd_pmu(void)
+{
+	if (!has_amd_perfctr_core()) {
+		report_skip("Missing perfctr_core, unsupported AMD PMU.");
 		return false;
 	}
 
+	nr_gp_counters = pmu_nr_gp_counters();
+	gp_events_size = sizeof(amd_gp_events)/sizeof(amd_gp_events[0]);
+	gp_events = (PMU_EVENTS_ARRAY_t *)amd_gp_events;
+	gp_counter_base = MSR_F15H_PERF_CTR0;
+	gp_select_base = MSR_F15H_PERF_CTL0;
+
+	report_prefix_push("AMD");
+	return true;
+}
+
+static bool pmu_is_detected(void)
+{
+	if (!is_intel())
+		return detect_amd_pmu();
+
 	return detect_intel_pmu();
 }
 
@@ -714,7 +760,8 @@ int main(int ac, char **av)
 	if (!pmu_is_detected())
 		return report_summary();
 
-	set_ref_cycle_expectations();
+	if (is_intel())
+		set_ref_cycle_expectations();
 
 	printf("PMU version:         %d\n", pmu_version());
 	printf("GP counters:         %d\n", nr_gp_counters);
@@ -736,5 +783,11 @@ int main(int ac, char **av)
 		report_prefix_pop();
 	}
 
+	if (!is_intel()) {
+		report_prefix_push("K7");
+		amd_switch_to_non_perfctr_core();
+		check_counters();
+	}
+
 	return report_summary();
 }
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 10/13] x86/pmu: Update testcases to cover Intel Arch PMU Version 1
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 10/13] x86/pmu: Update testcases to cover Intel Arch PMU Version 1 Like Xu
@ 2022-09-06  7:15   ` Sandipan Das
  2022-09-06 13:28     ` Like Xu
  2022-09-06  8:16   ` Sandipan Das
  2022-10-05 22:35   ` Sean Christopherson
  2 siblings, 1 reply; 39+ messages in thread
From: Sandipan Das @ 2022-09-06  7:15 UTC (permalink / raw)
  To: Like Xu; +Cc: kvm, Sean Christopherson, Paolo Bonzini

Hi Like,

On 8/19/2022 4:39 PM, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> For most unit tests, the basic framework and use cases which test
> any PMU counter do not require any changes, except for two things:
> 
> - No access to registers introduced only in PMU version 2 and above;
> - Expanded tolerance for testing counter overflows
>   due to the loss of uniform control of the gloabl_ctrl register
> 
> Adding some pmu_version() return value checks can seamlessly support
> Intel Arch PMU Version 1, while opening the door for AMD PMUs tests.
> 
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>  x86/pmu.c | 64 +++++++++++++++++++++++++++++++++++++------------------
>  1 file changed, 43 insertions(+), 21 deletions(-)
> 
> diff --git a/x86/pmu.c b/x86/pmu.c
> index 25fafbe..826472c 100644
> --- a/x86/pmu.c
> +++ b/x86/pmu.c
> [...]
> @@ -520,10 +544,13 @@ static void check_emulated_instr(void)
>  	       "instruction count");
>  	report(brnch_cnt.count - brnch_start >= EXPECTED_BRNCH,
>  	       "branch count");
> -	// Additionally check that those counters overflowed properly.
> -	status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
> -	report(status & 1, "instruction counter overflow");
> -	report(status & 2, "branch counter overflow");
> +
> +	if (pmu_version() > 1) {
> +		// Additionally check that those counters overflowed properly.
> +		status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
> +		report(status & 1, "instruction counter overflow");
> +		report(status & 2, "branch counter overflow");
> +	}
>  

This should use status bit 1 for instructions and bit 0 for branches.
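
I.e., roughly (just swapping the two bit masks relative to the hunk above):

		status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
		report(status & 2, "instruction counter overflow");
		report(status & 1, "branch counter overflow");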

>  	report_prefix_pop();
>  }
> [...]  

- Sandipan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 12/13] x86/pmu: Add assignment framework for Intel-specific HW resources
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 12/13] x86/pmu: Add assignment framework for Intel-specific HW resources Like Xu
@ 2022-09-06  7:19   ` Sandipan Das
  2022-10-05 22:44   ` Sean Christopherson
  1 sibling, 0 replies; 39+ messages in thread
From: Sandipan Das @ 2022-09-06  7:19 UTC (permalink / raw)
  To: Like Xu; +Cc: kvm, Sean Christopherson, Paolo Bonzini

On 8/19/2022 4:39 PM, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> This is a pre-requisite operation for adding AMD PMU tests.
> 
> AMD and Intel PMU are different in counter registers, event selection
> registers, number of generic registers and generic hardware events.
> By introducing a set of global variables to facilitate assigning
> different values on different platforms.
> 
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>  x86/pmu.c | 99 ++++++++++++++++++++++++++++++++-----------------------
>  1 file changed, 58 insertions(+), 41 deletions(-)
> 
> 
Reviewed-by: Sandipan Das <sandipan.das@amd.com>


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 13/13] x86/pmu: Update testcases to cover AMD PMU
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 13/13] x86/pmu: Update testcases to cover AMD PMU Like Xu
@ 2022-09-06  7:32   ` Sandipan Das
  2022-10-05 22:48   ` Sean Christopherson
  1 sibling, 0 replies; 39+ messages in thread
From: Sandipan Das @ 2022-09-06  7:32 UTC (permalink / raw)
  To: Like Xu; +Cc: kvm, Sean Christopherson, Paolo Bonzini

On 8/19/2022 4:39 PM, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> AMD core PMU before Zen4 did not have version numbers, there were
> no fixed counters, it had a hard-coded number of generic counters,
> bit-width, and only hardware events common across amd generations
> (starting with K7) were added to amd_gp_events[] table.
> 
> All above differences are instantiated at the detection step, and it
> also covers the K7 PMU registers, which is consistent with bare-metal.
> 
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>  lib/x86/msr.h       | 17 ++++++++++++
>  lib/x86/processor.h | 32 ++++++++++++++++++++--
>  x86/pmu.c           | 67 ++++++++++++++++++++++++++++++++++++++++-----
>  3 files changed, 106 insertions(+), 10 deletions(-)
> 
> [...]

Reviewed-by: Sandipan Das <sandipan.das@amd.com>


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 10/13] x86/pmu: Update testcases to cover Intel Arch PMU Version 1
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 10/13] x86/pmu: Update testcases to cover Intel Arch PMU Version 1 Like Xu
  2022-09-06  7:15   ` Sandipan Das
@ 2022-09-06  8:16   ` Sandipan Das
  2022-09-06 13:35     ` Like Xu
  2022-10-05 22:35   ` Sean Christopherson
  2 siblings, 1 reply; 39+ messages in thread
From: Sandipan Das @ 2022-09-06  8:16 UTC (permalink / raw)
  To: Like Xu; +Cc: kvm, Sean Christopherson, Paolo Bonzini

Hi Like,

On 8/19/2022 4:39 PM, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> For most unit tests, the basic framework and use cases which test
> any PMU counter do not require any changes, except for two things:
> 
> - No access to registers introduced only in PMU version 2 and above;
> - Expanded tolerance for testing counter overflows
>   due to the loss of uniform control of the gloabl_ctrl register
> 
> Adding some pmu_version() return value checks can seamlessly support
> Intel Arch PMU Version 1, while opening the door for AMD PMUs tests.
> 
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>  x86/pmu.c | 64 +++++++++++++++++++++++++++++++++++++------------------
>  1 file changed, 43 insertions(+), 21 deletions(-)
> 
> [...] 
> @@ -327,13 +335,21 @@ static void check_counter_overflow(void)
>  			cnt.config &= ~EVNTSEL_INT;
>  		idx = event_to_global_idx(&cnt);
>  		__measure(&cnt, cnt.count);
> -		report(cnt.count == 1, "cntr-%d", i);
> +
> +		report(check_irq() == (i % 2), "irq-%d", i);
> +		if (pmu_version() > 1)
> +			report(cnt.count == 1, "cntr-%d", i);
> +		else
> +			report(cnt.count < 4, "cntr-%d", i);
> +
> [...]

Sorry I missed this in the previous response. With an upper bound of
4, I see this test failing sometimes for at least one of the six
counters (with NMI watchdog disabled on the host) on a Milan (Zen 3)
system. Increasing it further does reduce the probability but I still
see failures. Do you see the same behaviour on systems with Zen 3 and
older processors?

- Sandipan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 10/13] x86/pmu: Update testcases to cover Intel Arch PMU Version 1
  2022-09-06  7:15   ` Sandipan Das
@ 2022-09-06 13:28     ` Like Xu
  0 siblings, 0 replies; 39+ messages in thread
From: Like Xu @ 2022-09-06 13:28 UTC (permalink / raw)
  To: Sandipan Das; +Cc: kvm, Sean Christopherson, Paolo Bonzini

On 6/9/2022 3:15 pm, Sandipan Das wrote:
> Hi Like,
> 
> On 8/19/2022 4:39 PM, Like Xu wrote:
>> From: Like Xu <likexu@tencent.com>
>>
>> For most unit tests, the basic framework and use cases which test
>> any PMU counter do not require any changes, except for two things:
>>
>> - No access to registers introduced only in PMU version 2 and above;
>> - Expanded tolerance for testing counter overflows
>>    due to the loss of uniform control of the gloabl_ctrl register
>>
>> Adding some pmu_version() return value checks can seamlessly support
>> Intel Arch PMU Version 1, while opening the door for AMD PMUs tests.
>>
>> Signed-off-by: Like Xu <likexu@tencent.com>
>> ---
>>   x86/pmu.c | 64 +++++++++++++++++++++++++++++++++++++------------------
>>   1 file changed, 43 insertions(+), 21 deletions(-)
>>
>> diff --git a/x86/pmu.c b/x86/pmu.c
>> index 25fafbe..826472c 100644
>> --- a/x86/pmu.c
>> +++ b/x86/pmu.c
>> [...]
>> @@ -520,10 +544,13 @@ static void check_emulated_instr(void)
>>   	       "instruction count");
>>   	report(brnch_cnt.count - brnch_start >= EXPECTED_BRNCH,
>>   	       "branch count");
>> -	// Additionally check that those counters overflowed properly.
>> -	status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
>> -	report(status & 1, "instruction counter overflow");
>> -	report(status & 2, "branch counter overflow");
>> +
>> +	if (pmu_version() > 1) {
>> +		// Additionally check that those counters overflowed properly.
>> +		status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
>> +		report(status & 1, "instruction counter overflow");
>> +		report(status & 2, "branch counter overflow");
>> +	}
>>   
> 
> This should use status bit 1 for instructions and bit 0 for branches.

Yes, this misleading statement stems from commit
20cf914 ("x86/pmu: Test PMU virtualization on emulated instructions").

I will fix it as part of this patch set. Thanks.

> 
>>   	report_prefix_pop();
>>   }
>> [...]
> 
> - Sandipan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 10/13] x86/pmu: Update testcases to cover Intel Arch PMU Version 1
  2022-09-06  8:16   ` Sandipan Das
@ 2022-09-06 13:35     ` Like Xu
  2022-09-08  8:23       ` Sandipan Das
  0 siblings, 1 reply; 39+ messages in thread
From: Like Xu @ 2022-09-06 13:35 UTC (permalink / raw)
  To: Sandipan Das; +Cc: kvm, Sean Christopherson, Paolo Bonzini

On 6/9/2022 4:16 pm, Sandipan Das wrote:
> Hi Like,
> 
> On 8/19/2022 4:39 PM, Like Xu wrote:
>> From: Like Xu <likexu@tencent.com>
>>
>> For most unit tests, the basic framework and use cases which test
>> any PMU counter do not require any changes, except for two things:
>>
>> - No access to registers introduced only in PMU version 2 and above;
>> - Expanded tolerance for testing counter overflows
>>    due to the loss of uniform control of the gloabl_ctrl register
>>
>> Adding some pmu_version() return value checks can seamlessly support
>> Intel Arch PMU Version 1, while opening the door for AMD PMUs tests.
>>
>> Signed-off-by: Like Xu <likexu@tencent.com>
>> ---
>>   x86/pmu.c | 64 +++++++++++++++++++++++++++++++++++++------------------
>>   1 file changed, 43 insertions(+), 21 deletions(-)
>>
>> [...]
>> @@ -327,13 +335,21 @@ static void check_counter_overflow(void)
>>   			cnt.config &= ~EVNTSEL_INT;
>>   		idx = event_to_global_idx(&cnt);
>>   		__measure(&cnt, cnt.count);
>> -		report(cnt.count == 1, "cntr-%d", i);
>> +
>> +		report(check_irq() == (i % 2), "irq-%d", i);
>> +		if (pmu_version() > 1)
>> +			report(cnt.count == 1, "cntr-%d", i);
>> +		else
>> +			report(cnt.count < 4, "cntr-%d", i);
>> +
>> [...]
> 
> Sorry I missed this in the previous response. With an upper bound of
> 4, I see this test failing some times for at least one of the six
> counters (with NMI watchdog disabled on the host) on a Milan (Zen 3)
> system. Increasing it further does reduce the probability but I still
> see failures. Do you see the same behaviour on systems with Zen 3 and
> older processors?

A hundred runs on my machine did not report a failure.

But I'm not surprised by this, because some AMD platforms do
have hardware PMU errata which require BIOS or ucode fixes.

Please help find the right upper bound for all of your available AMD boxes.

What makes me most nervous is that when AMD's core hardware events are run
repeatedly against the same workload, their count results are erratic.

You may check is_the_count_reproducible() in the test case:
[1]https://lore.kernel.org/kvm/20220905123946.95223-7-likexu@tencent.com/

> 
> - Sandipan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 10/13] x86/pmu: Update testcases to cover Intel Arch PMU Version 1
  2022-09-06 13:35     ` Like Xu
@ 2022-09-08  8:23       ` Sandipan Das
  2022-09-19  7:09         ` Like Xu
  0 siblings, 1 reply; 39+ messages in thread
From: Sandipan Das @ 2022-09-08  8:23 UTC (permalink / raw)
  To: Like Xu; +Cc: kvm, Sean Christopherson, Paolo Bonzini

On 9/6/2022 7:05 PM, Like Xu wrote:
> On 6/9/2022 4:16 pm, Sandipan Das wrote:
>> Hi Like,
>>
>> On 8/19/2022 4:39 PM, Like Xu wrote:
>>> From: Like Xu <likexu@tencent.com>
>>>
>>> For most unit tests, the basic framework and use cases which test
>>> any PMU counter do not require any changes, except for two things:
>>>
>>> - No access to registers introduced only in PMU version 2 and above;
>>> - Expanded tolerance for testing counter overflows
>>>    due to the loss of uniform control of the gloabl_ctrl register
>>>
>>> Adding some pmu_version() return value checks can seamlessly support
>>> Intel Arch PMU Version 1, while opening the door for AMD PMUs tests.
>>>
>>> Signed-off-by: Like Xu <likexu@tencent.com>
>>> ---
>>>   x86/pmu.c | 64 +++++++++++++++++++++++++++++++++++++------------------
>>>   1 file changed, 43 insertions(+), 21 deletions(-)
>>>
>>> [...]
>>> @@ -327,13 +335,21 @@ static void check_counter_overflow(void)
>>>               cnt.config &= ~EVNTSEL_INT;
>>>           idx = event_to_global_idx(&cnt);
>>>           __measure(&cnt, cnt.count);
>>> -        report(cnt.count == 1, "cntr-%d", i);
>>> +
>>> +        report(check_irq() == (i % 2), "irq-%d", i);
>>> +        if (pmu_version() > 1)
>>> +            report(cnt.count == 1, "cntr-%d", i);
>>> +        else
>>> +            report(cnt.count < 4, "cntr-%d", i);
>>> +
>>> [...]
>>
>> Sorry I missed this in the previous response. With an upper bound of
>> 4, I see this test failing some times for at least one of the six
>> counters (with NMI watchdog disabled on the host) on a Milan (Zen 3)
>> system. Increasing it further does reduce the probability but I still
>> see failures. Do you see the same behaviour on systems with Zen 3 and
>> older processors?
> 
> A hundred runs on my machine did not report a failure.
> 

Was this on a Zen 4 system?

> But I'm not surprised by this, because some AMD platforms do
> have hw PMU errata which requires bios or ucode fixes.
> 
> Please help find the right upper bound for all your available AMD boxes.
> 

Even after updating the microcode, the tests failed just as often in an
overnight loop. However, upon closer inspection, the reason for failure
was different. The variance is well within the bounds now but sometimes,
is_the_count_reproducible() is true. Since this selects the original
verification criteria (cnt.count == 1), the tests fail.

> What makes me most nervous is that AMD's core hardware events run
> repeatedly against the same workload, and their count results are erratic.
> 

With that in mind, should we consider having the following change?

diff --git a/x86/pmu.c b/x86/pmu.c
index bb16b3c..39979b8 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -352,7 +352,7 @@ static void check_counter_overflow(void)
                .ctr = gp_counter_base,
                .config = EVNTSEL_OS | EVNTSEL_USR | (*gp_events)[1].unit_sel /* instructions */,
        };
-       bool precise_event = is_the_count_reproducible(&cnt);
+       bool precise_event = is_intel() ? is_the_count_reproducible(&cnt) : false;

        __measure(&cnt, 0);
        count = cnt.count;

With this, the tests always pass. I will run another overnight loop and
report back if I see any errors.

> You may check is_the_count_reproducible() in the test case:
> [1]https://lore.kernel.org/kvm/20220905123946.95223-7-likexu@tencent.com/

On Zen 4 systems, this is always false and the overflow tests always
pass irrespective of whether PerfMonV2 is enabled for the guest or not.

- Sandipan

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 10/13] x86/pmu: Update testcases to cover Intel Arch PMU Version 1
  2022-09-08  8:23       ` Sandipan Das
@ 2022-09-19  7:09         ` Like Xu
  2022-10-21  7:32           ` Like Xu
  0 siblings, 1 reply; 39+ messages in thread
From: Like Xu @ 2022-09-19  7:09 UTC (permalink / raw)
  To: Sandipan Das; +Cc: kvm, Sean Christopherson, Paolo Bonzini, Jim Mattson

On 8/9/2022 4:23 pm, Sandipan Das wrote:
> On 9/6/2022 7:05 PM, Like Xu wrote:
>> On 6/9/2022 4:16 pm, Sandipan Das wrote:
>>> Hi Like,
>>>
>>> On 8/19/2022 4:39 PM, Like Xu wrote:
>>>> From: Like Xu <likexu@tencent.com>
>>>>
>>>> For most unit tests, the basic framework and use cases which test
>>>> any PMU counter do not require any changes, except for two things:
>>>>
>>>> - No access to registers introduced only in PMU version 2 and above;
>>>> - Expanded tolerance for testing counter overflows
>>>>     due to the loss of uniform control of the gloabl_ctrl register
>>>>
>>>> Adding some pmu_version() return value checks can seamlessly support
>>>> Intel Arch PMU Version 1, while opening the door for AMD PMUs tests.
>>>>
>>>> Signed-off-by: Like Xu <likexu@tencent.com>
>>>> ---
>>>>    x86/pmu.c | 64 +++++++++++++++++++++++++++++++++++++------------------
>>>>    1 file changed, 43 insertions(+), 21 deletions(-)
>>>>
>>>> [...]
>>>> @@ -327,13 +335,21 @@ static void check_counter_overflow(void)
>>>>                cnt.config &= ~EVNTSEL_INT;
>>>>            idx = event_to_global_idx(&cnt);
>>>>            __measure(&cnt, cnt.count);
>>>> -        report(cnt.count == 1, "cntr-%d", i);
>>>> +
>>>> +        report(check_irq() == (i % 2), "irq-%d", i);
>>>> +        if (pmu_version() > 1)
>>>> +            report(cnt.count == 1, "cntr-%d", i);
>>>> +        else
>>>> +            report(cnt.count < 4, "cntr-%d", i);
>>>> +
>>>> [...]
>>>
>>> Sorry I missed this in the previous response. With an upper bound of
>>> 4, I see this test failing some times for at least one of the six
>>> counters (with NMI watchdog disabled on the host) on a Milan (Zen 3)
>>> system. Increasing it further does reduce the probability but I still
>>> see failures. Do you see the same behaviour on systems with Zen 3 and
>>> older processors?
>>
>> A hundred runs on my machine did not report a failure.
>>
> 
> Was this on a Zen 4 system?
> 
>> But I'm not surprised by this, because some AMD platforms do
>> have hw PMU errata which requires bios or ucode fixes.
>>
>> Please help find the right upper bound for all your available AMD boxes.
>>
> 
> Even after updating the microcode, the tests failed just as often in an
> overnight loop. However, upon closer inspection, the reason for failure
> was different. The variance is well within the bounds now but sometimes,
> is_the_count_reproducible() is true. Since this selects the original
> verification criteria (cnt.count == 1), the tests fail.
> 
>> What makes me most nervous is that AMD's core hardware events run
>> repeatedly against the same workload, and their count results are erratic.
>>
> 
> With that in mind, should we consider having the following change?
> 
> diff --git a/x86/pmu.c b/x86/pmu.c
> index bb16b3c..39979b8 100644
> --- a/x86/pmu.c
> +++ b/x86/pmu.c
> @@ -352,7 +352,7 @@ static void check_counter_overflow(void)
>                  .ctr = gp_counter_base,
>                  .config = EVNTSEL_OS | EVNTSEL_USR | (*gp_events)[1].unit_sel /* instructions */,
>          };
> -       bool precise_event = is_the_count_reproducible(&cnt);
> +       bool precise_event = is_intel() ? is_the_count_reproducible(&cnt) : false;
> 
>          __measure(&cnt, 0);
>          count = cnt.count;
> 
> With this, the tests always pass. I will run another overnight loop and
> report back if I see any errors.
> 
>> You may check is_the_count_reproducible() in the test case:
>> [1]https://lore.kernel.org/kvm/20220905123946.95223-7-likexu@tencent.com/
> 
> On Zen 4 systems, this is always false and the overflow tests always
> pass irrespective of whether PerfMonV2 is enabled for the guest or not.
> 
> - Sandipan

I could change it to:

		if (is_intel())
			report(cnt.count == 1, "cntr-%d", i);
		else
			report(cnt.count < 4, "cntr-%d", i);

but this does not explain the difference. That is, for the same workload:

if a retired hw event like "PMCx0C0 [Retired Instructions] (ExRetInstr)" is
configured, it is expected to count "the number of instructions retired";
the value depends only on the workload and should remain the same over
multiple measurements,

but there are two hardware counters, one AMD and one Intel, both reset to an
identical value (like "cnt.count = 1 - count"), and when they overflow, the
Intel counter can stay exactly at 1, while the AMD counter cannot.

I know there are underlying hardware micro-arch implementation differences
here, but what AMD is doing violates the semantics of "retired".

Is this behavior normal by design?
I'm not sure what I'm missing; this behavior is reinforced on Zen 4 as you said.
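
For reference, a minimal sketch of the overflow arming sequence under
discussion, simplified from check_counter_overflow() in x86/pmu.c (the
comments are mine):

	/* Learn how many events this workload generates on the counter. */
	__measure(&cnt, 0);
	count = cnt.count;

	/* Arm the counter so that one run wraps it around to exactly 1. */
	cnt.count = 1 - count;
	__measure(&cnt, cnt.count);

	/* Intel reads back cnt.count == 1 here; AMD does not reliably. */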

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 03/13] x86/pmu: Reset the expected count of the fixed counter 0 when i386
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 03/13] x86/pmu: Reset the expected count of the fixed counter 0 when i386 Like Xu
@ 2022-10-05 22:18   ` Sean Christopherson
  2022-10-17  7:30     ` Like Xu
  0 siblings, 1 reply; 39+ messages in thread
From: Sean Christopherson @ 2022-10-05 22:18 UTC (permalink / raw)
  To: Like Xu; +Cc: Paolo Bonzini, kvm

On Fri, Aug 19, 2022, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> The pmu test check_counter_overflow() always fails with the "./configure
> --arch=i386".

Explicitly state that the failures are with 32-bit binaries.  E.g. I can and do
run KUT in 32-bit VMs, which doesn't require the explicit --arch=i386.

> The cnt.count obtained from the latter run of measure()
> (based on fixed counter 0) is not equal to the expected value (based
> on gp counter 0) and there is a positive error with a value of 2.
> 
> The two extra instructions come from inline wrmsr() and inline rdmsr()
> inside the global_disable() binary code block. Specifically, for each msr
> access, the i386 code will have two assembly mov instructions before
> rdmsr/wrmsr (mark it for fixed counter 0, bit 32), but only one assembly
> mov is needed for x86_64 and gp counter 0 on i386.
> 
> Fix the expected init cnt.count for fixed counter 0 overflow based on
> the same fixed counter 0, not always using gp counter 0.

You lost me here.  I totally understand the problem, but I don't understand the
fix.

> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>  x86/pmu.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/x86/pmu.c b/x86/pmu.c
> index 45ca2c6..057fd4a 100644
> --- a/x86/pmu.c
> +++ b/x86/pmu.c
> @@ -315,6 +315,9 @@ static void check_counter_overflow(void)
>  
>  		if (i == nr_gp_counters) {
>  			cnt.ctr = fixed_events[0].unit_sel;
> +			__measure(&cnt, 0);
> +			count = cnt.count;
> +			cnt.count = 1 - count;

This definitely needs a comment.

Dumb question time: if the count is off by 2, why can't we just subtract 2?

#ifndef __x86_64__
			/* comment about extra MOV insns for RDMSR/WRMSR */
			cnt.count -= 2;
#endif

>  			cnt.count &= (1ull << pmu_fixed_counter_width()) - 1;
>  		}
>  
> -- 
> 2.37.2
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 04/13] x86/pmu: Add tests for Intel Processor Event Based Sampling (PEBS)
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 04/13] x86/pmu: Add tests for Intel Processor Event Based Sampling (PEBS) Like Xu
@ 2022-10-05 22:21   ` Sean Christopherson
  2022-10-05 22:22     ` Sean Christopherson
  0 siblings, 1 reply; 39+ messages in thread
From: Sean Christopherson @ 2022-10-05 22:21 UTC (permalink / raw)
  To: Like Xu; +Cc: Paolo Bonzini, kvm

On Fri, Aug 19, 2022, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> This unit-test is intended to test the KVM's support for
> the Processor Event Based Sampling (PEBS) which is another
> PMU feature on Intel processors (start from Ice Lake Server).
> 
> If a bit in PEBS_ENABLE is set to 1, its corresponding counter will
> write at least one PEBS records (including partial state of the vcpu
> at the time of the current hardware event) to the guest memory on
> counter overflow, and trigger an interrupt at a specific DS state.
> The format of a PEBS record can be configured by another register.
> 
> These tests cover most usage scenarios, for example there are some
> specially constructed scenarios (not a typical behaviour of Linux
> PEBS driver). It lowers the threshold for others to understand this
> feature and opens up more exploration of KVM implementation or
> hw feature itself.
> 
> Signed-off-by: Like Xu <likexu@tencent.com>

I responded to the previous version, didn't realize it got merged in with this
series (totally fine, I just missed it).  I think all my feedback still applies.

https://lore.kernel.org/all/Yz3XiP6GBp95YEW9@google.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 04/13] x86/pmu: Add tests for Intel Processor Event Based Sampling (PEBS)
  2022-10-05 22:21   ` Sean Christopherson
@ 2022-10-05 22:22     ` Sean Christopherson
  0 siblings, 0 replies; 39+ messages in thread
From: Sean Christopherson @ 2022-10-05 22:22 UTC (permalink / raw)
  To: Like Xu; +Cc: Paolo Bonzini, kvm

On Wed, Oct 05, 2022, Sean Christopherson wrote:
> On Fri, Aug 19, 2022, Like Xu wrote:
> > From: Like Xu <likexu@tencent.com>
> > 
> > This unit-test is intended to test the KVM's support for
> > the Processor Event Based Sampling (PEBS) which is another
> > PMU feature on Intel processors (start from Ice Lake Server).
> > 
> > If a bit in PEBS_ENABLE is set to 1, its corresponding counter will
> > write at least one PEBS records (including partial state of the vcpu
> > at the time of the current hardware event) to the guest memory on
> > counter overflow, and trigger an interrupt at a specific DS state.
> > The format of a PEBS record can be configured by another register.
> > 
> > These tests cover most usage scenarios, for example there are some
> > specially constructed scenarios (not a typical behaviour of Linux
> > PEBS driver). It lowers the threshold for others to understand this
> > feature and opens up more exploration of KVM implementation or
> > hw feature itself.
> > 
> > Signed-off-by: Like Xu <likexu@tencent.com>
> 
> I responded to the previous version, didn't realize it got merged in with this
> series (totally fine, I just missed it).  I think all my feedback still applies.
> 
> https://lore.kernel.org/all/Yz3XiP6GBp95YEW9@google.com

Ha, and then I saw the addition of "groups = pmu" in the next patch :-)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 05/13] x86: create pmu group for quick pmu-scope testing
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 05/13] x86: create pmu group for quick pmu-scope testing Like Xu
@ 2022-10-05 22:23   ` Sean Christopherson
  0 siblings, 0 replies; 39+ messages in thread
From: Sean Christopherson @ 2022-10-05 22:23 UTC (permalink / raw)
  To: Like Xu; +Cc: Paolo Bonzini, kvm

On Fri, Aug 19, 2022, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> Any agent can run "./run_tests.sh -g pmu" for vPMU-related testcases.
> 
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---

Reviewed-by: Sean Christopherson <seanjc@google.com>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 07/13] x86/pmu: Pop up FW prefix to avoid out-of-context propagation
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 07/13] x86/pmu: Pop up FW prefix to avoid out-of-context propagation Like Xu
@ 2022-10-05 22:25   ` Sean Christopherson
  0 siblings, 0 replies; 39+ messages in thread
From: Sean Christopherson @ 2022-10-05 22:25 UTC (permalink / raw)
  To: Like Xu; +Cc: Paolo Bonzini, kvm

On Fri, Aug 19, 2022, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> The inappropriate prefix may be propagated to later test cases if any.

Nit, please make changelogs standalone, i.e. don't depend on the shortlog for
context.

> Signed-off-by: Like Xu <likexu@tencent.com>

Reviewed-by: Sean Christopherson <seanjc@google.com>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 08/13] x86/pmu: Add PDCM check before accessing PERF_CAP register
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 08/13] x86/pmu: Add PDCM check before accessing PERF_CAP register Like Xu
@ 2022-10-05 22:28   ` Sean Christopherson
  0 siblings, 0 replies; 39+ messages in thread
From: Sean Christopherson @ 2022-10-05 22:28 UTC (permalink / raw)
  To: Like Xu; +Cc: Paolo Bonzini, kvm

On Fri, Aug 19, 2022, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> On virtual platforms without PDCM support (e.g. AMD), #GP
> failure on MSR_IA32_PERF_CAPABILITIES is completely avoidable.
> 
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>  x86/pmu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/x86/pmu.c b/x86/pmu.c
> index 4eb92d8..25fafbe 100644
> --- a/x86/pmu.c
> +++ b/x86/pmu.c
> @@ -669,7 +669,8 @@ int main(int ac, char **av)
>  
>  	check_counters();
>  
> -	if (rdmsr(MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES) {
> +	if (this_cpu_has(X86_FEATURE_PDCM) &&
> +	    (rdmsr(MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES)) {


pmu_pebs.c ends up with similar code.  Maybe add a helper?

  static inline u64 this_cpu_perf_capabilities(void)
  {
	if (!this_cpu_has(X86_FEATURE_PDCM))
		return 0;

	return rdmsr(MSR_IA32_PERF_CAPABILITIES);
  }

Or even better, if we end up with a lib/pmu and something like "struct pmu_caps",
the RDMSR can be done once in the library's init routine.
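
For example (purely a sketch of that direction; the struct layout and the
pmu_init() hook are made up here):

  struct pmu_caps {
	u64 perf_cap;
  };
  extern struct pmu_caps pmu;

  /* Called once from the library's init path. */
  void pmu_init(void)
  {
	if (this_cpu_has(X86_FEATURE_PDCM))
		pmu.perf_cap = rdmsr(MSR_IA32_PERF_CAPABILITIES);
  }

Callers would then simply test "pmu.perf_cap & PMU_CAP_FW_WRITES".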

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 09/13] x86/pmu: Report SKIP when testing Intel LBR on AMD platforms
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 09/13] x86/pmu: Report SKIP when testing Intel LBR on AMD platforms Like Xu
@ 2022-10-05 22:29   ` Sean Christopherson
  0 siblings, 0 replies; 39+ messages in thread
From: Sean Christopherson @ 2022-10-05 22:29 UTC (permalink / raw)
  To: Like Xu; +Cc: Paolo Bonzini, kvm

On Fri, Aug 19, 2022, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> The test conclusion of running Intel LBR on AMD platforms
> should not be PASS, but SKIP, fix it.
> 
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---

Reviewed-by: Sean Christopherson <seanjc@google.com>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 10/13] x86/pmu: Update testcases to cover Intel Arch PMU Version 1
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 10/13] x86/pmu: Update testcases to cover Intel Arch PMU Version 1 Like Xu
  2022-09-06  7:15   ` Sandipan Das
  2022-09-06  8:16   ` Sandipan Das
@ 2022-10-05 22:35   ` Sean Christopherson
  2022-10-18  9:32     ` Like Xu
  2 siblings, 1 reply; 39+ messages in thread
From: Sean Christopherson @ 2022-10-05 22:35 UTC (permalink / raw)
  To: Like Xu; +Cc: Paolo Bonzini, kvm

On Fri, Aug 19, 2022, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> For most unit tests, the basic framework and use cases which test
> any PMU counter do not require any changes, except for two things:
> 
> - No access to registers introduced only in PMU version 2 and above;
> - Expanded tolerance for testing counter overflows
>   due to the loss of uniform control of the gloabl_ctrl register
> 
> Adding some pmu_version() return value checks can seamlessly support
> Intel Arch PMU Version 1, while opening the door for AMD PMUs tests.

Phrase this as a command so that it's crystal clear that this is what the patch
does, as opposed to what the patch _can_ do.

> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>  x86/pmu.c | 64 +++++++++++++++++++++++++++++++++++++------------------
>  1 file changed, 43 insertions(+), 21 deletions(-)
> 
> diff --git a/x86/pmu.c b/x86/pmu.c
> index 25fafbe..826472c 100644
> --- a/x86/pmu.c
> +++ b/x86/pmu.c
> @@ -125,14 +125,19 @@ static struct pmu_event* get_counter_event(pmu_counter_t *cnt)
>  
>  static void global_enable(pmu_counter_t *cnt)
>  {
> -	cnt->idx = event_to_global_idx(cnt);
> +	if (pmu_version() < 2)

Helper please.
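
E.g. something like this, reusing this_cpu_has_perf_global_ctrl() from
lib/x86/processor.h, which already wraps the "version > 1" check:

	if (!this_cpu_has_perf_global_ctrl())
		return;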

> +		return;
>  
> +	cnt->idx = event_to_global_idx(cnt);
>  	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, rdmsr(MSR_CORE_PERF_GLOBAL_CTRL) |
>  			(1ull << cnt->idx));
>  }
>  
>  static void global_disable(pmu_counter_t *cnt)
>  {
> +	if (pmu_version() < 2)
> +		return;
> +
>  	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, rdmsr(MSR_CORE_PERF_GLOBAL_CTRL) &
>  			~(1ull << cnt->idx));
>  }
> @@ -301,7 +306,10 @@ static void check_counter_overflow(void)
>  	count = cnt.count;
>  
>  	/* clear status before test */
> -	wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL, rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
> +	if (pmu_version() > 1) {

Should be a helper to use from an earlier patch.

Hmm, looking forward, maybe have an upper level helper?  E.g.

  void pmu_clear_global_status_safe(void)
  {
	if (!exists)
		return;

	wrmsr(...);
  }

Ignore this suggestion if these checks go away in the future.

> +		wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL,
> +		      rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
> +	}
>  
>  	report_prefix_push("overflow");
>  
> @@ -327,13 +335,21 @@ static void check_counter_overflow(void)
>  			cnt.config &= ~EVNTSEL_INT;
>  		idx = event_to_global_idx(&cnt);
>  		__measure(&cnt, cnt.count);
> -		report(cnt.count == 1, "cntr-%d", i);
> +
> +		report(check_irq() == (i % 2), "irq-%d", i);
> +		if (pmu_version() > 1)

Helper.

> +			report(cnt.count == 1, "cntr-%d", i);
> +		else
> +			report(cnt.count < 4, "cntr-%d", i);
> +
> +		if (pmu_version() < 2)

Helper.

> +			continue;
> +
>  		status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
>  		report(status & (1ull << idx), "status-%d", i);
>  		wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL, status);
>  		status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
>  		report(!(status & (1ull << idx)), "status clear-%d", i);
> -		report(check_irq() == (i % 2), "irq-%d", i);
>  	}
>  
>  	report_prefix_pop();
> @@ -440,8 +456,10 @@ static void check_running_counter_wrmsr(void)
>  	report(evt.count < gp_events[1].min, "cntr");
>  
>  	/* clear status before overflow test */
> -	wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL,
> -	      rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
> +	if (pmu_version() > 1) {

Helper.  Curly braces aren't necessary.

> +		wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL,
> +			rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
> +	}
>  
>  	start_event(&evt);
>  
> @@ -453,8 +471,11 @@ static void check_running_counter_wrmsr(void)
>  
>  	loop();
>  	stop_event(&evt);
> -	status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
> -	report(status & 1, "status");
> +
> +	if (pmu_version() > 1) {

Helper.

> +		status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
> +		report(status & 1, "status");

Can you opportunistically provide a better message than "status"?

> +	}
>  
>  	report_prefix_pop();
>  }
> @@ -474,8 +495,10 @@ static void check_emulated_instr(void)
>  	};
>  	report_prefix_push("emulated instruction");
>  
> -	wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL,
> -	      rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
> +	if (pmu_version() > 1) {

Helper, no curly braces.  Ah, IIRC, kernel perf prefers curly braces if the code
spans multiple lines.  KVM and KUT do not.

> +		wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL,
> +			rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
> +	}
>  
>  	start_event(&brnch_cnt);
>  	start_event(&instr_cnt);
> @@ -509,7 +532,8 @@ static void check_emulated_instr(void)
>  		:
>  		: "eax", "ebx", "ecx", "edx");
>  
> -	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
> +	if (pmu_version() > 1)

Helper.

> +		wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
>  
>  	stop_event(&brnch_cnt);
>  	stop_event(&instr_cnt);
> @@ -520,10 +544,13 @@ static void check_emulated_instr(void)
>  	       "instruction count");
>  	report(brnch_cnt.count - brnch_start >= EXPECTED_BRNCH,
>  	       "branch count");
> -	// Additionally check that those counters overflowed properly.
> -	status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
> -	report(status & 1, "instruction counter overflow");
> -	report(status & 2, "branch counter overflow");
> +
> +	if (pmu_version() > 1) {

Helper?  E.g. if this is a "has architectural PMU".

> +		// Additionally check that those counters overflowed properly.
> +		status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
> +		report(status & 1, "instruction counter overflow");
> +		report(status & 2, "branch counter overflow");
> +	}
>  
>  	report_prefix_pop();
>  }
> @@ -647,12 +674,7 @@ int main(int ac, char **av)
>  	buf = malloc(N*64);
>  
>  	if (!pmu_version()) {
> -		report_skip("No pmu is detected!");
> -		return report_summary();
> -	}
> -
> -	if (pmu_version() == 1) {
> -		report_skip("PMU version 1 is not supported.");
> +		report_skip("No Intel Arch PMU is detected!");
>  		return report_summary();
>  	}
>  
> -- 
> 2.37.2
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 12/13] x86/pmu: Add assignment framework for Intel-specific HW resources
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 12/13] x86/pmu: Add assignment framework for Intel-specific HW resources Like Xu
  2022-09-06  7:19   ` Sandipan Das
@ 2022-10-05 22:44   ` Sean Christopherson
  2022-10-21  7:21     ` Like Xu
  1 sibling, 1 reply; 39+ messages in thread
From: Sean Christopherson @ 2022-10-05 22:44 UTC (permalink / raw)
  To: Like Xu; +Cc: Paolo Bonzini, kvm

On Fri, Aug 19, 2022, Like Xu wrote:
> @@ -65,7 +65,13 @@ struct pmu_event {
>  };
>  
>  #define PMU_CAP_FW_WRITES	(1ULL << 13)
> -static u64 gp_counter_base = MSR_IA32_PERFCTR0;
> +static u32 gp_counter_base;
> +static u32 gp_select_base;
> +static unsigned int gp_events_size;
> +static unsigned int nr_gp_counters;
> +
> +typedef struct pmu_event PMU_EVENTS_ARRAY_t[];
> +static PMU_EVENTS_ARRAY_t *gp_events = NULL;

There's no need for a layer of indirection; a plain pointer to the first
element works.  The NULL initializer is also not strictly required.  E.g.
this should Just Work.

  static struct pmu_event *gp_events;

And then I believe a large number of changes in this series go away.

>  char *buf;
>  
> @@ -114,9 +120,9 @@ static struct pmu_event* get_counter_event(pmu_counter_t *cnt)
>  	if (is_gp(cnt)) {
>  		int i;
>  
> -		for (i = 0; i < sizeof(gp_events)/sizeof(gp_events[0]); i++)
> -			if (gp_events[i].unit_sel == (cnt->config & 0xffff))
> -				return &gp_events[i];
> +		for (i = 0; i < gp_events_size; i++)
> +			if ((*gp_events)[i].unit_sel == (cnt->config & 0xffff))
> +				return &(*gp_events)[i];
>  	} else
>  		return &fixed_events[cnt->ctr - MSR_CORE_PERF_FIXED_CTR0];
>  
> @@ -142,12 +148,22 @@ static void global_disable(pmu_counter_t *cnt)
>  			~(1ull << cnt->idx));
>  }
>  
> +static inline uint32_t get_gp_counter_msr(unsigned int i)

Rather than helpers, what about macros?  The problem with "get" is that it sounds
like the helper is actually reading the counter/MSR.  E.g. see MSR_IA32_MCx_CTL()

Something like this?

  MSR_PERF_GP_CTRx()

> +{
> +	return gp_counter_base + i;
> +}
> +
> +static inline uint32_t get_gp_select_msr(unsigned int i)
> +{
> +	return gp_select_base + i;

Same here, maybe?

  MSR_PERF_GP_SELECTx()
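
A rough sketch of how those could mirror the helpers above (the stride
handling for the AMD Fam15h MSR spacing is the only addition):

  #define MSR_PERF_GP_CTRx(i) \
	(gp_counter_base + (i) * (gp_counter_base == MSR_F15H_PERF_CTR0 ? 2 : 1))
  #define MSR_PERF_GP_SELECTx(i) \
	(gp_select_base + (i) * (gp_select_base == MSR_F15H_PERF_CTL0 ? 2 : 1))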


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 13/13] x86/pmu: Update testcases to cover AMD PMU
  2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 13/13] x86/pmu: Update testcases to cover AMD PMU Like Xu
  2022-09-06  7:32   ` Sandipan Das
@ 2022-10-05 22:48   ` Sean Christopherson
  1 sibling, 0 replies; 39+ messages in thread
From: Sean Christopherson @ 2022-10-05 22:48 UTC (permalink / raw)
  To: Like Xu; +Cc: Paolo Bonzini, kvm

On Fri, Aug 19, 2022, Like Xu wrote:
> diff --git a/lib/x86/processor.h b/lib/x86/processor.h
> index 0324220..10bca27 100644
> --- a/lib/x86/processor.h
> +++ b/lib/x86/processor.h
> @@ -793,6 +793,9 @@ static inline void flush_tlb(void)
>  
>  static inline u8 pmu_version(void)
>  {
> +	if (!is_intel())
> +		return 0;
> +
>  	return cpuid(10).a & 0xff;
>  }
>  
> @@ -806,19 +809,39 @@ static inline bool this_cpu_has_perf_global_ctrl(void)
>  	return pmu_version() > 1;
>  }
>  
> +#define AMD64_NUM_COUNTERS                             4
> +#define AMD64_NUM_COUNTERS_CORE                                6
> +
> +static inline bool has_amd_perfctr_core(void)
> +{
> +	return cpuid(0x80000001).c & BIT_ULL(23);

Add an X86_FEATURE_*, maybe X86_FEATURE_AMD_PERF_EXTENSIONS?
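
Sketch only, assuming the CPUID(fn, idx, reg, bit) encoding that the other
X86_FEATURE_* definitions in lib/x86/processor.h use (the exact encoding
macro may differ):

  #define X86_FEATURE_AMD_PERF_EXTENSIONS	(CPUID(0x80000001, 0, ECX, 23))

  static inline bool has_amd_perfctr_core(void)
  {
	return this_cpu_has(X86_FEATURE_AMD_PERF_EXTENSIONS);
  }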

> +}
> +
>  static inline u8 pmu_nr_gp_counters(void)
>  {
> -	return (cpuid(10).a >> 8) & 0xff;
> +	if (is_intel()) {

No curly braces.

> +		return (cpuid(10).a >> 8) & 0xff;
> +	} else if (!has_amd_perfctr_core()) {

Drop the "else", the above "if" is terminal.

> +		return AMD64_NUM_COUNTERS;
> +	}
> +
> +	return AMD64_NUM_COUNTERS_CORE;
>  }
>  
>  static inline u8 pmu_gp_counter_width(void)
>  {
> -	return (cpuid(10).a >> 16) & 0xff;
> +	if (is_intel())
> +		return (cpuid(10).a >> 16) & 0xff;
> +	else
> +		return 48;

Please add a #define for this magic number.

>  }
>  
>  static inline u8 pmu_gp_counter_mask_length(void)
>  {
> -	return (cpuid(10).a >> 24) & 0xff;
> +	if (is_intel())
> +		return (cpuid(10).a >> 24) & 0xff;
> +	else
> +		return pmu_nr_gp_counters();
>  }
>  
>  static inline u8 pmu_nr_fixed_counters(void)
> @@ -843,6 +866,9 @@ static inline u8 pmu_fixed_counter_width(void)
>  
>  static inline bool pmu_gp_counter_is_available(int i)
>  {
> +	if (!is_intel())
> +		return i < pmu_nr_gp_counters();
> +
>  	/* CPUID.0xA.EBX bit is '1 if they counter is NOT available. */
>  	return !(cpuid(10).b & BIT(i));
>  }
> diff --git a/x86/pmu.c b/x86/pmu.c
> index 0706cb1..b6ab10c 100644
> --- a/x86/pmu.c
> +++ b/x86/pmu.c
> @@ -62,6 +62,11 @@ struct pmu_event {
>  	{"fixed 1", MSR_CORE_PERF_FIXED_CTR0, 10*N, 10.2*N},
>  	{"fixed 2", MSR_CORE_PERF_FIXED_CTR0 + 1, 1*N, 30*N},
>  	{"fixed 3", MSR_CORE_PERF_FIXED_CTR0 + 2, 0.1*N, 30*N}
> +}, amd_gp_events[] = {
> +	{"core cycles", 0x0076, 1*N, 50*N},
> +	{"instructions", 0x00c0, 10*N, 10.2*N},
> +	{"branches", 0x00c2, 1*N, 1.1*N},
> +	{"branch misses", 0x00c3, 0, 0.1*N},
>  };
>  
>  #define PMU_CAP_FW_WRITES	(1ULL << 13)
> @@ -105,14 +110,24 @@ static bool check_irq(void)
>  
>  static bool is_gp(pmu_counter_t *evt)
>  {
> +	if (!is_intel())
> +		return true;
> +
>  	return evt->ctr < MSR_CORE_PERF_FIXED_CTR0 ||
>  		evt->ctr >= MSR_IA32_PMC0;
>  }
>  
>  static int event_to_global_idx(pmu_counter_t *cnt)
>  {
> -	return cnt->ctr - (is_gp(cnt) ? gp_counter_base :
> -		(MSR_CORE_PERF_FIXED_CTR0 - FIXED_CNT_INDEX));
> +	if (is_intel())
> +		return cnt->ctr - (is_gp(cnt) ? gp_counter_base :
> +			(MSR_CORE_PERF_FIXED_CTR0 - FIXED_CNT_INDEX));
> +
> +	if (gp_counter_base == MSR_F15H_PERF_CTR0) {

Unnecessary curly braces.

> +		return (cnt->ctr - gp_counter_base) / 2;
> +	} else {
> +		return cnt->ctr - gp_counter_base;
> +	}
>  }
>  
>  static struct pmu_event* get_counter_event(pmu_counter_t *cnt)
> @@ -736,5 +783,11 @@ int main(int ac, char **av)
>  		report_prefix_pop();
>  	}
>  
> +	if (!is_intel()) {
> +		report_prefix_push("K7");
> +		amd_switch_to_non_perfctr_core();
> +		check_counters();

"K7" prefix needs to be popped.

> +	}
> +
>  	return report_summary();
>  }
> -- 
> 2.37.2
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 03/13] x86/pmu: Reset the expected count of the fixed counter 0 when i386
  2022-10-05 22:18   ` Sean Christopherson
@ 2022-10-17  7:30     ` Like Xu
  2022-10-21 19:16       ` Sean Christopherson
  0 siblings, 1 reply; 39+ messages in thread
From: Like Xu @ 2022-10-17  7:30 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm

On 6/10/2022 6:18 am, Sean Christopherson wrote:
> On Fri, Aug 19, 2022, Like Xu wrote:
>> From: Like Xu <likexu@tencent.com>
>>
>> The pmu test check_counter_overflow() always fails with the "./configure
>> --arch=i386".
> 
> Explicitly state that the failures are with 32-bit binaries.  E.g. I can and do
> run KUT in 32-bit VMs, which doesn't require the explicit --arch=i386.

True and applied.

> 
>> The cnt.count obtained from the latter run of measure()
>> (based on fixed counter 0) is not equal to the expected value (based
>> on gp counter 0) and there is a positive error with a value of 2.
>>
>> The two extra instructions come from inline wrmsr() and inline rdmsr()
>> inside the global_disable() binary code block. Specifically, for each msr
>> access, the i386 code will have two assembly mov instructions before
>> rdmsr/wrmsr (mark it for fixed counter 0, bit 32), but only one assembly
>> mov is needed for x86_64 and gp counter 0 on i386.
>>
>> Fix the expected init cnt.count for fixed counter 0 overflow based on
>> the same fixed counter 0, not always using gp counter 0.
> 
> You lost me here.  I totally understand the problem, but I don't understand the
> fix.

The sequence of instructions executed to count events with a GP counter and
with a fixed counter is different. Thus the fix stays at a high level: use
the same counter (and hence the same instruction sequence) both to measure
the baseline count and to set the initial value for that same counter.

I may add this to the commit message.

> 
>> Signed-off-by: Like Xu <likexu@tencent.com>
>> ---
>>   x86/pmu.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/x86/pmu.c b/x86/pmu.c
>> index 45ca2c6..057fd4a 100644
>> --- a/x86/pmu.c
>> +++ b/x86/pmu.c
>> @@ -315,6 +315,9 @@ static void check_counter_overflow(void)
>>   
>>   		if (i == nr_gp_counters) {
>>   			cnt.ctr = fixed_events[0].unit_sel;
>> +			__measure(&cnt, 0);
>> +			count = cnt.count;
>> +			cnt.count = 1 - count;
> 
> This definitely needs a comment.
> 
> Dumb question time: if the count is off by 2, why can't we just subtract 2?

Adding more low-level code (bringing in new differences between the 32-bit
and 64-bit runtimes) would break such a hard-coded adjustment.

The test goal is simply to set the initial value of a counter so that it
overflows, which is always off by 1, regardless of the rdmsr/wrmsr
instructions or other execution details involved.
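
Something along these lines might do for the requested comment (sketch only,
the comment wording is mine):

		if (i == nr_gp_counters) {
			cnt.ctr = fixed_events[0].unit_sel;
			/*
			 * The instruction sequence that programs a fixed
			 * counter differs from the GP counter 0 sequence
			 * (e.g. extra MOVs around RDMSR/WRMSR on 32-bit), so
			 * measure the fixed counter itself to get the
			 * baseline before arming it for overflow.
			 */
			__measure(&cnt, 0);
			count = cnt.count;
			cnt.count = 1 - count;
			cnt.count &= (1ull << pmu_fixed_counter_width()) - 1;
		}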

> 
> #ifndef __x86_64__
> 			/* comment about extra MOV insns for RDMSR/WRMSR */
> 			cnt.count -= 2;
> #endif
> 
>>   			cnt.count &= (1ull << pmu_fixed_counter_width()) - 1;
>>   		}
>>   
>> -- 
>> 2.37.2
>>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 10/13] x86/pmu: Update testcases to cover Intel Arch PMU Version 1
  2022-10-05 22:35   ` Sean Christopherson
@ 2022-10-18  9:32     ` Like Xu
  0 siblings, 0 replies; 39+ messages in thread
From: Like Xu @ 2022-10-18  9:32 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm

All applied except pmu_clear_global_status_safe(). Thanks.

On 6/10/2022 6:35 am, Sean Christopherson wrote:
> On Fri, Aug 19, 2022, Like Xu wrote:
>> From: Like Xu <likexu@tencent.com>
>>
>> For most unit tests, the basic framework and use cases which test
>> any PMU counter do not require any changes, except for two things:
>>
>> - No access to registers introduced only in PMU version 2 and above;
>> - Expanded tolerance for testing counter overflows
>>    due to the loss of uniform control of the gloabl_ctrl register
>>
>> Adding some pmu_version() return value checks can seamlessly support
>> Intel Arch PMU Version 1, while opening the door for AMD PMUs tests.
> 
> Phrase this as a command so that it's crystal clear that this is what the patch
> does, as opposed to what the patch _can_ do.
> 
>> Signed-off-by: Like Xu <likexu@tencent.com>
>> ---
>>   x86/pmu.c | 64 +++++++++++++++++++++++++++++++++++++------------------
>>   1 file changed, 43 insertions(+), 21 deletions(-)
>>
>> diff --git a/x86/pmu.c b/x86/pmu.c
>> index 25fafbe..826472c 100644
>> --- a/x86/pmu.c
>> +++ b/x86/pmu.c
>> @@ -125,14 +125,19 @@ static struct pmu_event* get_counter_event(pmu_counter_t *cnt)
>>   
>>   static void global_enable(pmu_counter_t *cnt)
>>   {
>> -	cnt->idx = event_to_global_idx(cnt);
>> +	if (pmu_version() < 2)
> 
> Helper please.
> 
>> +		return;
>>   
>> +	cnt->idx = event_to_global_idx(cnt);
>>   	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, rdmsr(MSR_CORE_PERF_GLOBAL_CTRL) |
>>   			(1ull << cnt->idx));
>>   }
>>   
>>   static void global_disable(pmu_counter_t *cnt)
>>   {
>> +	if (pmu_version() < 2)
>> +		return;
>> +
>>   	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, rdmsr(MSR_CORE_PERF_GLOBAL_CTRL) &
>>   			~(1ull << cnt->idx));
>>   }
>> @@ -301,7 +306,10 @@ static void check_counter_overflow(void)
>>   	count = cnt.count;
>>   
>>   	/* clear status before test */
>> -	wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL, rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
>> +	if (pmu_version() > 1) {
> 
> Should be a helper to use from an earlier patch.
> 
> Hmm, looking forward, maybe have an upper level helper?  E.g.
> 
>    void pmu_clear_global_status_safe(void)
>    {
> 	if (!exists)
> 		return
> 
> 	wrmsr(...);
>    }
> 
> Ignore this suggestion if these checks go away in the future.
> 
>> +		wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL,
>> +		      rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
>> +	}
>>   
>>   	report_prefix_push("overflow");
>>   
>> @@ -327,13 +335,21 @@ static void check_counter_overflow(void)
>>   			cnt.config &= ~EVNTSEL_INT;
>>   		idx = event_to_global_idx(&cnt);
>>   		__measure(&cnt, cnt.count);
>> -		report(cnt.count == 1, "cntr-%d", i);
>> +
>> +		report(check_irq() == (i % 2), "irq-%d", i);
>> +		if (pmu_version() > 1)
> 
> Helper.
> 
>> +			report(cnt.count == 1, "cntr-%d", i);
>> +		else
>> +			report(cnt.count < 4, "cntr-%d", i);
>> +
>> +		if (pmu_version() < 2)
> 
> Helper.
> 
>> +			continue;
>> +
>>   		status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
>>   		report(status & (1ull << idx), "status-%d", i);
>>   		wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL, status);
>>   		status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
>>   		report(!(status & (1ull << idx)), "status clear-%d", i);
>> -		report(check_irq() == (i % 2), "irq-%d", i);
>>   	}
>>   
>>   	report_prefix_pop();
>> @@ -440,8 +456,10 @@ static void check_running_counter_wrmsr(void)
>>   	report(evt.count < gp_events[1].min, "cntr");
>>   
>>   	/* clear status before overflow test */
>> -	wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL,
>> -	      rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
>> +	if (pmu_version() > 1) {
> 
> Helper.  Curly braces aren't necessary.
> 
>> +		wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL,
>> +			rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
>> +	}
>>   
>>   	start_event(&evt);
>>   
>> @@ -453,8 +471,11 @@ static void check_running_counter_wrmsr(void)
>>   
>>   	loop();
>>   	stop_event(&evt);
>> -	status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
>> -	report(status & 1, "status");
>> +
>> +	if (pmu_version() > 1) {
> 
> Helper.
> 
>> +		status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
>> +		report(status & 1, "status");
> 
> Can you opportunistically provide a better message than "status"?
> 
>> +	}
>>   
>>   	report_prefix_pop();
>>   }
>> @@ -474,8 +495,10 @@ static void check_emulated_instr(void)
>>   	};
>>   	report_prefix_push("emulated instruction");
>>   
>> -	wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL,
>> -	      rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
>> +	if (pmu_version() > 1) {
> 
> Helper, no curly braces.  Ah, IIRC, kernel perf prefers curly braces if the code
> spans multiple lines.  KVM and KUT do not.

OK.

> 
>> +		wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL,
>> +			rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
>> +	}
>>   
>>   	start_event(&brnch_cnt);
>>   	start_event(&instr_cnt);
>> @@ -509,7 +532,8 @@ static void check_emulated_instr(void)
>>   		:
>>   		: "eax", "ebx", "ecx", "edx");
>>   
>> -	wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
>> +	if (pmu_version() > 1)
> 
> Helper.
> 
>> +		wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
>>   
>>   	stop_event(&brnch_cnt);
>>   	stop_event(&instr_cnt);
>> @@ -520,10 +544,13 @@ static void check_emulated_instr(void)
>>   	       "instruction count");
>>   	report(brnch_cnt.count - brnch_start >= EXPECTED_BRNCH,
>>   	       "branch count");
>> -	// Additionally check that those counters overflowed properly.
>> -	status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
>> -	report(status & 1, "instruction counter overflow");
>> -	report(status & 2, "branch counter overflow");
>> +
>> +	if (pmu_version() > 1) {
> 
> Helper?  E.g. if this is a "has architectural PMU".

Nit, "intel arch pmu" means "version > 0".
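
As a sketch of the helper shape being asked for here (the name is only
illustrative, not necessarily what was applied):

	static inline bool has_perf_global_ctrl(void)
	{
		/* GLOBAL_CTRL/STATUS/OVF_CTRL only exist on Arch PMU v2 and above. */
		return pmu_version() > 1;
	}

so the scattered "pmu_version() > 1" checks above would read as
"if (has_perf_global_ctrl())".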

> 
>> +		// Additionally check that those counters overflowed properly.
>> +		status = rdmsr(MSR_CORE_PERF_GLOBAL_STATUS);
>> +		report(status & 1, "instruction counter overflow");
>> +		report(status & 2, "branch counter overflow");
>> +	}
>>   
>>   	report_prefix_pop();
>>   }
>> @@ -647,12 +674,7 @@ int main(int ac, char **av)
>>   	buf = malloc(N*64);
>>   
>>   	if (!pmu_version()) {
>> -		report_skip("No pmu is detected!");
>> -		return report_summary();
>> -	}
>> -
>> -	if (pmu_version() == 1) {
>> -		report_skip("PMU version 1 is not supported.");
>> +		report_skip("No Intel Arch PMU is detected!");
>>   		return report_summary();
>>   	}
>>   
>> -- 
>> 2.37.2
>>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 12/13] x86/pmu: Add assignment framework for Intel-specific HW resources
  2022-10-05 22:44   ` Sean Christopherson
@ 2022-10-21  7:21     ` Like Xu
  2022-10-21 18:22       ` Sean Christopherson
  0 siblings, 1 reply; 39+ messages in thread
From: Like Xu @ 2022-10-21  7:21 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm

On 6/10/2022 6:44 am, Sean Christopherson wrote:
> On Fri, Aug 19, 2022, Like Xu wrote:
>> @@ -65,7 +65,13 @@ struct pmu_event {
>>   };
>>   
>>   #define PMU_CAP_FW_WRITES	(1ULL << 13)
>> -static u64 gp_counter_base = MSR_IA32_PERFCTR0;
>> +static u32 gp_counter_base;
>> +static u32 gp_select_base;
>> +static unsigned int gp_events_size;
>> +static unsigned int nr_gp_counters;
>> +
>> +typedef struct pmu_event PMU_EVENTS_ARRAY_t[];
>> +static PMU_EVENTS_ARRAY_t *gp_events = NULL;
> 
> There's no need for a layer of indirection.  The NULL is also not strictly
> required.  E.g. this should Just Work.
> 
>    static struct pmu_event *gp_events;

Applied.

> 
> And then I believe a large number of changes in this series go away.
> 
>>   char *buf;
>>   
>> @@ -114,9 +120,9 @@ static struct pmu_event* get_counter_event(pmu_counter_t *cnt)
>>   	if (is_gp(cnt)) {
>>   		int i;
>>   
>> -		for (i = 0; i < sizeof(gp_events)/sizeof(gp_events[0]); i++)
>> -			if (gp_events[i].unit_sel == (cnt->config & 0xffff))
>> -				return &gp_events[i];
>> +		for (i = 0; i < gp_events_size; i++)
>> +			if ((*gp_events)[i].unit_sel == (cnt->config & 0xffff))
>> +				return &(*gp_events)[i];
>>   	} else
>>   		return &fixed_events[cnt->ctr - MSR_CORE_PERF_FIXED_CTR0];
>>   
>> @@ -142,12 +148,22 @@ static void global_disable(pmu_counter_t *cnt)
>>   			~(1ull << cnt->idx));
>>   }
>>   
>> +static inline uint32_t get_gp_counter_msr(unsigned int i)
> 
> Rather than helpers, what about macros?  The problem with "get" is that it sounds
> like the helper is actually reading the counter/MSR.  E.g. see MSR_IA32_MCx_CTL()
> 
> Something like this?
> 
>    MSR_PERF_GP_CTRx()

The base address msr is different for intel and amd (including K7), and using 
different macros
in the same location sounds like it would require a helper.

This proposal will be dropped, or it will need to be re-emphasised in the next version.
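
For reference, the vendor split in question is roughly the following (a sketch;
the MSR names are the P6/K7 definitions from the msr headers, and the exact
assignment site is an assumption):

	if (is_intel()) {
		gp_counter_base = MSR_IA32_PERFCTR0;
		gp_select_base = MSR_P6_EVNTSEL0;
	} else {
		/* Legacy K7-compatible layout (AMD without extended perf counters). */
		gp_counter_base = MSR_K7_PERFCTR0;
		gp_select_base = MSR_K7_EVNTSEL0;
	}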

> 
>> +{
>> +	return gp_counter_base + i;
>> +}
>> +
>> +static inline uint32_t get_gp_select_msr(unsigned int i)
>> +{
>> +	return gp_select_base + i;
> 
> Same here, maybe?
> 
>    MSR_PERF_GP_SELECTx()
> 
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 10/13] x86/pmu: Update testcases to cover Intel Arch PMU Version 1
  2022-09-19  7:09         ` Like Xu
@ 2022-10-21  7:32           ` Like Xu
  2022-10-25  8:34             ` Sandipan Das
  0 siblings, 1 reply; 39+ messages in thread
From: Like Xu @ 2022-10-21  7:32 UTC (permalink / raw)
  To: Sandipan Das; +Cc: kvm, Sean Christopherson, Paolo Bonzini, Jim Mattson

Hi Sandipan,

On 19/9/2022 3:09 pm, Like Xu wrote:
> On 8/9/2022 4:23 pm, Sandipan Das wrote:
>> On 9/6/2022 7:05 PM, Like Xu wrote:
>>> On 6/9/2022 4:16 pm, Sandipan Das wrote:
>>>> Hi Like,
>>>>
>>>> On 8/19/2022 4:39 PM, Like Xu wrote:
>>>>> From: Like Xu <likexu@tencent.com>
>>>>>
>>>>> For most unit tests, the basic framework and use cases which test
>>>>> any PMU counter do not require any changes, except for two things:
>>>>>
>>>>> - No access to registers introduced only in PMU version 2 and above;
>>>>> - Expanded tolerance for testing counter overflows
>>>>>    due to the loss of uniform control of the global_ctrl register
>>>>>
>>>>> Adding some pmu_version() return value checks can seamlessly support
>>>>> Intel Arch PMU Version 1, while opening the door for AMD PMUs tests.
>>>>>
>>>>> Signed-off-by: Like Xu <likexu@tencent.com>
>>>>> ---
>>>>>    x86/pmu.c | 64 +++++++++++++++++++++++++++++++++++++------------------
>>>>>    1 file changed, 43 insertions(+), 21 deletions(-)
>>>>>
>>>>> [...]
>>>>> @@ -327,13 +335,21 @@ static void check_counter_overflow(void)
>>>>>                cnt.config &= ~EVNTSEL_INT;
>>>>>            idx = event_to_global_idx(&cnt);
>>>>>            __measure(&cnt, cnt.count);
>>>>> -        report(cnt.count == 1, "cntr-%d", i);
>>>>> +
>>>>> +        report(check_irq() == (i % 2), "irq-%d", i);
>>>>> +        if (pmu_version() > 1)
>>>>> +            report(cnt.count == 1, "cntr-%d", i);
>>>>> +        else
>>>>> +            report(cnt.count < 4, "cntr-%d", i);
>>>>> +
>>>>> [...]
>>>>
>>>> Sorry I missed this in the previous response. With an upper bound of
>>>> 4, I see this test failing sometimes for at least one of the six
>>>> counters (with NMI watchdog disabled on the host) on a Milan (Zen 3)
>>>> system. Increasing it further does reduce the probability but I still
>>>> see failures. Do you see the same behaviour on systems with Zen 3 and
>>>> older processors?
>>>
>>> A hundred runs on my machine did not report a failure.
>>>
>>
>> Was this on a Zen 4 system?
>>
>>> But I'm not surprised by this, because some AMD platforms do
>>> have hw PMU errata which requires bios or ucode fixes.
>>>
>>> Please help find the right upper bound for all your available AMD boxes.
>>>
>>
>> Even after updating the microcode, the tests failed just as often in an
>> overnight loop. However, upon closer inspection, the reason for failure
>> was different. The variance is well within the bounds now but sometimes,
>> is_the_count_reproducible() is true. Since this selects the original
>> verification criteria (cnt.count == 1), the tests fail.
>>
>>> What makes me most nervous is that AMD's core hardware events run
>>> repeatedly against the same workload, and their count results are erratic.
>>>
>>
>> With that in mind, should we consider having the following change?
>>
>> diff --git a/x86/pmu.c b/x86/pmu.c
>> index bb16b3c..39979b8 100644
>> --- a/x86/pmu.c
>> +++ b/x86/pmu.c
>> @@ -352,7 +352,7 @@ static void check_counter_overflow(void)
>>                  .ctr = gp_counter_base,
>>                  .config = EVNTSEL_OS | EVNTSEL_USR | (*gp_events)[1].unit_sel /* instructions */,
>>          };
>> -       bool precise_event = is_the_count_reproducible(&cnt);
>> +       bool precise_event = is_intel() ? is_the_count_reproducible(&cnt) : false;
>>
>>          __measure(&cnt, 0);
>>          count = cnt.count;
>>
>> With this, the tests always pass. I will run another overnight loop and
>> report back if I see any errors.
>>
>>> You may check is_the_count_reproducible() in the test case:
>>> [1]https://lore.kernel.org/kvm/20220905123946.95223-7-likexu@tencent.com/
>>
>> On Zen 4 systems, this is always false and the overflow tests always
>> pass irrespective of whether PerfMonV2 is enabled for the guest or not.
>>
>> - Sandipan
> 
> I could change it to:
> 
>          if (is_intel())
>              report(cnt.count == 1, "cntr-%d", i);
>          else
>              report(cnt.count < 4, "cntr-%d", i);

On AMD (zen3/zen4) machines this seems to be the only way to ensure that the 
test cases don't fail:

		if (is_intel())
			report(cnt.count == 1, "cntr-%d", i);
		else
			report(cnt.count == 0xffffffffffff || cnt.count < 7, "cntr-%d", i);

but it suggests a hardware counter defect; can you further confirm that this
hardware behaviour is in line with your expectations?

> 
> but this does not explain the difference, that is for the same workload:
> 
> if a retired hw event like "PMCx0C0 [Retired Instructions] (ExRetInstr)" is
> configured, then it's expected to count "the number of instructions retired";
> the value depends only on the workload and should remain the same over
> multiple measurements,
> 
> but there are two hardware counters, one AMD and one Intel, both are reset to an 
> identical value
> (like "cnt.count = 1 - count"), and when they overflow, the Intel counter can 
> stay exactly at 1,
> while the AMD counter cannot.
> 
> I know there are underlying hardware micro-arch implementation differences here,
> but what AMD is doing violates the semantics of "retired".
> 
> Is this behavior normal by design ?
> I'm not sure what I'm missing, this behavior is reinforced in zen4 as you said.
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 12/13] x86/pmu: Add assignment framework for Intel-specific HW resources
  2022-10-21  7:21     ` Like Xu
@ 2022-10-21 18:22       ` Sean Christopherson
  0 siblings, 0 replies; 39+ messages in thread
From: Sean Christopherson @ 2022-10-21 18:22 UTC (permalink / raw)
  To: Like Xu; +Cc: Paolo Bonzini, kvm

On Fri, Oct 21, 2022, Like Xu wrote:
> On 6/10/2022 6:44 am, Sean Christopherson wrote:
> > On Fri, Aug 19, 2022, Like Xu wrote:
> > > @@ -142,12 +148,22 @@ static void global_disable(pmu_counter_t *cnt)
> > >   			~(1ull << cnt->idx));
> > >   }
> > > +static inline uint32_t get_gp_counter_msr(unsigned int i)
> > 
> > Rather than helpers, what about macros?  The problem with "get" is that it sounds
> > like the helper is actually reading the counter/MSR.  E.g. see MSR_IA32_MCx_CTL()
> > 
> > Something like this?
> > 
> >    MSR_PERF_GP_CTRx()
> 
> The base address msr is different for intel and amd (including K7), and
> using different macros in the same location sounds like it would require a
> helper.

I wasn't thinking different macros, I was thinking a macro that consumes the
gp_counter_base.

But actually, there's no need to use a macro, it's really just the naming
convention that I think we should use.  It obviously violates the preferred style,
but in this case I think the deviation is a net positive.

static inline uint32_t MSR_PERF_GP_CTRx(unsigned int i)
{
	return gp_counter_base + i;
}
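
A call site would then read like a plain MSR name, e.g. (sketch; the
MSR_PERF_GP_SELECTx() counterpart for gp_select_base is assumed here):

	cnt.ctr = MSR_PERF_GP_CTRx(i);             /* was: get_gp_counter_msr(i) */
	wrmsr(MSR_PERF_GP_SELECTx(i), cnt.config); /* assumed select counterpart */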

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 03/13] x86/pmu: Reset the expected count of the fixed counter 0 when i386
  2022-10-17  7:30     ` Like Xu
@ 2022-10-21 19:16       ` Sean Christopherson
  0 siblings, 0 replies; 39+ messages in thread
From: Sean Christopherson @ 2022-10-21 19:16 UTC (permalink / raw)
  To: Like Xu; +Cc: Paolo Bonzini, kvm

On Mon, Oct 17, 2022, Like Xu wrote:
> On 6/10/2022 6:18 am, Sean Christopherson wrote:
> > > ---
> > >   x86/pmu.c | 3 +++
> > >   1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/x86/pmu.c b/x86/pmu.c
> > > index 45ca2c6..057fd4a 100644
> > > --- a/x86/pmu.c
> > > +++ b/x86/pmu.c
> > > @@ -315,6 +315,9 @@ static void check_counter_overflow(void)
> > >   		if (i == nr_gp_counters) {
> > >   			cnt.ctr = fixed_events[0].unit_sel;
> > > +			__measure(&cnt, 0);
> > > +			count = cnt.count;
> > > +			cnt.count = 1 - count;
> > 
> > This definitely needs a comment.
> > 
> > Dumb question time: if the count is off by 2, why can't we just subtract 2?
> 
> Adding more low-level code (which brings in differences between the 32-bit and
> 64-bit runtimes) would break this.
> 
> The test goal is simply to set the initial value of a counter to overflow,
> which is always off by 1, regardless of the involved rd/wrmsr or other
> execution details.

Oooh, I see what this code is doing.  But wouldn't it be better to offset from '0'?
E.g. if the measured workload is a single instruction, then the measured count
will be '1' and thus "1 - count" will be zero, meaning no overflow will occur.

Ah, but as per the SDM, the "+1" is needed to ensure the overflow is detected
immediately.

  Here, however, if an interrupt is to be generated after 100 event counts, the
  counter should be preset to minus 100 plus 1 (-100 + 1), or -99. The counter
  will then overflow after it counts 99 events and generate an interrupt on the
  next (100th) event counted. The difference of 1 for this count enables the
  interrupt to be generated immediately after the selected event count has been
  reached, instead of waiting for the overflow to be propagated through the
  counter.

What about adding a helper to measure/compute the overflow preset value?  That
would provide a convenient location to document the (IMO) weird behavior that's
necessary to ensure immediate event delivery.  E.g.

---
 x86/pmu.c | 38 +++++++++++++++++++++++++-------------
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index f891053f..a38ae3f6 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -325,16 +325,30 @@ static void check_counters_many(void)
 	report(i == n, "all counters");
 }
 
-static void check_counter_overflow(void)
+static uint64_t measure_for_overflow(pmu_counter_t *cnt)
 {
-	uint64_t count;
-	int i;
-	pmu_counter_t cnt = {
-		.ctr = gp_counter_base,
-		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel /* instructions */,
-	};
-	__measure(&cnt, 0);
+	__measure(cnt, 0);
-	count = cnt.count;
+
+	/*
+	 * To generate overflow, i.e. roll over to '0', the initial count just
+	 * needs to be preset to the negative expected count.  However, as per
+	 * Intel's SDM, the preset count needs to be incremented by 1 to ensure
+	 * the overflow interrupt is generated immediately instead of possibly
+	 * waiting for the overflow to propagate through the counter.
+	 */
+	assert(cnt->count > 1);
+	return 1 - cnt->count;
+}
+
+static void check_counter_overflow(void)
+{
+	uint64_t overflow_preset;
+	int i;
+	pmu_counter_t cnt = {
+		.ctr = gp_counter_base,
+		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel /* instructions */,
+	};
+	overflow_preset = measure_for_overflow(&cnt);
 
 	/* clear status before test */
 	if (pmu_version() > 1) {
@@ -349,7 +363,7 @@ static void check_counter_overflow(void)
 		int idx;
 
 		cnt.ctr = get_gp_counter_msr(i);
-		cnt.count = 1 - count;
+		cnt.count = overflow_preset;
 		if (gp_counter_base == MSR_IA32_PMC0)
 			cnt.count &= (1ull << pmu_gp_counter_width()) - 1;
 
@@ -358,9 +372,7 @@ static void check_counter_overflow(void)
 				break;
 
 			cnt.ctr = fixed_events[0].unit_sel;
-			__measure(&cnt, 0);
-			count = cnt.count;
-			cnt.count = 1 - count;
+			cnt.count = measure_for_overflow(&cnt);
 			cnt.count &= (1ull << pmu_fixed_counter_width()) - 1;
 		}
 

base-commit: c3e384a2268baed99d4b59dd239c98bd6a5471eb
-- 


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [kvm-unit-tests PATCH v3 10/13] x86/pmu: Update testcases to cover Intel Arch PMU Version 1
  2022-10-21  7:32           ` Like Xu
@ 2022-10-25  8:34             ` Sandipan Das
  0 siblings, 0 replies; 39+ messages in thread
From: Sandipan Das @ 2022-10-25  8:34 UTC (permalink / raw)
  To: Like Xu; +Cc: kvm, Sean Christopherson, Paolo Bonzini, Jim Mattson

Hi Like,

On 10/21/2022 1:02 PM, Like Xu wrote:
> Hi Sandipan,
> 
> On 19/9/2022 3:09 pm, Like Xu wrote:
>> On 8/9/2022 4:23 pm, Sandipan Das wrote:
>>> On 9/6/2022 7:05 PM, Like Xu wrote:
>>>> On 6/9/2022 4:16 pm, Sandipan Das wrote:
>>>>> Hi Like,
>>>>>
>>>>> On 8/19/2022 4:39 PM, Like Xu wrote:
>>>>>> From: Like Xu <likexu@tencent.com>
>>>>>>
>>>>>> For most unit tests, the basic framework and use cases which test
>>>>>> any PMU counter do not require any changes, except for two things:
>>>>>>
>>>>>> - No access to registers introduced only in PMU version 2 and above;
>>>>>> - Expanded tolerance for testing counter overflows
>>>>>>     due to the loss of uniform control of the global_ctrl register
>>>>>>
>>>>>> Adding some pmu_version() return value checks can seamlessly support
>>>>>> Intel Arch PMU Version 1, while opening the door for AMD PMUs tests.
>>>>>>
>>>>>> Signed-off-by: Like Xu <likexu@tencent.com>
>>>>>> ---
>>>>>>    x86/pmu.c | 64 +++++++++++++++++++++++++++++++++++++------------------
>>>>>>    1 file changed, 43 insertions(+), 21 deletions(-)
>>>>>>
>>>>>> [...]
>>>>>> @@ -327,13 +335,21 @@ static void check_counter_overflow(void)
>>>>>>                cnt.config &= ~EVNTSEL_INT;
>>>>>>            idx = event_to_global_idx(&cnt);
>>>>>>            __measure(&cnt, cnt.count);
>>>>>> -        report(cnt.count == 1, "cntr-%d", i);
>>>>>> +
>>>>>> +        report(check_irq() == (i % 2), "irq-%d", i);
>>>>>> +        if (pmu_version() > 1)
>>>>>> +            report(cnt.count == 1, "cntr-%d", i);
>>>>>> +        else
>>>>>> +            report(cnt.count < 4, "cntr-%d", i);
>>>>>> +
>>>>>> [...]
>>>>>
>>>>> Sorry I missed this in the previous response. With an upper bound of
>>>>> 4, I see this test failing sometimes for at least one of the six
>>>>> counters (with NMI watchdog disabled on the host) on a Milan (Zen 3)
>>>>> system. Increasing it further does reduce the probability but I still
>>>>> see failures. Do you see the same behaviour on systems with Zen 3 and
>>>>> older processors?
>>>>
>>>> A hundred runs on my machine did not report a failure.
>>>>
>>>
>>> Was this on a Zen 4 system?
>>>
>>>> But I'm not surprised by this, because some AMD platforms do
>>>> have hw PMU errata which requires bios or ucode fixes.
>>>>
>>>> Please help find the right upper bound for all your available AMD boxes.
>>>>
>>>
>>> Even after updating the microcode, the tests failed just as often in an
>>> overnight loop. However, upon closer inspection, the reason for failure
>>> was different. The variance is well within the bounds now but sometimes,
>>> is_the_count_reproducible() is true. Since this selects the original
>>> verification criteria (cnt.count == 1), the tests fail.
>>>
>>>> What makes me most nervous is that AMD's core hardware events run
>>>> repeatedly against the same workload, and their count results are erratic.
>>>>
>>>
>>> With that in mind, should we consider having the following change?
>>>
>>> diff --git a/x86/pmu.c b/x86/pmu.c
>>> index bb16b3c..39979b8 100644
>>> --- a/x86/pmu.c
>>> +++ b/x86/pmu.c
>>> @@ -352,7 +352,7 @@ static void check_counter_overflow(void)
>>>                  .ctr = gp_counter_base,
>>>                  .config = EVNTSEL_OS | EVNTSEL_USR | (*gp_events)[1].unit_sel /* instructions */,
>>>          };
>>> -       bool precise_event = is_the_count_reproducible(&cnt);
>>> +       bool precise_event = is_intel() ? is_the_count_reproducible(&cnt) : false;
>>>
>>>          __measure(&cnt, 0);
>>>          count = cnt.count;
>>>
>>> With this, the tests always pass. I will run another overnight loop and
>>> report back if I see any errors.
>>>
>>>> You may check is_the_count_reproducible() in the test case:
>>>> [1]https://lore.kernel.org/kvm/20220905123946.95223-7-likexu@tencent.com/
>>>>
>>> On Zen 4 systems, this is always false and the overflow tests always
>>> pass irrespective of whether PerfMonV2 is enabled for the guest or not.
>>>
>>> - Sandipan
>>
>> I could change it to:
>>
>>          if (is_intel())
>>              report(cnt.count == 1, "cntr-%d", i);
>>          else
>>              report(cnt.count < 4, "cntr-%d", i);
> 
> On AMD (zen3/zen4) machines this seems to be the only way to ensure that the test cases don't fail:
> 
>         if (is_intel())
>             report(cnt.count == 1, "cntr-%d", i);
>         else
>             report(cnt.count == 0xffffffffffff || cnt.count < 7, "cntr-%d", i);
> 
> but it suggests a hardware counter defect; can you further confirm that this hardware
> behaviour is in line with your expectations?
> 

I am yet to investigate why there would be a variance in the count but, with this
updated test condition, I can confirm that the tests pass on all my systems.

- Sandipan

>>
>> but this does not explain the difference, that is for the same workload:
>>
>> if a retired hw event like "PMCx0C0 [Retired Instructions] (ExRetInstr)" is configured,
>> then it's expected to count "the number of instructions retired"; the value depends only
>> on the workload and should remain the same over multiple measurements,
>>
>> but there are two hardware counters, one AMD and one Intel, both are reset to an identical value
>> (like "cnt.count = 1 - count"), and when they overflow, the Intel counter can stay exactly at 1,
>> while the AMD counter cannot.
>>
>> I know there are underlying hardware micro-arch implementation differences here,
>> but what AMD is doing violates the semantics of "retired".
>>
>> Is this behavior normal by design ?
>> I'm not sure what I'm missing, this behavior is reinforced in zen4 as you said.
>>


^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2022-10-25  8:35 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-19 11:09 [kvm-unit-tests PATCH v3 00/13] x86/pmu: Test case optimization, fixes and additions Like Xu
2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 01/13] x86/pmu: Introduce __start_event() to drop all of the manual zeroing Like Xu
2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 02/13] x86/pmu: Introduce multiple_{one, many}() to improve readability Like Xu
2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 03/13] x86/pmu: Reset the expected count of the fixed counter 0 when i386 Like Xu
2022-10-05 22:18   ` Sean Christopherson
2022-10-17  7:30     ` Like Xu
2022-10-21 19:16       ` Sean Christopherson
2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 04/13] x86/pmu: Add tests for Intel Processor Event Based Sampling (PEBS) Like Xu
2022-10-05 22:21   ` Sean Christopherson
2022-10-05 22:22     ` Sean Christopherson
2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 05/13] x86: create pmu group for quick pmu-scope testing Like Xu
2022-10-05 22:23   ` Sean Christopherson
2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 06/13] x86/pmu: Test emulation instructions on full-width counters Like Xu
2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 07/13] x86/pmu: Pop up FW prefix to avoid out-of-context propagation Like Xu
2022-10-05 22:25   ` Sean Christopherson
2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 08/13] x86/pmu: Add PDCM check before accessing PERF_CAP register Like Xu
2022-10-05 22:28   ` Sean Christopherson
2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 09/13] x86/pmu: Report SKIP when testing Intel LBR on AMD platforms Like Xu
2022-10-05 22:29   ` Sean Christopherson
2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 10/13] x86/pmu: Update testcases to cover Intel Arch PMU Version 1 Like Xu
2022-09-06  7:15   ` Sandipan Das
2022-09-06 13:28     ` Like Xu
2022-09-06  8:16   ` Sandipan Das
2022-09-06 13:35     ` Like Xu
2022-09-08  8:23       ` Sandipan Das
2022-09-19  7:09         ` Like Xu
2022-10-21  7:32           ` Like Xu
2022-10-25  8:34             ` Sandipan Das
2022-10-05 22:35   ` Sean Christopherson
2022-10-18  9:32     ` Like Xu
2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 11/13] x86/pmu: Refine message when testing PMU on AMD platforms Like Xu
2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 12/13] x86/pmu: Add assignment framework for Intel-specific HW resources Like Xu
2022-09-06  7:19   ` Sandipan Das
2022-10-05 22:44   ` Sean Christopherson
2022-10-21  7:21     ` Like Xu
2022-10-21 18:22       ` Sean Christopherson
2022-08-19 11:09 ` [kvm-unit-tests PATCH v3 13/13] x86/pmu: Update testcases to cover AMD PMU Like Xu
2022-09-06  7:32   ` Sandipan Das
2022-10-05 22:48   ` Sean Christopherson
