Re: [kvm-unit-tests PATCH v3 03/13] x86/pmu: Reset the expected count of the fixed counter 0 when i386

From: Sean Christopherson <seanjc@google.com>
To: Like Xu <like.xu.linux@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>, kvm@vger.kernel.org
Subject: Re: [kvm-unit-tests PATCH v3 03/13] x86/pmu: Reset the expected count of the fixed counter 0 when i386
Date: Fri, 21 Oct 2022 19:16:50 +0000
Message-ID: <Y1LwIgLp3gunyZ/j@google.com>
In-Reply-To: <2401d7da-9c71-4472-10b7-92f0a479ad50@gmail.com>

On Mon, Oct 17, 2022, Like Xu wrote:
> On 6/10/2022 6:18 am, Sean Christopherson wrote:
> > > ---
> > >   x86/pmu.c | 3 +++
> > >   1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/x86/pmu.c b/x86/pmu.c
> > > index 45ca2c6..057fd4a 100644
> > > --- a/x86/pmu.c
> > > +++ b/x86/pmu.c
> > > @@ -315,6 +315,9 @@ static void check_counter_overflow(void)
> > >   		if (i == nr_gp_counters) {
> > >   			cnt.ctr = fixed_events[0].unit_sel;
> > > +			__measure(&cnt, 0);
> > > +			count = cnt.count;
> > > +			cnt.count = 1 - count;
> > 
> > This definitely needs a comment.
> > 
> > Dumb question time: if the count is off by 2, why can't we just subtract 2?
> 
> More low-level code (bringing in differences between the 32-bit and 64-bit runtimes)
> being added would break this.
> 
> The test goal is simply to set the initial value of a counter to overflow,
> which is always off by 1, regardless of the involved rd/wrmsr or other
> execution details.

Oooh, I see what this code is doing.  But wouldn't it be better to offset from '0'?
E.g. if the measured workload is a single instruction, then the measured count
will be '1' and thus "1 - count" will be zero, meaning no overflow will occur.
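
A minimal sketch of that arithmetic, for illustration only.  The helper names
below are hypothetical and not part of x86/pmu.c, and the 48-bit width used in
the usage note is just an assumed counter width:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bit mask for a counter that is 'width' bits wide. */
static uint64_t counter_mask(unsigned int width)
{
	assert(width >= 1 && width <= 64);
	return width == 64 ? UINT64_MAX : (1ull << width) - 1;
}

/*
 * Hypothetical helper: compute "1 - count" in unsigned arithmetic and
 * truncate the result to the counter width, mirroring what the test does
 * before programming the counter.
 */
static uint64_t preset_one_minus_count(uint64_t count, unsigned int width)
{
	return (1 - count) & counter_mask(width);
}
```

With an assumed 48-bit counter, preset_one_minus_count(1, 48) is 0, i.e. the
counter starts at '0' and would need a full period to wrap, whereas a measured
count of 100 gives 2^48 - 99.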

Ah, but as per the SDM, the "+1" is needed to ensure the overflow is detected
immediately.

  Here, however, if an interrupt is to be generated after 100 event counts, the
  counter should be preset to minus 100 plus 1 (-100 + 1), or -99. The counter
  will then overflow after it counts 99 events and generate an interrupt on the
  next (100th) event counted. The difference of 1 for this count enables the
  interrupt to be generated immediately after the selected event count has been
  reached, instead of waiting for the overflow to propagate through the
  counter.
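
To make the "-100 + 1" example concrete, here is a rough software model of an
up-counter.  It only models the wrap to '0'; it says nothing about when real
hardware raises the interrupt.  Both function names are hypothetical and the
counter width is an assumption:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of a 'width'-bit up-counter: return how many events
 * are counted, starting from 'preset', before the value wraps to 0.
 */
static uint64_t events_until_wrap(uint64_t preset, unsigned int width)
{
	uint64_t mask, value, events = 0;

	assert(width >= 1 && width < 64);
	mask = (1ull << width) - 1;
	value = preset & mask;

	do {
		value = (value + 1) & mask;
		events++;
	} while (value != 0);

	return events;
}

/* Preset described in the quoted text: minus N plus 1, as an unsigned value. */
static uint64_t sdm_preset(uint64_t nr_events)
{
	return 1 - nr_events;
}
```

In this model, events_until_wrap(sdm_preset(100), 48) is 99, matching "the
counter will then overflow after it counts 99 events", while a plain -100
preset would wrap only on the 100th event.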

What about adding a helper to measure/compute the overflow preset value?  That
would provide a convenient location to document the (IMO) weird behavior that's
necessary to ensure immediate event delivery.  E.g.

---
 x86/pmu.c | 38 +++++++++++++++++++++++++-------------
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index f891053f..a38ae3f6 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -325,16 +325,30 @@ static void check_counters_many(void)
 	report(i == n, "all counters");
 }
 
-static void check_counter_overflow(void)
+static uint64_t measure_for_overflow(pmu_counter_t *cnt)
 {
-	uint64_t count;
-	int i;
-	pmu_counter_t cnt = {
-		.ctr = gp_counter_base,
-		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel /* instructions */,
-	};
-	__measure(&cnt, 0);
-	count = cnt.count;
+	__measure(cnt, 0);
+
+	/*
+	 * To generate overflow, i.e. roll over to '0', the initial count just
+	 * needs to be preset to the negative expected count.  However, as per
+	 * Intel's SDM, the preset count needs to be incremented by 1 to ensure
+	 * the overflow interrupt is generated immediately instead of possibly
+	 * waiting for the overflow to propagate through the counter.
+	 */
+	assert(cnt->count > 1);
+	return 1 - cnt->count;
+}
+
+static void check_counter_overflow(void)
+{
+	uint64_t overflow_preset;
+	int i;
+	pmu_counter_t cnt = {
+		.ctr = gp_counter_base,
+		.config = EVNTSEL_OS | EVNTSEL_USR | gp_events[1].unit_sel /* instructions */,
+	};
+	overflow_preset = measure_for_overflow(&cnt);
 
 	/* clear status before test */
 	if (pmu_version() > 1) {
@@ -349,7 +363,7 @@ static void check_counter_overflow(void)
 		int idx;
 
 		cnt.ctr = get_gp_counter_msr(i);
-		cnt.count = 1 - count;
+		cnt.count = overflow_preset;
 		if (gp_counter_base == MSR_IA32_PMC0)
 			cnt.count &= (1ull << pmu_gp_counter_width()) - 1;
 
@@ -358,9 +372,7 @@ static void check_counter_overflow(void)
 				break;
 
 			cnt.ctr = fixed_events[0].unit_sel;
-			__measure(&cnt, 0);
-			count = cnt.count;
-			cnt.count = 1 - count;
+			cnt.count = measure_for_overflow(&cnt);
 			cnt.count &= (1ull << pmu_fixed_counter_width()) - 1;
 		}
 

base-commit: c3e384a2268baed99d4b59dd239c98bd6a5471eb
--