KVM ARM Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v2 0/5] KVM: arm64: Assorted PMU emulation fixes
@ 2019-10-08 16:01 Marc Zyngier
  2019-10-08 16:01 ` [PATCH v2 1/5] KVM: arm64: pmu: Fix cycle counter truncation Marc Zyngier
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Marc Zyngier @ 2019-10-08 16:01 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Will Deacon

I recently came across a number of PMU emulation bugs, all which can
result in unexpected behaviours in an unsuspecting guest. The first
two patches already have been discussed on the list, but I'm including
them here as part of a slightly longer series.

The third patch is new as of v2, and fixes a bug preventing chained
events from ever being used.

The fourth patch is also new as of v2, and is an arm64 PMU change for
which I clearly don't know what I'm doing. I'd appreciate some
guidance from Will or Mark.

The last patch fixes an issue that has been here from day one, where
we confuse architectural overflow of a counter and perf sampling
period, and uses patch #4 to fix the issue.

I'l planning to send patches 1 through to 3 as fixes shortly, but I
expect the last two patches to require more discussions.

Marc Zyngier (5):
  KVM: arm64: pmu: Fix cycle counter truncation
  arm64: KVM: Handle PMCR_EL0.LC as RES1 on pure AArch64 systems
  KVM: arm64: pmu: Set the CHAINED attribute before creating the
    in-kernel event
  arm64: perf: Add reload-on-overflow capability
  KVM: arm64: pmu: Reset sample period on overflow handling

 arch/arm64/include/asm/perf_event.h |  4 +++
 arch/arm64/kernel/perf_event.c      |  8 ++++-
 arch/arm64/kvm/sys_regs.c           |  4 +++
 virt/kvm/arm/pmu.c                  | 45 +++++++++++++++++++----------
 4 files changed, 45 insertions(+), 16 deletions(-)

-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/5] KVM: arm64: pmu: Fix cycle counter truncation
  2019-10-08 16:01 [PATCH v2 0/5] KVM: arm64: Assorted PMU emulation fixes Marc Zyngier
@ 2019-10-08 16:01 ` Marc Zyngier
  2019-10-08 16:01 ` [PATCH v2 2/5] arm64: KVM: Handle PMCR_EL0.LC as RES1 on pure AArch64 systems Marc Zyngier
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Marc Zyngier @ 2019-10-08 16:01 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Will Deacon

When a counter is disabled, its value is sampled before the event
is being disabled, and the value written back in the shadow register.

In that process, the value gets truncated to 32bit, which is adequate
for any counter but the cycle counter (defined as a 64bit counter).

This obviously results in a corrupted counter, and things like
"perf record -e cycles" not working at all when run in a guest...
A similar, but less critical bug exists in kvm_pmu_get_counter_value.

Make the truncation conditional on the counter not being the cycle
counter, which results in a minor code reorganisation.

Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters")
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Reported-by: Julien Thierry <julien.thierry.kdev@gmail.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 virt/kvm/arm/pmu.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
index 362a01886bab..c30c3a74fc7f 100644
--- a/virt/kvm/arm/pmu.c
+++ b/virt/kvm/arm/pmu.c
@@ -146,8 +146,7 @@ u64 kvm_pmu_get_counter_value(struct kvm_vcpu *vcpu, u64 select_idx)
 	if (kvm_pmu_pmc_is_chained(pmc) &&
 	    kvm_pmu_idx_is_high_counter(select_idx))
 		counter = upper_32_bits(counter);
-
-	else if (!kvm_pmu_idx_is_64bit(vcpu, select_idx))
+	else if (select_idx != ARMV8_PMU_CYCLE_IDX)
 		counter = lower_32_bits(counter);
 
 	return counter;
@@ -193,7 +192,7 @@ static void kvm_pmu_release_perf_event(struct kvm_pmc *pmc)
  */
 static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, struct kvm_pmc *pmc)
 {
-	u64 counter, reg;
+	u64 counter, reg, val;
 
 	pmc = kvm_pmu_get_canonical_pmc(pmc);
 	if (!pmc->perf_event)
@@ -201,16 +200,19 @@ static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, struct kvm_pmc *pmc)
 
 	counter = kvm_pmu_get_pair_counter_value(vcpu, pmc);
 
-	if (kvm_pmu_pmc_is_chained(pmc)) {
-		reg = PMEVCNTR0_EL0 + pmc->idx;
-		__vcpu_sys_reg(vcpu, reg) = lower_32_bits(counter);
-		__vcpu_sys_reg(vcpu, reg + 1) = upper_32_bits(counter);
+	if (pmc->idx == ARMV8_PMU_CYCLE_IDX) {
+		reg = PMCCNTR_EL0;
+		val = counter;
 	} else {
-		reg = (pmc->idx == ARMV8_PMU_CYCLE_IDX)
-		       ? PMCCNTR_EL0 : PMEVCNTR0_EL0 + pmc->idx;
-		__vcpu_sys_reg(vcpu, reg) = lower_32_bits(counter);
+		reg = PMEVCNTR0_EL0 + pmc->idx;
+		val = lower_32_bits(counter);
 	}
 
+	__vcpu_sys_reg(vcpu, reg) = val;
+
+	if (kvm_pmu_pmc_is_chained(pmc))
+		__vcpu_sys_reg(vcpu, reg + 1) = upper_32_bits(counter);
+
 	kvm_pmu_release_perf_event(pmc);
 }
 
-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 2/5] arm64: KVM: Handle PMCR_EL0.LC as RES1 on pure AArch64 systems
  2019-10-08 16:01 [PATCH v2 0/5] KVM: arm64: Assorted PMU emulation fixes Marc Zyngier
  2019-10-08 16:01 ` [PATCH v2 1/5] KVM: arm64: pmu: Fix cycle counter truncation Marc Zyngier
@ 2019-10-08 16:01 ` Marc Zyngier
  2019-10-08 16:01 ` [PATCH v2 3/5] KVM: arm64: pmu: Set the CHAINED attribute before creating the in-kernel event Marc Zyngier
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Marc Zyngier @ 2019-10-08 16:01 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Will Deacon

Of PMCR_EL0.LC, the ARMv8 ARM says:

	"In an AArch64 only implementation, this field is RES 1."

So be it.

Fixes: ab9468340d2bc ("arm64: KVM: Add access handler for PMCR register")
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/kvm/sys_regs.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 2071260a275b..46822afc57e0 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -632,6 +632,8 @@ static void reset_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
 	 */
 	val = ((pmcr & ~ARMV8_PMU_PMCR_MASK)
 	       | (ARMV8_PMU_PMCR_MASK & 0xdecafbad)) & (~ARMV8_PMU_PMCR_E);
+	if (!system_supports_32bit_el0())
+		val |= ARMV8_PMU_PMCR_LC;
 	__vcpu_sys_reg(vcpu, r->reg) = val;
 }
 
@@ -682,6 +684,8 @@ static bool access_pmcr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
 		val = __vcpu_sys_reg(vcpu, PMCR_EL0);
 		val &= ~ARMV8_PMU_PMCR_MASK;
 		val |= p->regval & ARMV8_PMU_PMCR_MASK;
+		if (!system_supports_32bit_el0())
+			val |= ARMV8_PMU_PMCR_LC;
 		__vcpu_sys_reg(vcpu, PMCR_EL0) = val;
 		kvm_pmu_handle_pmcr(vcpu, val);
 		kvm_vcpu_pmu_restore_guest(vcpu);
-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 3/5] KVM: arm64: pmu: Set the CHAINED attribute before creating the in-kernel event
  2019-10-08 16:01 [PATCH v2 0/5] KVM: arm64: Assorted PMU emulation fixes Marc Zyngier
  2019-10-08 16:01 ` [PATCH v2 1/5] KVM: arm64: pmu: Fix cycle counter truncation Marc Zyngier
  2019-10-08 16:01 ` [PATCH v2 2/5] arm64: KVM: Handle PMCR_EL0.LC as RES1 on pure AArch64 systems Marc Zyngier
@ 2019-10-08 16:01 ` Marc Zyngier
  2019-10-08 19:22   ` Andrew Murray
  2019-10-08 16:01 ` [PATCH v2 4/5] arm64: perf: Add reload-on-overflow capability Marc Zyngier
  2019-10-08 16:01 ` [PATCH v2 5/5] KVM: arm64: pmu: Reset sample period on overflow handling Marc Zyngier
  4 siblings, 1 reply; 12+ messages in thread
From: Marc Zyngier @ 2019-10-08 16:01 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Will Deacon

The current convention for KVM to request a chained event from the
host PMU is to set bit[0] in attr.config1 (PERF_ATTR_CFG1_KVM_PMU_CHAINED).

But as it turns out, this bit gets set *after* we create the kernel
event that backs our virtual counter, meaning that we never get
a 64bit counter.

Moving the setting to an earlier point solves the problem.

Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters")
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 virt/kvm/arm/pmu.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
index c30c3a74fc7f..f291d4ac3519 100644
--- a/virt/kvm/arm/pmu.c
+++ b/virt/kvm/arm/pmu.c
@@ -569,12 +569,12 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
 		 * high counter.
 		 */
 		attr.sample_period = (-counter) & GENMASK(63, 0);
+		if (kvm_pmu_counter_is_enabled(vcpu, pmc->idx + 1))
+			attr.config1 |= PERF_ATTR_CFG1_KVM_PMU_CHAINED;
+
 		event = perf_event_create_kernel_counter(&attr, -1, current,
 							 kvm_pmu_perf_overflow,
 							 pmc + 1);
-
-		if (kvm_pmu_counter_is_enabled(vcpu, pmc->idx + 1))
-			attr.config1 |= PERF_ATTR_CFG1_KVM_PMU_CHAINED;
 	} else {
 		/* The initial sample period (overflow count) of an event. */
 		if (kvm_pmu_idx_is_64bit(vcpu, pmc->idx))
-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 4/5] arm64: perf: Add reload-on-overflow capability
  2019-10-08 16:01 [PATCH v2 0/5] KVM: arm64: Assorted PMU emulation fixes Marc Zyngier
                   ` (2 preceding siblings ...)
  2019-10-08 16:01 ` [PATCH v2 3/5] KVM: arm64: pmu: Set the CHAINED attribute before creating the in-kernel event Marc Zyngier
@ 2019-10-08 16:01 ` Marc Zyngier
  2019-10-08 17:55   ` Marc Zyngier
  2019-10-08 19:52   ` Andrew Murray
  2019-10-08 16:01 ` [PATCH v2 5/5] KVM: arm64: pmu: Reset sample period on overflow handling Marc Zyngier
  4 siblings, 2 replies; 12+ messages in thread
From: Marc Zyngier @ 2019-10-08 16:01 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Will Deacon

As KVM uses perf as a way to emulate an ARMv8 PMU, it needs to
be able to change the sample period as part of the overflow
handling (once an overflow has taken place, the following
overflow point is the overflow of the virtual counter).

Deleting and recreating the in-kernel event is difficult, as
we're in interrupt context. Instead, we can teach the PMU driver
a new trick, which is to stop the event before the overflow handling,
and reprogram it once it has been handled. This would give KVM
the opportunity to adjust the next sample period. This feature
is gated on a new flag that can get set by KVM in a subsequent
patch.

Whilst we're at it, move the CHAINED flag from the KVM emulation
to the perf_event.h file and adjust the PMU code accordingly.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/include/asm/perf_event.h | 4 ++++
 arch/arm64/kernel/perf_event.c      | 8 +++++++-
 virt/kvm/arm/pmu.c                  | 4 +---
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index 2bdbc79bbd01..8b6b38f2db8e 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -223,4 +223,8 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
 	(regs)->pstate = PSR_MODE_EL1h;	\
 }
 
+/* Flags used by KVM, among others */
+#define PERF_ATTR_CFG1_CHAINED_EVENT	(1U << 0)
+#define PERF_ATTR_CFG1_RELOAD_EVENT	(1U << 1)
+
 #endif
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index a0b4f1bca491..98907c9e5508 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -322,7 +322,7 @@ PMU_FORMAT_ATTR(long, "config1:0");
 
 static inline bool armv8pmu_event_is_64bit(struct perf_event *event)
 {
-	return event->attr.config1 & 0x1;
+	return event->attr.config1 & PERF_ATTR_CFG1_CHAINED_EVENT;
 }
 
 static struct attribute *armv8_pmuv3_format_attrs[] = {
@@ -736,8 +736,14 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
 		if (!armpmu_event_set_period(event))
 			continue;
 
+		if (event->attr.config1 & PERF_ATTR_CFG1_RELOAD_EVENT)
+			cpu_pmu->pmu.stop(event, PERF_EF_RELOAD);
+
 		if (perf_event_overflow(event, &data, regs))
 			cpu_pmu->disable(event);
+
+		if (event->attr.config1 & PERF_ATTR_CFG1_RELOAD_EVENT)
+			cpu_pmu->pmu.start(event, PERF_EF_RELOAD);
 	}
 	armv8pmu_start(cpu_pmu);
 
diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
index f291d4ac3519..25a483a04beb 100644
--- a/virt/kvm/arm/pmu.c
+++ b/virt/kvm/arm/pmu.c
@@ -15,8 +15,6 @@
 
 static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx);
 
-#define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
-
 /**
  * kvm_pmu_idx_is_64bit - determine if select_idx is a 64bit counter
  * @vcpu: The vcpu pointer
@@ -570,7 +568,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
 		 */
 		attr.sample_period = (-counter) & GENMASK(63, 0);
 		if (kvm_pmu_counter_is_enabled(vcpu, pmc->idx + 1))
-			attr.config1 |= PERF_ATTR_CFG1_KVM_PMU_CHAINED;
+			attr.config1 |= PERF_ATTR_CFG1_CHAINED_EVENT;
 
 		event = perf_event_create_kernel_counter(&attr, -1, current,
 							 kvm_pmu_perf_overflow,
-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 5/5] KVM: arm64: pmu: Reset sample period on overflow handling
  2019-10-08 16:01 [PATCH v2 0/5] KVM: arm64: Assorted PMU emulation fixes Marc Zyngier
                   ` (3 preceding siblings ...)
  2019-10-08 16:01 ` [PATCH v2 4/5] arm64: perf: Add reload-on-overflow capability Marc Zyngier
@ 2019-10-08 16:01 ` Marc Zyngier
  2019-10-08 22:42   ` Andrew Murray
  4 siblings, 1 reply; 12+ messages in thread
From: Marc Zyngier @ 2019-10-08 16:01 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Will Deacon

The PMU emulation code uses the perf event sample period to trigger
the overflow detection. This works fine  for the *first* overflow
handling, but results in a huge number of interrupts on the host,
unrelated to the number of interrupts handled in the guest (a x20
factor is pretty common for the cycle counter). On a slow system
(such as a SW model), this can result in the guest only making
forward progress at a glacial pace.

It turns out that the clue is in the name. The sample period is
exactly that: a period. And once the an overflow has occured,
the following period should be the full width of the associated
counter, instead of whatever the guest had initially programed.

Reset the sample period to the architected value in the overflow
handler, which now results in a number of host interrupts that is
much closer to the number of interrupts in the guest.

Fixes: b02386eb7dac ("arm64: KVM: Add PMU overflow interrupt routing")
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 virt/kvm/arm/pmu.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
index 25a483a04beb..8b524d74c68a 100644
--- a/virt/kvm/arm/pmu.c
+++ b/virt/kvm/arm/pmu.c
@@ -442,6 +442,20 @@ static void kvm_pmu_perf_overflow(struct perf_event *perf_event,
 	struct kvm_pmc *pmc = perf_event->overflow_handler_context;
 	struct kvm_vcpu *vcpu = kvm_pmc_to_vcpu(pmc);
 	int idx = pmc->idx;
+	u64 period;
+
+	/*
+	 * Reset the sample period to the architectural limit,
+	 * i.e. the point where the counter overflows.
+	 */
+	period = -(local64_read(&pmc->perf_event->count));
+
+	if (!kvm_pmu_idx_is_64bit(vcpu, pmc->idx))
+		period &= GENMASK(31, 0);
+
+	local64_set(&pmc->perf_event->hw.period_left, 0);
+	pmc->perf_event->attr.sample_period = period;
+	pmc->perf_event->hw.sample_period = period;
 
 	__vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(idx);
 
@@ -557,6 +571,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
 	attr.exclude_host = 1; /* Don't count host events */
 	attr.config = (pmc->idx == ARMV8_PMU_CYCLE_IDX) ?
 		ARMV8_PMUV3_PERFCTR_CPU_CYCLES : eventsel;
+	attr.config1 = PERF_ATTR_CFG1_RELOAD_EVENT;
 
 	counter = kvm_pmu_get_pair_counter_value(vcpu, pmc);
 
-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 4/5] arm64: perf: Add reload-on-overflow capability
  2019-10-08 16:01 ` [PATCH v2 4/5] arm64: perf: Add reload-on-overflow capability Marc Zyngier
@ 2019-10-08 17:55   ` Marc Zyngier
  2019-10-08 19:52   ` Andrew Murray
  1 sibling, 0 replies; 12+ messages in thread
From: Marc Zyngier @ 2019-10-08 17:55 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Will Deacon

On Tue,  8 Oct 2019 17:01:27 +0100
Marc Zyngier <maz@kernel.org> wrote:

> As KVM uses perf as a way to emulate an ARMv8 PMU, it needs to
> be able to change the sample period as part of the overflow
> handling (once an overflow has taken place, the following
> overflow point is the overflow of the virtual counter).
> 
> Deleting and recreating the in-kernel event is difficult, as
> we're in interrupt context. Instead, we can teach the PMU driver
> a new trick, which is to stop the event before the overflow handling,
> and reprogram it once it has been handled. This would give KVM
> the opportunity to adjust the next sample period. This feature
> is gated on a new flag that can get set by KVM in a subsequent
> patch.
> 
> Whilst we're at it, move the CHAINED flag from the KVM emulation
> to the perf_event.h file and adjust the PMU code accordingly.
> 
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
>  arch/arm64/include/asm/perf_event.h | 4 ++++
>  arch/arm64/kernel/perf_event.c      | 8 +++++++-
>  virt/kvm/arm/pmu.c                  | 4 +---
>  3 files changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
> index 2bdbc79bbd01..8b6b38f2db8e 100644
> --- a/arch/arm64/include/asm/perf_event.h
> +++ b/arch/arm64/include/asm/perf_event.h
> @@ -223,4 +223,8 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
>  	(regs)->pstate = PSR_MODE_EL1h;	\
>  }
>  
> +/* Flags used by KVM, among others */
> +#define PERF_ATTR_CFG1_CHAINED_EVENT	(1U << 0)
> +#define PERF_ATTR_CFG1_RELOAD_EVENT	(1U << 1)
> +
>  #endif
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index a0b4f1bca491..98907c9e5508 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -322,7 +322,7 @@ PMU_FORMAT_ATTR(long, "config1:0");
>  
>  static inline bool armv8pmu_event_is_64bit(struct perf_event *event)
>  {
> -	return event->attr.config1 & 0x1;
> +	return event->attr.config1 & PERF_ATTR_CFG1_CHAINED_EVENT;
>  }
>  
>  static struct attribute *armv8_pmuv3_format_attrs[] = {
> @@ -736,8 +736,14 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
>  		if (!armpmu_event_set_period(event))
>  			continue;
>  
> +		if (event->attr.config1 & PERF_ATTR_CFG1_RELOAD_EVENT)
> +			cpu_pmu->pmu.stop(event, PERF_EF_RELOAD);
> +

Actually, I just realized that there is probably no need for this patch
as a standalone change. I can perfectly fold the stop() and start()
calls into the last patch, as part of the overflow handler.

The question is still whether that's a good idea or not.

Thanks,

	M.


>  		if (perf_event_overflow(event, &data, regs))
>  			cpu_pmu->disable(event);
> +
> +		if (event->attr.config1 & PERF_ATTR_CFG1_RELOAD_EVENT)
> +			cpu_pmu->pmu.start(event, PERF_EF_RELOAD);
>  	}
>  	armv8pmu_start(cpu_pmu);
>  
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> index f291d4ac3519..25a483a04beb 100644
> --- a/virt/kvm/arm/pmu.c
> +++ b/virt/kvm/arm/pmu.c
> @@ -15,8 +15,6 @@
>  
>  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx);
>  
> -#define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
> -
>  /**
>   * kvm_pmu_idx_is_64bit - determine if select_idx is a 64bit counter
>   * @vcpu: The vcpu pointer
> @@ -570,7 +568,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
>  		 */
>  		attr.sample_period = (-counter) & GENMASK(63, 0);
>  		if (kvm_pmu_counter_is_enabled(vcpu, pmc->idx + 1))
> -			attr.config1 |= PERF_ATTR_CFG1_KVM_PMU_CHAINED;
> +			attr.config1 |= PERF_ATTR_CFG1_CHAINED_EVENT;
>  
>  		event = perf_event_create_kernel_counter(&attr, -1, current,
>  							 kvm_pmu_perf_overflow,



-- 
Jazz is not dead. It just smells funny...
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 3/5] KVM: arm64: pmu: Set the CHAINED attribute before creating the in-kernel event
  2019-10-08 16:01 ` [PATCH v2 3/5] KVM: arm64: pmu: Set the CHAINED attribute before creating the in-kernel event Marc Zyngier
@ 2019-10-08 19:22   ` Andrew Murray
  0 siblings, 0 replies; 12+ messages in thread
From: Andrew Murray @ 2019-10-08 19:22 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Oct 08, 2019 at 05:01:26PM +0100, Marc Zyngier wrote:
> The current convention for KVM to request a chained event from the
> host PMU is to set bit[0] in attr.config1 (PERF_ATTR_CFG1_KVM_PMU_CHAINED).
> 
> But as it turns out, this bit gets set *after* we create the kernel
> event that backs our virtual counter, meaning that we never get
> a 64bit counter.
> 
> Moving the setting to an earlier point solves the problem.
> 
> Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters")
> Signed-off-by: Marc Zyngier <maz@kernel.org>

Reviewed-by: Andrew Murray <andrew.murray@arm.com>

> ---
>  virt/kvm/arm/pmu.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> index c30c3a74fc7f..f291d4ac3519 100644
> --- a/virt/kvm/arm/pmu.c
> +++ b/virt/kvm/arm/pmu.c
> @@ -569,12 +569,12 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
>  		 * high counter.
>  		 */
>  		attr.sample_period = (-counter) & GENMASK(63, 0);
> +		if (kvm_pmu_counter_is_enabled(vcpu, pmc->idx + 1))
> +			attr.config1 |= PERF_ATTR_CFG1_KVM_PMU_CHAINED;
> +
>  		event = perf_event_create_kernel_counter(&attr, -1, current,
>  							 kvm_pmu_perf_overflow,
>  							 pmc + 1);
> -
> -		if (kvm_pmu_counter_is_enabled(vcpu, pmc->idx + 1))
> -			attr.config1 |= PERF_ATTR_CFG1_KVM_PMU_CHAINED;
>  	} else {
>  		/* The initial sample period (overflow count) of an event. */
>  		if (kvm_pmu_idx_is_64bit(vcpu, pmc->idx))
> -- 
> 2.20.1
> 
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 4/5] arm64: perf: Add reload-on-overflow capability
  2019-10-08 16:01 ` [PATCH v2 4/5] arm64: perf: Add reload-on-overflow capability Marc Zyngier
  2019-10-08 17:55   ` Marc Zyngier
@ 2019-10-08 19:52   ` Andrew Murray
  1 sibling, 0 replies; 12+ messages in thread
From: Andrew Murray @ 2019-10-08 19:52 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Oct 08, 2019 at 05:01:27PM +0100, Marc Zyngier wrote:
> As KVM uses perf as a way to emulate an ARMv8 PMU, it needs to
> be able to change the sample period as part of the overflow
> handling (once an overflow has taken place, the following
> overflow point is the overflow of the virtual counter).
> 
> Deleting and recreating the in-kernel event is difficult, as
> we're in interrupt context. Instead, we can teach the PMU driver
> a new trick, which is to stop the event before the overflow handling,
> and reprogram it once it has been handled. This would give KVM
> the opportunity to adjust the next sample period. This feature
> is gated on a new flag that can get set by KVM in a subsequent
> patch.
> 
> Whilst we're at it, move the CHAINED flag from the KVM emulation
> to the perf_event.h file and adjust the PMU code accordingly.
> 
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
>  arch/arm64/include/asm/perf_event.h | 4 ++++
>  arch/arm64/kernel/perf_event.c      | 8 +++++++-
>  virt/kvm/arm/pmu.c                  | 4 +---
>  3 files changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
> index 2bdbc79bbd01..8b6b38f2db8e 100644
> --- a/arch/arm64/include/asm/perf_event.h
> +++ b/arch/arm64/include/asm/perf_event.h
> @@ -223,4 +223,8 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
>  	(regs)->pstate = PSR_MODE_EL1h;	\
>  }
>  
> +/* Flags used by KVM, among others */
> +#define PERF_ATTR_CFG1_CHAINED_EVENT	(1U << 0)
> +#define PERF_ATTR_CFG1_RELOAD_EVENT	(1U << 1)
> +
>  #endif
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index a0b4f1bca491..98907c9e5508 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -322,7 +322,7 @@ PMU_FORMAT_ATTR(long, "config1:0");
>  
>  static inline bool armv8pmu_event_is_64bit(struct perf_event *event)
>  {
> -	return event->attr.config1 & 0x1;
> +	return event->attr.config1 & PERF_ATTR_CFG1_CHAINED_EVENT;

I'm pleased to see this be replaced with a define, it helps readers see the
link between this and the KVM driver.

>  }
>  
>  static struct attribute *armv8_pmuv3_format_attrs[] = {
> @@ -736,8 +736,14 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
>  		if (!armpmu_event_set_period(event))
>  			continue;
>  
> +		if (event->attr.config1 & PERF_ATTR_CFG1_RELOAD_EVENT)
> +			cpu_pmu->pmu.stop(event, PERF_EF_RELOAD);

I believe PERF_EF_RELOAD is only intended to be used in the stop calls. I'd
suggest that you replace it with PERF_EF_UPDATE instead, this tells the PMU
to update the counter with the latest value from the hardware. (Though the
ARM PMU driver always does this regardless to the flag anyway).

Thanks,

Andrew Murray

> +
>  		if (perf_event_overflow(event, &data, regs))
>  			cpu_pmu->disable(event);
> +
> +		if (event->attr.config1 & PERF_ATTR_CFG1_RELOAD_EVENT)
> +			cpu_pmu->pmu.start(event, PERF_EF_RELOAD);
>  	}
>  	armv8pmu_start(cpu_pmu);
>  
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> index f291d4ac3519..25a483a04beb 100644
> --- a/virt/kvm/arm/pmu.c
> +++ b/virt/kvm/arm/pmu.c
> @@ -15,8 +15,6 @@
>  
>  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx);
>  
> -#define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
> -
>  /**
>   * kvm_pmu_idx_is_64bit - determine if select_idx is a 64bit counter
>   * @vcpu: The vcpu pointer
> @@ -570,7 +568,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
>  		 */
>  		attr.sample_period = (-counter) & GENMASK(63, 0);
>  		if (kvm_pmu_counter_is_enabled(vcpu, pmc->idx + 1))
> -			attr.config1 |= PERF_ATTR_CFG1_KVM_PMU_CHAINED;
> +			attr.config1 |= PERF_ATTR_CFG1_CHAINED_EVENT;
>  
>  		event = perf_event_create_kernel_counter(&attr, -1, current,
>  							 kvm_pmu_perf_overflow,
> -- 
> 2.20.1
> 
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 5/5] KVM: arm64: pmu: Reset sample period on overflow handling
  2019-10-08 16:01 ` [PATCH v2 5/5] KVM: arm64: pmu: Reset sample period on overflow handling Marc Zyngier
@ 2019-10-08 22:42   ` Andrew Murray
  2019-10-11 11:28     ` Marc Zyngier
  0 siblings, 1 reply; 12+ messages in thread
From: Andrew Murray @ 2019-10-08 22:42 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Oct 08, 2019 at 05:01:28PM +0100, Marc Zyngier wrote:
> The PMU emulation code uses the perf event sample period to trigger
> the overflow detection. This works fine  for the *first* overflow
> handling, but results in a huge number of interrupts on the host,
> unrelated to the number of interrupts handled in the guest (a x20
> factor is pretty common for the cycle counter). On a slow system
> (such as a SW model), this can result in the guest only making
> forward progress at a glacial pace.
> 
> It turns out that the clue is in the name. The sample period is
> exactly that: a period. And once the an overflow has occured,
> the following period should be the full width of the associated
> counter, instead of whatever the guest had initially programed.
> 
> Reset the sample period to the architected value in the overflow
> handler, which now results in a number of host interrupts that is
> much closer to the number of interrupts in the guest.
> 
> Fixes: b02386eb7dac ("arm64: KVM: Add PMU overflow interrupt routing")
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
>  virt/kvm/arm/pmu.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> index 25a483a04beb..8b524d74c68a 100644
> --- a/virt/kvm/arm/pmu.c
> +++ b/virt/kvm/arm/pmu.c
> @@ -442,6 +442,20 @@ static void kvm_pmu_perf_overflow(struct perf_event *perf_event,
>  	struct kvm_pmc *pmc = perf_event->overflow_handler_context;
>  	struct kvm_vcpu *vcpu = kvm_pmc_to_vcpu(pmc);
>  	int idx = pmc->idx;
> +	u64 period;
> +
> +	/*
> +	 * Reset the sample period to the architectural limit,
> +	 * i.e. the point where the counter overflows.
> +	 */
> +	period = -(local64_read(&pmc->perf_event->count));
> +
> +	if (!kvm_pmu_idx_is_64bit(vcpu, pmc->idx))
> +		period &= GENMASK(31, 0);
> +
> +	local64_set(&pmc->perf_event->hw.period_left, 0);
> +	pmc->perf_event->attr.sample_period = period;
> +	pmc->perf_event->hw.sample_period = period;

I believe that above, you are reducing the period by the amount period_left
would have been - they cancel each other out.

Given that kvm_pmu_perf_overflow is now always called between a
cpu_pmu->pmu.stop and a cpu_pmu->pmu.start, it means armpmu_event_update
has been called prior to this function, and armpmu_event_set_period will
be called after...

Therefore, I think the above could be reduced to:

+	/*
+	 * Reset the sample period to the architectural limit,
+	 * i.e. the point where the counter overflows.
+	 */
+	u64 period = GENMASK(63, 0);
+	if (!kvm_pmu_idx_is_64bit(vcpu, pmc->idx))
+		period = GENMASK(31, 0);
+
+	pmc->perf_event->attr.sample_period = period;
+	pmc->perf_event->hw.sample_period = period;

This is because armpmu_event_set_period takes into account the overflow
and the counter wrapping via the "if (unlikely(left <= 0)) {" block.

Though this code confuses me easily, so I may be talking rubbish.

>  
>  	__vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(idx);
>  
> @@ -557,6 +571,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
>  	attr.exclude_host = 1; /* Don't count host events */
>  	attr.config = (pmc->idx == ARMV8_PMU_CYCLE_IDX) ?
>  		ARMV8_PMUV3_PERFCTR_CPU_CYCLES : eventsel;
> +	attr.config1 = PERF_ATTR_CFG1_RELOAD_EVENT;

I'm not sure that this flag, or patch 4 is really needed. As the perf
events created by KVM are pinned to the task and exclude_(host,hv) are set -
I think the perf event is not active at this point. Therefore if you change
the sample period, you can wait until the perf event gets scheduled back in
(when you return to the guest) where it's call to pmu.start will result in
armpmu_event_set_period being called. In other words the pmu.start and
pmu.stop you add in patch 4 is effectively being done for you by perf when
the KVM task is switched out.

I'd be interested to see if the following works:

+	WARN_ON(pmc->perf_event->state == PERF_EVENT_STATE_ACTIVE)
+
+	/*
+	 * Reset the sample period to the architectural limit,
+	 * i.e. the point where the counter overflows.
+	 */
+	u64 period = GENMASK(63, 0);
+	if (!kvm_pmu_idx_is_64bit(vcpu, pmc->idx))
+		period = GENMASK(31, 0);
+
+	pmc->perf_event->attr.sample_period = period;
+	pmc->perf_event->hw.sample_period = period;

>  
>  	counter = kvm_pmu_get_pair_counter_value(vcpu, pmc);
>  

What about ARM 32 bit support for this?

Thanks,

Andrew Murray

> -- 
> 2.20.1
> 
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 5/5] KVM: arm64: pmu: Reset sample period on overflow handling
  2019-10-08 22:42   ` Andrew Murray
@ 2019-10-11 11:28     ` Marc Zyngier
  2019-10-11 11:41       ` Andrew Murray
  0 siblings, 1 reply; 12+ messages in thread
From: Marc Zyngier @ 2019-10-11 11:28 UTC (permalink / raw)
  To: Andrew Murray; +Cc: kvm, Will Deacon, kvmarm, linux-arm-kernel

On Tue, 8 Oct 2019 23:42:22 +0100
Andrew Murray <andrew.murray@arm.com> wrote:

> On Tue, Oct 08, 2019 at 05:01:28PM +0100, Marc Zyngier wrote:
> > The PMU emulation code uses the perf event sample period to trigger
> > the overflow detection. This works fine  for the *first* overflow
> > handling, but results in a huge number of interrupts on the host,
> > unrelated to the number of interrupts handled in the guest (a x20
> > factor is pretty common for the cycle counter). On a slow system
> > (such as a SW model), this can result in the guest only making
> > forward progress at a glacial pace.
> > 
> > It turns out that the clue is in the name. The sample period is
> > exactly that: a period. And once the an overflow has occured,
> > the following period should be the full width of the associated
> > counter, instead of whatever the guest had initially programed.
> > 
> > Reset the sample period to the architected value in the overflow
> > handler, which now results in a number of host interrupts that is
> > much closer to the number of interrupts in the guest.
> > 
> > Fixes: b02386eb7dac ("arm64: KVM: Add PMU overflow interrupt routing")
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > ---
> >  virt/kvm/arm/pmu.c | 15 +++++++++++++++
> >  1 file changed, 15 insertions(+)
> > 
> > diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> > index 25a483a04beb..8b524d74c68a 100644
> > --- a/virt/kvm/arm/pmu.c
> > +++ b/virt/kvm/arm/pmu.c
> > @@ -442,6 +442,20 @@ static void kvm_pmu_perf_overflow(struct perf_event *perf_event,
> >  	struct kvm_pmc *pmc = perf_event->overflow_handler_context;
> >  	struct kvm_vcpu *vcpu = kvm_pmc_to_vcpu(pmc);
> >  	int idx = pmc->idx;
> > +	u64 period;
> > +
> > +	/*
> > +	 * Reset the sample period to the architectural limit,
> > +	 * i.e. the point where the counter overflows.
> > +	 */
> > +	period = -(local64_read(&pmc->perf_event->count));
> > +
> > +	if (!kvm_pmu_idx_is_64bit(vcpu, pmc->idx))
> > +		period &= GENMASK(31, 0);
> > +
> > +	local64_set(&pmc->perf_event->hw.period_left, 0);
> > +	pmc->perf_event->attr.sample_period = period;
> > +	pmc->perf_event->hw.sample_period = period;  
> 
> I believe that above, you are reducing the period by the amount period_left
> would have been - they cancel each other out.

That's not what I see happening, having put some traces:

 kvm_pmu_perf_overflow: count = 308 left = 129
 kvm_pmu_perf_overflow: count = 409 left = 47
 kvm_pmu_perf_overflow: count = 585 left = 223
 kvm_pmu_perf_overflow: count = 775 left = 413
 kvm_pmu_perf_overflow: count = 1368 left = 986
 kvm_pmu_perf_overflow: count = 2086 left = 1716
 kvm_pmu_perf_overflow: count = 958 left = 584
 kvm_pmu_perf_overflow: count = 1907 left = 1551
 kvm_pmu_perf_overflow: count = 7292 left = 6932

although I've now moved the stop/start calls inside the overflow
handler so that I don't have to mess with the PMU backend.

> Given that kvm_pmu_perf_overflow is now always called between a
> cpu_pmu->pmu.stop and a cpu_pmu->pmu.start, it means armpmu_event_update
> has been called prior to this function, and armpmu_event_set_period will
> be called after...
> 
> Therefore, I think the above could be reduced to:
> 
> +	/*
> +	 * Reset the sample period to the architectural limit,
> +	 * i.e. the point where the counter overflows.
> +	 */
> +	u64 period = GENMASK(63, 0);
> +	if (!kvm_pmu_idx_is_64bit(vcpu, pmc->idx))
> +		period = GENMASK(31, 0);
> +
> +	pmc->perf_event->attr.sample_period = period;
> +	pmc->perf_event->hw.sample_period = period;
> 
> This is because armpmu_event_set_period takes into account the overflow
> and the counter wrapping via the "if (unlikely(left <= 0)) {" block.

I think that's an oversimplification. As shown above, the counter has
moved forward, and there is a delta to be accounted for.

> Though this code confuses me easily, so I may be talking rubbish.

Same here! ;-)

> 
> >  
> >  	__vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(idx);
> >  
> > @@ -557,6 +571,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
> >  	attr.exclude_host = 1; /* Don't count host events */
> >  	attr.config = (pmc->idx == ARMV8_PMU_CYCLE_IDX) ?
> >  		ARMV8_PMUV3_PERFCTR_CPU_CYCLES : eventsel;
> > +	attr.config1 = PERF_ATTR_CFG1_RELOAD_EVENT;  
> 
> I'm not sure that this flag, or patch 4 is really needed. As the perf
> events created by KVM are pinned to the task and exclude_(host,hv) are set -
> I think the perf event is not active at this point. Therefore if you change
> the sample period, you can wait until the perf event gets scheduled back in
> (when you return to the guest) where it's call to pmu.start will result in
> armpmu_event_set_period being called. In other words the pmu.start and
> pmu.stop you add in patch 4 is effectively being done for you by perf when
> the KVM task is switched out.
> 
> I'd be interested to see if the following works:
> 
> +	WARN_ON(pmc->perf_event->state == PERF_EVENT_STATE_ACTIVE)
> +
> +	/*
> +	 * Reset the sample period to the architectural limit,
> +	 * i.e. the point where the counter overflows.
> +	 */
> +	u64 period = GENMASK(63, 0);
> +	if (!kvm_pmu_idx_is_64bit(vcpu, pmc->idx))
> +		period = GENMASK(31, 0);
> +
> +	pmc->perf_event->attr.sample_period = period;
> +	pmc->perf_event->hw.sample_period = period;
> 
> >  
> >  	counter = kvm_pmu_get_pair_counter_value(vcpu, pmc);
> >    

The warning fires, which is expected: for event to be inactive, you
need to have the vcpu being scheduled out. When the PMU interrupt
fires, it is bound to preempt the vcpu itself, and the event is of
course still active.

> What about ARM 32 bit support for this?

What about it? 32bit KVM/arm doesn't support the PMU at all. A 32bit
guest on a 64bit host could use the PMU just fine (it is just that
32bit Linux doesn't have a PMUv3 driver -- I had patches for that, but
they never made it upstream).

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 5/5] KVM: arm64: pmu: Reset sample period on overflow handling
  2019-10-11 11:28     ` Marc Zyngier
@ 2019-10-11 11:41       ` Andrew Murray
  0 siblings, 0 replies; 12+ messages in thread
From: Andrew Murray @ 2019-10-11 11:41 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, Will Deacon, kvmarm, linux-arm-kernel

On Fri, Oct 11, 2019 at 12:28:48PM +0100, Marc Zyngier wrote:
> On Tue, 8 Oct 2019 23:42:22 +0100
> Andrew Murray <andrew.murray@arm.com> wrote:
> 
> > On Tue, Oct 08, 2019 at 05:01:28PM +0100, Marc Zyngier wrote:
> > > The PMU emulation code uses the perf event sample period to trigger
> > > the overflow detection. This works fine  for the *first* overflow
> > > handling, but results in a huge number of interrupts on the host,
> > > unrelated to the number of interrupts handled in the guest (a x20
> > > factor is pretty common for the cycle counter). On a slow system
> > > (such as a SW model), this can result in the guest only making
> > > forward progress at a glacial pace.
> > > 
> > > It turns out that the clue is in the name. The sample period is
> > > exactly that: a period. And once the an overflow has occured,
> > > the following period should be the full width of the associated
> > > counter, instead of whatever the guest had initially programed.
> > > 
> > > Reset the sample period to the architected value in the overflow
> > > handler, which now results in a number of host interrupts that is
> > > much closer to the number of interrupts in the guest.
> > > 
> > > Fixes: b02386eb7dac ("arm64: KVM: Add PMU overflow interrupt routing")
> > > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > > ---
> > >  virt/kvm/arm/pmu.c | 15 +++++++++++++++
> > >  1 file changed, 15 insertions(+)
> > > 
> > > diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> > > index 25a483a04beb..8b524d74c68a 100644
> > > --- a/virt/kvm/arm/pmu.c
> > > +++ b/virt/kvm/arm/pmu.c
> > > @@ -442,6 +442,20 @@ static void kvm_pmu_perf_overflow(struct perf_event *perf_event,
> > >  	struct kvm_pmc *pmc = perf_event->overflow_handler_context;
> > >  	struct kvm_vcpu *vcpu = kvm_pmc_to_vcpu(pmc);
> > >  	int idx = pmc->idx;
> > > +	u64 period;
> > > +
> > > +	/*
> > > +	 * Reset the sample period to the architectural limit,
> > > +	 * i.e. the point where the counter overflows.
> > > +	 */
> > > +	period = -(local64_read(&pmc->perf_event->count));
> > > +
> > > +	if (!kvm_pmu_idx_is_64bit(vcpu, pmc->idx))
> > > +		period &= GENMASK(31, 0);
> > > +
> > > +	local64_set(&pmc->perf_event->hw.period_left, 0);
> > > +	pmc->perf_event->attr.sample_period = period;
> > > +	pmc->perf_event->hw.sample_period = period;  
> > 
> > I believe that above, you are reducing the period by the amount period_left
> > would have been - they cancel each other out.
> 
> That's not what I see happening, having put some traces:
> 
>  kvm_pmu_perf_overflow: count = 308 left = 129
>  kvm_pmu_perf_overflow: count = 409 left = 47
>  kvm_pmu_perf_overflow: count = 585 left = 223
>  kvm_pmu_perf_overflow: count = 775 left = 413
>  kvm_pmu_perf_overflow: count = 1368 left = 986
>  kvm_pmu_perf_overflow: count = 2086 left = 1716
>  kvm_pmu_perf_overflow: count = 958 left = 584
>  kvm_pmu_perf_overflow: count = 1907 left = 1551
>  kvm_pmu_perf_overflow: count = 7292 left = 6932

Indeed.

> 
> although I've now moved the stop/start calls inside the overflow
> handler so that I don't have to mess with the PMU backend.
> 
> > Given that kvm_pmu_perf_overflow is now always called between a
> > cpu_pmu->pmu.stop and a cpu_pmu->pmu.start, it means armpmu_event_update
> > has been called prior to this function, and armpmu_event_set_period will
> > be called after...
> > 
> > Therefore, I think the above could be reduced to:
> > 
> > +	/*
> > +	 * Reset the sample period to the architectural limit,
> > +	 * i.e. the point where the counter overflows.
> > +	 */
> > +	u64 period = GENMASK(63, 0);
> > +	if (!kvm_pmu_idx_is_64bit(vcpu, pmc->idx))
> > +		period = GENMASK(31, 0);
> > +
> > +	pmc->perf_event->attr.sample_period = period;
> > +	pmc->perf_event->hw.sample_period = period;
> > 
> > This is because armpmu_event_set_period takes into account the overflow
> > and the counter wrapping via the "if (unlikely(left <= 0)) {" block.
> 
> I think that's an oversimplification. As shown above, the counter has
> moved forward, and there is a delta to be accounted for.
> 

Yeah, I probably need to spend more time understanding this...

> > Though this code confuses me easily, so I may be talking rubbish.
> 
> Same here! ;-)
> 
> > 
> > >  
> > >  	__vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(idx);
> > >  
> > > @@ -557,6 +571,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
> > >  	attr.exclude_host = 1; /* Don't count host events */
> > >  	attr.config = (pmc->idx == ARMV8_PMU_CYCLE_IDX) ?
> > >  		ARMV8_PMUV3_PERFCTR_CPU_CYCLES : eventsel;
> > > +	attr.config1 = PERF_ATTR_CFG1_RELOAD_EVENT;  
> > 
> > I'm not sure that this flag, or patch 4 is really needed. As the perf
> > events created by KVM are pinned to the task and exclude_(host,hv) are set -
> > I think the perf event is not active at this point. Therefore if you change
> > the sample period, you can wait until the perf event gets scheduled back in
> > (when you return to the guest) where it's call to pmu.start will result in
> > armpmu_event_set_period being called. In other words the pmu.start and
> > pmu.stop you add in patch 4 is effectively being done for you by perf when
> > the KVM task is switched out.
> > 
> > I'd be interested to see if the following works:
> > 
> > +	WARN_ON(pmc->perf_event->state == PERF_EVENT_STATE_ACTIVE)
> > +
> > +	/*
> > +	 * Reset the sample period to the architectural limit,
> > +	 * i.e. the point where the counter overflows.
> > +	 */
> > +	u64 period = GENMASK(63, 0);
> > +	if (!kvm_pmu_idx_is_64bit(vcpu, pmc->idx))
> > +		period = GENMASK(31, 0);
> > +
> > +	pmc->perf_event->attr.sample_period = period;
> > +	pmc->perf_event->hw.sample_period = period;
> > 
> > >  
> > >  	counter = kvm_pmu_get_pair_counter_value(vcpu, pmc);
> > >    
> 
> The warning fires, which is expected: for event to be inactive, you
> need to have the vcpu being scheduled out. When the PMU interrupt
> fires, it is bound to preempt the vcpu itself, and the event is of
> course still active.

That makes sense. That also provides a justification for stopping and
starting the PMU.

> 
> > What about ARM 32 bit support for this?
> 
> What about it? 32bit KVM/arm doesn't support the PMU at all.

Thanks for the clarification.

Andrew Murray

> A 32bit
> guest on a 64bit host could use the PMU just fine (it is just that
> 32bit Linux doesn't have a PMUv3 driver -- I had patches for that, but
> they never made it upstream).
> 
> Thanks,
> 
> 	M.
> -- 
> Jazz is not dead. It just smells funny...
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, back to index

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-08 16:01 [PATCH v2 0/5] KVM: arm64: Assorted PMU emulation fixes Marc Zyngier
2019-10-08 16:01 ` [PATCH v2 1/5] KVM: arm64: pmu: Fix cycle counter truncation Marc Zyngier
2019-10-08 16:01 ` [PATCH v2 2/5] arm64: KVM: Handle PMCR_EL0.LC as RES1 on pure AArch64 systems Marc Zyngier
2019-10-08 16:01 ` [PATCH v2 3/5] KVM: arm64: pmu: Set the CHAINED attribute before creating the in-kernel event Marc Zyngier
2019-10-08 19:22   ` Andrew Murray
2019-10-08 16:01 ` [PATCH v2 4/5] arm64: perf: Add reload-on-overflow capability Marc Zyngier
2019-10-08 17:55   ` Marc Zyngier
2019-10-08 19:52   ` Andrew Murray
2019-10-08 16:01 ` [PATCH v2 5/5] KVM: arm64: pmu: Reset sample period on overflow handling Marc Zyngier
2019-10-08 22:42   ` Andrew Murray
2019-10-11 11:28     ` Marc Zyngier
2019-10-11 11:41       ` Andrew Murray

KVM ARM Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/kvmarm/0 kvmarm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 kvmarm kvmarm/ https://lore.kernel.org/kvmarm \
		kvmarm@lists.cs.columbia.edu
	public-inbox-index kvmarm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/edu.columbia.cs.lists.kvmarm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git