All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC 0/3] KVM/ARM: Misc PMU fixes
@ 2019-12-04 20:44 ` Eric Auger
  0 siblings, 0 replies; 38+ messages in thread
From: Eric Auger @ 2019-12-04 20:44 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, maz, linux-kernel, kvmarm
  Cc: james.morse, andrew.murray, suzuki.poulose, drjones

While writing new PMUv3 event counter KVM unit tests I found 3
things that do not seem to comply with the specification and at
least need to be confirmed.

Two are related to SW_INCR implementation: no check of the
PMCR.E bit, no support of 64b (CHAIN). From the spec,
I do not understand the SW_INCR behaves differently from
other events but I may be wrong.

The last minor thing is about the PMEVTYPER read-only bits.
On Seattle we have an 8.0 implementation which I understand
is supposed to implement only 10-bit evtCount field which is
not enforced.

Best Regards

Eric

This series can be found at:
https://github.com/eauger/qemu/tree/v5.4-pmu-kut-fixes-v1

Eric Auger (3):
  KVM: arm64: pmu: Don't increment SW_INCR if PMCR.E is unset
  KVM: arm64: pmu: Fix chained SW_INCR counters
  KVM: arm64: pmu: Enforce PMEVTYPER evtCount size

 arch/arm64/include/asm/perf_event.h |  5 ++++-
 arch/arm64/include/asm/sysreg.h     |  5 +++++
 arch/arm64/kernel/perf_event.c      |  2 +-
 arch/arm64/kvm/sys_regs.c           | 14 ++++++++++----
 virt/kvm/arm/pmu.c                  | 19 ++++++++++++++++++-
 5 files changed, 38 insertions(+), 7 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC 0/3] KVM/ARM: Misc PMU fixes
@ 2019-12-04 20:44 ` Eric Auger
  0 siblings, 0 replies; 38+ messages in thread
From: Eric Auger @ 2019-12-04 20:44 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, maz, linux-kernel, kvmarm

While writing new PMUv3 event counter KVM unit tests I found 3
things that do not seem to comply with the specification and at
least need to be confirmed.

Two are related to SW_INCR implementation: no check of the
PMCR.E bit, no support of 64b (CHAIN). From the spec,
I do not understand the SW_INCR behaves differently from
other events but I may be wrong.

The last minor thing is about the PMEVTYPER read-only bits.
On Seattle we have an 8.0 implementation which I understand
is supposed to implement only 10-bit evtCount field which is
not enforced.

Best Regards

Eric

This series can be found at:
https://github.com/eauger/qemu/tree/v5.4-pmu-kut-fixes-v1

Eric Auger (3):
  KVM: arm64: pmu: Don't increment SW_INCR if PMCR.E is unset
  KVM: arm64: pmu: Fix chained SW_INCR counters
  KVM: arm64: pmu: Enforce PMEVTYPER evtCount size

 arch/arm64/include/asm/perf_event.h |  5 ++++-
 arch/arm64/include/asm/sysreg.h     |  5 +++++
 arch/arm64/kernel/perf_event.c      |  2 +-
 arch/arm64/kvm/sys_regs.c           | 14 ++++++++++----
 virt/kvm/arm/pmu.c                  | 19 ++++++++++++++++++-
 5 files changed, 38 insertions(+), 7 deletions(-)

-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC 1/3] KVM: arm64: pmu: Don't increment SW_INCR if PMCR.E is unset
  2019-12-04 20:44 ` Eric Auger
@ 2019-12-04 20:44   ` Eric Auger
  -1 siblings, 0 replies; 38+ messages in thread
From: Eric Auger @ 2019-12-04 20:44 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, maz, linux-kernel, kvmarm
  Cc: james.morse, andrew.murray, suzuki.poulose, drjones

The specification says PMSWINC increments PMEVCNTR<n>_EL1 by 1
if PMEVCNTR<n>_EL0 is enabled and configured to count SW_INCR.

For PMEVCNTR<n>_EL0 to be enabled, we need both PMCNTENSET to
be set for the corresponding event counter but we also need
the PMCR.E bit to be set.

Fixes: 7a0adc7064b8 ("arm64: KVM: Add access handler for PMSWINC register")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 virt/kvm/arm/pmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
index 8731dfeced8b..c3f8b059881e 100644
--- a/virt/kvm/arm/pmu.c
+++ b/virt/kvm/arm/pmu.c
@@ -486,6 +486,9 @@ void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val)
 	if (val == 0)
 		return;
 
+	if (!(__vcpu_sys_reg(vcpu, PMCR_EL0) & ARMV8_PMU_PMCR_E))
+		return;
+
 	enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
 	for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
 		if (!(val & BIT(i)))
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC 1/3] KVM: arm64: pmu: Don't increment SW_INCR if PMCR.E is unset
@ 2019-12-04 20:44   ` Eric Auger
  0 siblings, 0 replies; 38+ messages in thread
From: Eric Auger @ 2019-12-04 20:44 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, maz, linux-kernel, kvmarm

The specification says PMSWINC increments PMEVCNTR<n>_EL1 by 1
if PMEVCNTR<n>_EL0 is enabled and configured to count SW_INCR.

For PMEVCNTR<n>_EL0 to be enabled, we need both PMCNTENSET to
be set for the corresponding event counter but we also need
the PMCR.E bit to be set.

Fixes: 7a0adc7064b8 ("arm64: KVM: Add access handler for PMSWINC register")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 virt/kvm/arm/pmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
index 8731dfeced8b..c3f8b059881e 100644
--- a/virt/kvm/arm/pmu.c
+++ b/virt/kvm/arm/pmu.c
@@ -486,6 +486,9 @@ void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val)
 	if (val == 0)
 		return;
 
+	if (!(__vcpu_sys_reg(vcpu, PMCR_EL0) & ARMV8_PMU_PMCR_E))
+		return;
+
 	enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
 	for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
 		if (!(val & BIT(i)))
-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
  2019-12-04 20:44 ` Eric Auger
@ 2019-12-04 20:44   ` Eric Auger
  -1 siblings, 0 replies; 38+ messages in thread
From: Eric Auger @ 2019-12-04 20:44 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, maz, linux-kernel, kvmarm
  Cc: james.morse, andrew.murray, suzuki.poulose, drjones

At the moment a SW_INCR counter always overflows on 32-bit
boundary, independently on whether the n+1th counter is
programmed as CHAIN.

Check whether the SW_INCR counter is a 64b counter and if so,
implement the 64b logic.

Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 virt/kvm/arm/pmu.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
index c3f8b059881e..7ab477db2f75 100644
--- a/virt/kvm/arm/pmu.c
+++ b/virt/kvm/arm/pmu.c
@@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val)
 
 	enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
 	for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
+		bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
+
 		if (!(val & BIT(i)))
 			continue;
 		type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
@@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val)
 			reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
 			reg = lower_32_bits(reg);
 			__vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
-			if (!reg)
+			if (reg) /* no overflow */
+				continue;
+			if (chained) {
+				reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) + 1;
+				reg = lower_32_bits(reg);
+				__vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
+				if (reg)
+					continue;
+				/* mark an overflow on high counter */
+				__vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
+			} else {
+				/* mark an overflow */
 				__vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
+			}
 		}
 	}
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
@ 2019-12-04 20:44   ` Eric Auger
  0 siblings, 0 replies; 38+ messages in thread
From: Eric Auger @ 2019-12-04 20:44 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, maz, linux-kernel, kvmarm

At the moment a SW_INCR counter always overflows on 32-bit
boundary, independently on whether the n+1th counter is
programmed as CHAIN.

Check whether the SW_INCR counter is a 64b counter and if so,
implement the 64b logic.

Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 virt/kvm/arm/pmu.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
index c3f8b059881e..7ab477db2f75 100644
--- a/virt/kvm/arm/pmu.c
+++ b/virt/kvm/arm/pmu.c
@@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val)
 
 	enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
 	for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
+		bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
+
 		if (!(val & BIT(i)))
 			continue;
 		type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
@@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val)
 			reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
 			reg = lower_32_bits(reg);
 			__vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
-			if (!reg)
+			if (reg) /* no overflow */
+				continue;
+			if (chained) {
+				reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) + 1;
+				reg = lower_32_bits(reg);
+				__vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
+				if (reg)
+					continue;
+				/* mark an overflow on high counter */
+				__vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
+			} else {
+				/* mark an overflow */
 				__vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
+			}
 		}
 	}
 }
-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC 3/3] KVM: arm64: pmu: Enforce PMEVTYPER evtCount size
  2019-12-04 20:44 ` Eric Auger
@ 2019-12-04 20:44   ` Eric Auger
  -1 siblings, 0 replies; 38+ messages in thread
From: Eric Auger @ 2019-12-04 20:44 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, maz, linux-kernel, kvmarm
  Cc: james.morse, andrew.murray, suzuki.poulose, drjones

ARMv8.1-PMU supports 16-bit evtCount whereas 8.0 only supports
10 bits.

On Seatlle which has an 8.0 PMU implementation, evtCount[15:10]
are not read as 0, as expected. Fix that by applying a mask on
the selected event that depends on the PMU version.

Also remove a redundant __vcpu_sys_reg() assignment (already
done in kvm_pmu_set_counter_even_type()).

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 arch/arm64/include/asm/perf_event.h |  5 ++++-
 arch/arm64/include/asm/sysreg.h     |  5 +++++
 arch/arm64/kernel/perf_event.c      |  2 +-
 arch/arm64/kvm/sys_regs.c           | 14 ++++++++++----
 4 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index 2bdbc79bbd01..37ad1d654d2a 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -189,7 +189,10 @@
 /*
  * PMXEVTYPER: Event selection reg
  */
-#define	ARMV8_PMU_EVTYPE_MASK	0xc800ffff	/* Mask for writable bits */
+/* Mask for writable bits featuring 10b evtCount (ARMv8.0-PMU)*/
+#define	ARMV8_PMU_EVTYPE_MASK	0xc80003ff
+/* Mask for writable bits featuring 16b evtCount (ARMv8.1-PMU)*/
+#define ARMV8_1_PMU_EVTYPE_MASK	0xc800ffff
 #define	ARMV8_PMU_EVTYPE_EVENT	0xffff		/* Mask for EVENT bits */
 
 /*
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 6e919fafb43d..e01b3e3acdf6 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -672,6 +672,11 @@
 #define ID_AA64DFR0_TRACEVER_SHIFT	4
 #define ID_AA64DFR0_DEBUGVER_SHIFT	0
 
+#define ID_AA64DFR0_PMUVER_NOT_IMPL	0x0
+#define ID_AA64DFR0_PMUVER_8_0		0x1
+#define ID_AA64DFR0_PMUVER_8_1		0x4
+#define ID_AA64DFR0_PMUVER_IMPL_DEF	0xF
+
 #define ID_ISAR5_RDM_SHIFT		24
 #define ID_ISAR5_CRC32_SHIFT		16
 #define ID_ISAR5_SHA2_SHIFT		12
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index e40b65645c86..d5fe56190ad3 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -425,7 +425,7 @@ static void armv8pmu_write_counter(struct perf_event *event, u64 value)
 static inline void armv8pmu_write_evtype(int idx, u32 val)
 {
 	armv8pmu_select_counter(idx);
-	val &= ARMV8_PMU_EVTYPE_MASK;
+	val &= ARMV8_1_PMU_EVTYPE_MASK;
 	write_sysreg(val, pmxevtyper_el0);
 }
 
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 46822afc57e0..8deb6485d605 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -815,11 +815,17 @@ static bool access_pmu_evcntr(struct kvm_vcpu *vcpu,
 static bool access_pmu_evtyper(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
 			       const struct sys_reg_desc *r)
 {
-	u64 idx, reg;
+	unsigned int pmuver;
+	u64 idx, reg, dfr0, evtype_mask;
 
 	if (!kvm_arm_pmu_v3_ready(vcpu))
 		return trap_raz_wi(vcpu, p, r);
 
+	dfr0 = read_sanitised_ftr_reg(SYS_ID_AA64DFR0_EL1);
+	pmuver = cpuid_feature_extract_unsigned_field(dfr0,
+						      ID_AA64DFR0_PMUVER_SHIFT);
+	evtype_mask = (pmuver == ID_AA64DFR0_PMUVER_8_0) ?
+			ARMV8_PMU_EVTYPE_MASK : ARMV8_1_PMU_EVTYPE_MASK;
 	if (pmu_access_el0_disabled(vcpu))
 		return false;
 
@@ -842,11 +848,11 @@ static bool access_pmu_evtyper(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
 		return false;
 
 	if (p->is_write) {
-		kvm_pmu_set_counter_event_type(vcpu, p->regval, idx);
-		__vcpu_sys_reg(vcpu, reg) = p->regval & ARMV8_PMU_EVTYPE_MASK;
+		kvm_pmu_set_counter_event_type(vcpu,
+					       p->regval & evtype_mask, idx);
 		kvm_vcpu_pmu_restore_guest(vcpu);
 	} else {
-		p->regval = __vcpu_sys_reg(vcpu, reg) & ARMV8_PMU_EVTYPE_MASK;
+		p->regval = __vcpu_sys_reg(vcpu, reg) & evtype_mask;
 	}
 
 	return true;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC 3/3] KVM: arm64: pmu: Enforce PMEVTYPER evtCount size
@ 2019-12-04 20:44   ` Eric Auger
  0 siblings, 0 replies; 38+ messages in thread
From: Eric Auger @ 2019-12-04 20:44 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, maz, linux-kernel, kvmarm

ARMv8.1-PMU supports 16-bit evtCount whereas 8.0 only supports
10 bits.

On Seatlle which has an 8.0 PMU implementation, evtCount[15:10]
are not read as 0, as expected. Fix that by applying a mask on
the selected event that depends on the PMU version.

Also remove a redundant __vcpu_sys_reg() assignment (already
done in kvm_pmu_set_counter_even_type()).

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 arch/arm64/include/asm/perf_event.h |  5 ++++-
 arch/arm64/include/asm/sysreg.h     |  5 +++++
 arch/arm64/kernel/perf_event.c      |  2 +-
 arch/arm64/kvm/sys_regs.c           | 14 ++++++++++----
 4 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index 2bdbc79bbd01..37ad1d654d2a 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -189,7 +189,10 @@
 /*
  * PMXEVTYPER: Event selection reg
  */
-#define	ARMV8_PMU_EVTYPE_MASK	0xc800ffff	/* Mask for writable bits */
+/* Mask for writable bits featuring 10b evtCount (ARMv8.0-PMU)*/
+#define	ARMV8_PMU_EVTYPE_MASK	0xc80003ff
+/* Mask for writable bits featuring 16b evtCount (ARMv8.1-PMU)*/
+#define ARMV8_1_PMU_EVTYPE_MASK	0xc800ffff
 #define	ARMV8_PMU_EVTYPE_EVENT	0xffff		/* Mask for EVENT bits */
 
 /*
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 6e919fafb43d..e01b3e3acdf6 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -672,6 +672,11 @@
 #define ID_AA64DFR0_TRACEVER_SHIFT	4
 #define ID_AA64DFR0_DEBUGVER_SHIFT	0
 
+#define ID_AA64DFR0_PMUVER_NOT_IMPL	0x0
+#define ID_AA64DFR0_PMUVER_8_0		0x1
+#define ID_AA64DFR0_PMUVER_8_1		0x4
+#define ID_AA64DFR0_PMUVER_IMPL_DEF	0xF
+
 #define ID_ISAR5_RDM_SHIFT		24
 #define ID_ISAR5_CRC32_SHIFT		16
 #define ID_ISAR5_SHA2_SHIFT		12
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index e40b65645c86..d5fe56190ad3 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -425,7 +425,7 @@ static void armv8pmu_write_counter(struct perf_event *event, u64 value)
 static inline void armv8pmu_write_evtype(int idx, u32 val)
 {
 	armv8pmu_select_counter(idx);
-	val &= ARMV8_PMU_EVTYPE_MASK;
+	val &= ARMV8_1_PMU_EVTYPE_MASK;
 	write_sysreg(val, pmxevtyper_el0);
 }
 
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 46822afc57e0..8deb6485d605 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -815,11 +815,17 @@ static bool access_pmu_evcntr(struct kvm_vcpu *vcpu,
 static bool access_pmu_evtyper(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
 			       const struct sys_reg_desc *r)
 {
-	u64 idx, reg;
+	unsigned int pmuver;
+	u64 idx, reg, dfr0, evtype_mask;
 
 	if (!kvm_arm_pmu_v3_ready(vcpu))
 		return trap_raz_wi(vcpu, p, r);
 
+	dfr0 = read_sanitised_ftr_reg(SYS_ID_AA64DFR0_EL1);
+	pmuver = cpuid_feature_extract_unsigned_field(dfr0,
+						      ID_AA64DFR0_PMUVER_SHIFT);
+	evtype_mask = (pmuver == ID_AA64DFR0_PMUVER_8_0) ?
+			ARMV8_PMU_EVTYPE_MASK : ARMV8_1_PMU_EVTYPE_MASK;
 	if (pmu_access_el0_disabled(vcpu))
 		return false;
 
@@ -842,11 +848,11 @@ static bool access_pmu_evtyper(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
 		return false;
 
 	if (p->is_write) {
-		kvm_pmu_set_counter_event_type(vcpu, p->regval, idx);
-		__vcpu_sys_reg(vcpu, reg) = p->regval & ARMV8_PMU_EVTYPE_MASK;
+		kvm_pmu_set_counter_event_type(vcpu,
+					       p->regval & evtype_mask, idx);
 		kvm_vcpu_pmu_restore_guest(vcpu);
 	} else {
-		p->regval = __vcpu_sys_reg(vcpu, reg) & ARMV8_PMU_EVTYPE_MASK;
+		p->regval = __vcpu_sys_reg(vcpu, reg) & evtype_mask;
 	}
 
 	return true;
-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [RFC 3/3] KVM: arm64: pmu: Enforce PMEVTYPER evtCount size
  2019-12-04 20:44   ` Eric Auger
@ 2019-12-05  9:02     ` Will Deacon
  -1 siblings, 0 replies; 38+ messages in thread
From: Will Deacon @ 2019-12-05  9:02 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, maz, linux-kernel, kvmarm, james.morse,
	andrew.murray, suzuki.poulose, drjones

On Wed, Dec 04, 2019 at 09:44:26PM +0100, Eric Auger wrote:
> ARMv8.1-PMU supports 16-bit evtCount whereas 8.0 only supports
> 10 bits.
> 
> On Seatlle which has an 8.0 PMU implementation, evtCount[15:10]
> are not read as 0, as expected. Fix that by applying a mask on
> the selected event that depends on the PMU version.

Are you sure about that? These bits are RES0 in 8.0 afaict, so this would be
a CPU erratum. Have you checked the SDEN document (I haven't)?

Will

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 3/3] KVM: arm64: pmu: Enforce PMEVTYPER evtCount size
@ 2019-12-05  9:02     ` Will Deacon
  0 siblings, 0 replies; 38+ messages in thread
From: Will Deacon @ 2019-12-05  9:02 UTC (permalink / raw)
  To: Eric Auger; +Cc: maz, linux-kernel, kvmarm, eric.auger.pro

On Wed, Dec 04, 2019 at 09:44:26PM +0100, Eric Auger wrote:
> ARMv8.1-PMU supports 16-bit evtCount whereas 8.0 only supports
> 10 bits.
> 
> On Seatlle which has an 8.0 PMU implementation, evtCount[15:10]
> are not read as 0, as expected. Fix that by applying a mask on
> the selected event that depends on the PMU version.

Are you sure about that? These bits are RES0 in 8.0 afaict, so this would be
a CPU erratum. Have you checked the SDEN document (I haven't)?

Will
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 3/3] KVM: arm64: pmu: Enforce PMEVTYPER evtCount size
  2019-12-05  9:02     ` Will Deacon
@ 2019-12-05  9:37       ` Auger Eric
  -1 siblings, 0 replies; 38+ messages in thread
From: Auger Eric @ 2019-12-05  9:37 UTC (permalink / raw)
  To: Will Deacon
  Cc: eric.auger.pro, maz, linux-kernel, kvmarm, james.morse,
	andrew.murray, suzuki.poulose, drjones

Hi Will,

On 12/5/19 10:02 AM, Will Deacon wrote:
> On Wed, Dec 04, 2019 at 09:44:26PM +0100, Eric Auger wrote:
>> ARMv8.1-PMU supports 16-bit evtCount whereas 8.0 only supports
>> 10 bits.
>>
>> On Seatlle which has an 8.0 PMU implementation, evtCount[15:10]
>> are not read as 0, as expected. Fix that by applying a mask on
>> the selected event that depends on the PMU version.
> 
> Are you sure about that? These bits are RES0 in 8.0 afaict, so this would be
> a CPU erratum. Have you checked the SDEN document (I haven't)?

You're right. It is RES0 and not RAZ. My mistake. Please ignore this patch.

Thank you for the feedback.

Eric
> 
> Will
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 3/3] KVM: arm64: pmu: Enforce PMEVTYPER evtCount size
@ 2019-12-05  9:37       ` Auger Eric
  0 siblings, 0 replies; 38+ messages in thread
From: Auger Eric @ 2019-12-05  9:37 UTC (permalink / raw)
  To: Will Deacon; +Cc: maz, linux-kernel, kvmarm, eric.auger.pro

Hi Will,

On 12/5/19 10:02 AM, Will Deacon wrote:
> On Wed, Dec 04, 2019 at 09:44:26PM +0100, Eric Auger wrote:
>> ARMv8.1-PMU supports 16-bit evtCount whereas 8.0 only supports
>> 10 bits.
>>
>> On Seatlle which has an 8.0 PMU implementation, evtCount[15:10]
>> are not read as 0, as expected. Fix that by applying a mask on
>> the selected event that depends on the PMU version.
> 
> Are you sure about that? These bits are RES0 in 8.0 afaict, so this would be
> a CPU erratum. Have you checked the SDEN document (I haven't)?

You're right. It is RES0 and not RAZ. My mistake. Please ignore this patch.

Thank you for the feedback.

Eric
> 
> Will
> 

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
  2019-12-04 20:44   ` Eric Auger
@ 2019-12-05  9:43     ` Marc Zyngier
  -1 siblings, 0 replies; 38+ messages in thread
From: Marc Zyngier @ 2019-12-05  9:43 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, linux-kernel, kvmarm, james.morse, andrew.murray,
	suzuki.poulose, drjones

Hi Eric,

On 2019-12-04 20:44, Eric Auger wrote:
> At the moment a SW_INCR counter always overflows on 32-bit
> boundary, independently on whether the n+1th counter is
> programmed as CHAIN.
>
> Check whether the SW_INCR counter is a 64b counter and if so,
> implement the 64b logic.
>
> Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters")
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> index c3f8b059881e..7ab477db2f75 100644
> --- a/virt/kvm/arm/pmu.c
> +++ b/virt/kvm/arm/pmu.c
> @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu
> *vcpu, u64 val)
>
>  	enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>  	for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
> +		bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
> +

I'd rather you use kvm_pmu_pmc_is_chained() rather than open-coding
this. But see below:

>  		if (!(val & BIT(i)))
>  			continue;
>  		type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
> @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct kvm_vcpu
> *vcpu, u64 val)
>  			reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
>  			reg = lower_32_bits(reg);
>  			__vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
> -			if (!reg)
> +			if (reg) /* no overflow */
> +				continue;
> +			if (chained) {
> +				reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) + 1;
> +				reg = lower_32_bits(reg);
> +				__vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
> +				if (reg)
> +					continue;
> +				/* mark an overflow on high counter */
> +				__vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
> +			} else {
> +				/* mark an overflow */
>  				__vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
> +			}
>  		}
>  	}
>  }

I think the whole function is a bit of a mess, and could be better
structured to treat 64bit counters as a first class citizen.

I'm suggesting something along those lines, which tries to
streamline things a bit and keep the flow uniform between the
two word sizes. IMHO, it helps reasonning about it and gives
scope to the ARMv8.5 full 64bit counters... It is of course
completely untested.

Thoughts?

         M.

diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
index 8731dfeced8b..cf371f643ade 100644
--- a/virt/kvm/arm/pmu.c
+++ b/virt/kvm/arm/pmu.c
@@ -480,26 +480,43 @@ static void kvm_pmu_perf_overflow(struct 
perf_event *perf_event,
   */
  void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val)
  {
+	struct kvm_pmu *pmu = &vcpu->arch.pmu;
  	int i;
-	u64 type, enable, reg;

-	if (val == 0)
-		return;
+	/* Weed out disabled counters */
+	val &= __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);

-	enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
  	for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
+		u64 type, reg;
+		int ovs = i;
+
  		if (!(val & BIT(i)))
  			continue;
-		type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
-		       & ARMV8_PMU_EVTYPE_EVENT;
-		if ((type == ARMV8_PMUV3_PERFCTR_SW_INCR)
-		    && (enable & BIT(i))) {
-			reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
-			reg = lower_32_bits(reg);
-			__vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
-			if (!reg)
-				__vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
+
+		/* PMSWINC only applies to ... SW_INC! */
+		type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i);
+		type &= ARMV8_PMU_EVTYPE_EVENT;
+		if (type != ARMV8_PMUV3_PERFCTR_SW_INCR)
+			continue;
+
+		/* Potential 64bit value */
+		reg = kvm_pmu_get_counter_value(vcpu, i) + 1;
+
+		/* Start by writing back the low 32bits */
+		__vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = lower_32_bits(reg);
+
+		/*
+		 * 64bit counter? Write back the upper bits and target
+		 * the overflow bit at the next counter
+		 */
+		if (kvm_pmu_pmc_is_chained(&pmu->pmc[i])) {
+			reg = upper_32_bits(reg);
+			__vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
+			ovs++;
  		}
+
+		if (!lower_32_bits(reg))
+			__vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(ovs);
  	}
  }


-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
@ 2019-12-05  9:43     ` Marc Zyngier
  0 siblings, 0 replies; 38+ messages in thread
From: Marc Zyngier @ 2019-12-05  9:43 UTC (permalink / raw)
  To: Eric Auger; +Cc: linux-kernel, kvmarm, eric.auger.pro

Hi Eric,

On 2019-12-04 20:44, Eric Auger wrote:
> At the moment a SW_INCR counter always overflows on 32-bit
> boundary, independently on whether the n+1th counter is
> programmed as CHAIN.
>
> Check whether the SW_INCR counter is a 64b counter and if so,
> implement the 64b logic.
>
> Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters")
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> index c3f8b059881e..7ab477db2f75 100644
> --- a/virt/kvm/arm/pmu.c
> +++ b/virt/kvm/arm/pmu.c
> @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu
> *vcpu, u64 val)
>
>  	enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>  	for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
> +		bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
> +

I'd rather you use kvm_pmu_pmc_is_chained() rather than open-coding
this. But see below:

>  		if (!(val & BIT(i)))
>  			continue;
>  		type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
> @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct kvm_vcpu
> *vcpu, u64 val)
>  			reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
>  			reg = lower_32_bits(reg);
>  			__vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
> -			if (!reg)
> +			if (reg) /* no overflow */
> +				continue;
> +			if (chained) {
> +				reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) + 1;
> +				reg = lower_32_bits(reg);
> +				__vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
> +				if (reg)
> +					continue;
> +				/* mark an overflow on high counter */
> +				__vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
> +			} else {
> +				/* mark an overflow */
>  				__vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
> +			}
>  		}
>  	}
>  }

I think the whole function is a bit of a mess, and could be better
structured to treat 64bit counters as a first class citizen.

I'm suggesting something along those lines, which tries to
streamline things a bit and keep the flow uniform between the
two word sizes. IMHO, it helps reasonning about it and gives
scope to the ARMv8.5 full 64bit counters... It is of course
completely untested.

Thoughts?

         M.

diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
index 8731dfeced8b..cf371f643ade 100644
--- a/virt/kvm/arm/pmu.c
+++ b/virt/kvm/arm/pmu.c
@@ -480,26 +480,43 @@ static void kvm_pmu_perf_overflow(struct 
perf_event *perf_event,
   */
  void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val)
  {
+	struct kvm_pmu *pmu = &vcpu->arch.pmu;
  	int i;
-	u64 type, enable, reg;

-	if (val == 0)
-		return;
+	/* Weed out disabled counters */
+	val &= __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);

-	enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
  	for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
+		u64 type, reg;
+		int ovs = i;
+
  		if (!(val & BIT(i)))
  			continue;
-		type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
-		       & ARMV8_PMU_EVTYPE_EVENT;
-		if ((type == ARMV8_PMUV3_PERFCTR_SW_INCR)
-		    && (enable & BIT(i))) {
-			reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
-			reg = lower_32_bits(reg);
-			__vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
-			if (!reg)
-				__vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
+
+		/* PMSWINC only applies to ... SW_INC! */
+		type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i);
+		type &= ARMV8_PMU_EVTYPE_EVENT;
+		if (type != ARMV8_PMUV3_PERFCTR_SW_INCR)
+			continue;
+
+		/* Potential 64bit value */
+		reg = kvm_pmu_get_counter_value(vcpu, i) + 1;
+
+		/* Start by writing back the low 32bits */
+		__vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = lower_32_bits(reg);
+
+		/*
+		 * 64bit counter? Write back the upper bits and target
+		 * the overflow bit at the next counter
+		 */
+		if (kvm_pmu_pmc_is_chained(&pmu->pmc[i])) {
+			reg = upper_32_bits(reg);
+			__vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
+			ovs++;
  		}
+
+		if (!lower_32_bits(reg))
+			__vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(ovs);
  	}
  }


-- 
Jazz is not dead. It just smells funny...
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [RFC 1/3] KVM: arm64: pmu: Don't increment  SW_INCR if PMCR.E is unset
  2019-12-04 20:44   ` Eric Auger
@ 2019-12-05  9:54     ` Marc Zyngier
  -1 siblings, 0 replies; 38+ messages in thread
From: Marc Zyngier @ 2019-12-05  9:54 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, linux-kernel, kvmarm, james.morse, andrew.murray,
	suzuki.poulose, drjones

On 2019-12-04 20:44, Eric Auger wrote:
> The specification says PMSWINC increments PMEVCNTR<n>_EL1 by 1
> if PMEVCNTR<n>_EL0 is enabled and configured to count SW_INCR.
>
> For PMEVCNTR<n>_EL0 to be enabled, we need both PMCNTENSET to
> be set for the corresponding event counter but we also need
> the PMCR.E bit to be set.
>
> Fixes: 7a0adc7064b8 ("arm64: KVM: Add access handler for PMSWINC 
> register")
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  virt/kvm/arm/pmu.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> index 8731dfeced8b..c3f8b059881e 100644
> --- a/virt/kvm/arm/pmu.c
> +++ b/virt/kvm/arm/pmu.c
> @@ -486,6 +486,9 @@ void kvm_pmu_software_increment(struct kvm_vcpu
> *vcpu, u64 val)
>  	if (val == 0)
>  		return;
>
> +	if (!(__vcpu_sys_reg(vcpu, PMCR_EL0) & ARMV8_PMU_PMCR_E))
> +		return;
> +
>  	enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>  	for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
>  		if (!(val & BIT(i)))

Acked-by: Marc Zyngier <maz@kernel.org>

         M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 1/3] KVM: arm64: pmu: Don't increment SW_INCR if PMCR.E is unset
@ 2019-12-05  9:54     ` Marc Zyngier
  0 siblings, 0 replies; 38+ messages in thread
From: Marc Zyngier @ 2019-12-05  9:54 UTC (permalink / raw)
  To: Eric Auger; +Cc: linux-kernel, kvmarm, eric.auger.pro

On 2019-12-04 20:44, Eric Auger wrote:
> The specification says PMSWINC increments PMEVCNTR<n>_EL1 by 1
> if PMEVCNTR<n>_EL0 is enabled and configured to count SW_INCR.
>
> For PMEVCNTR<n>_EL0 to be enabled, we need both PMCNTENSET to
> be set for the corresponding event counter but we also need
> the PMCR.E bit to be set.
>
> Fixes: 7a0adc7064b8 ("arm64: KVM: Add access handler for PMSWINC 
> register")
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  virt/kvm/arm/pmu.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> index 8731dfeced8b..c3f8b059881e 100644
> --- a/virt/kvm/arm/pmu.c
> +++ b/virt/kvm/arm/pmu.c
> @@ -486,6 +486,9 @@ void kvm_pmu_software_increment(struct kvm_vcpu
> *vcpu, u64 val)
>  	if (val == 0)
>  		return;
>
> +	if (!(__vcpu_sys_reg(vcpu, PMCR_EL0) & ARMV8_PMU_PMCR_E))
> +		return;
> +
>  	enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>  	for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
>  		if (!(val & BIT(i)))

Acked-by: Marc Zyngier <maz@kernel.org>

         M.
-- 
Jazz is not dead. It just smells funny...
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
  2019-12-05  9:43     ` Marc Zyngier
@ 2019-12-05 14:06       ` Auger Eric
  -1 siblings, 0 replies; 38+ messages in thread
From: Auger Eric @ 2019-12-05 14:06 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: eric.auger.pro, linux-kernel, kvmarm, james.morse, andrew.murray,
	suzuki.poulose, drjones

Hi Marc,

On 12/5/19 10:43 AM, Marc Zyngier wrote:
> Hi Eric,
> 
> On 2019-12-04 20:44, Eric Auger wrote:
>> At the moment a SW_INCR counter always overflows on 32-bit
>> boundary, independently on whether the n+1th counter is
>> programmed as CHAIN.
>>
>> Check whether the SW_INCR counter is a 64b counter and if so,
>> implement the 64b logic.
>>
>> Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters")
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>>  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>> index c3f8b059881e..7ab477db2f75 100644
>> --- a/virt/kvm/arm/pmu.c
>> +++ b/virt/kvm/arm/pmu.c
>> @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu
>> *vcpu, u64 val)
>>
>>      enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>>      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
>> +        bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
>> +
> 
> I'd rather you use kvm_pmu_pmc_is_chained() rather than open-coding
> this. But see below:
> 
>>          if (!(val & BIT(i)))
>>              continue;
>>          type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
>> @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct kvm_vcpu
>> *vcpu, u64 val)
>>              reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
>>              reg = lower_32_bits(reg);
>>              __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
>> -            if (!reg)
>> +            if (reg) /* no overflow */
>> +                continue;
>> +            if (chained) {
>> +                reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) + 1;
>> +                reg = lower_32_bits(reg);
>> +                __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
>> +                if (reg)
>> +                    continue;
>> +                /* mark an overflow on high counter */
>> +                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
>> +            } else {
>> +                /* mark an overflow */
>>                  __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
>> +            }
>>          }
>>      }
>>  }
> 
> I think the whole function is a bit of a mess, and could be better
> structured to treat 64bit counters as a first class citizen.
> 
> I'm suggesting something along those lines, which tries to
> streamline things a bit and keep the flow uniform between the
> two word sizes. IMHO, it helps reasonning about it and gives
> scope to the ARMv8.5 full 64bit counters... It is of course
> completely untested.

Looks OK to me as well. One remark though, don't we need to test if the
n+1th reg is enabled before incrementing it?

Thanks

Eric
> 
> Thoughts?
> 
>         M.
> 
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> index 8731dfeced8b..cf371f643ade 100644
> --- a/virt/kvm/arm/pmu.c
> +++ b/virt/kvm/arm/pmu.c
> @@ -480,26 +480,43 @@ static void kvm_pmu_perf_overflow(struct
> perf_event *perf_event,
>   */
>  void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val)
>  {
> +    struct kvm_pmu *pmu = &vcpu->arch.pmu;
>      int i;
> -    u64 type, enable, reg;
> 
> -    if (val == 0)
> -        return;
> +    /* Weed out disabled counters */
> +    val &= __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
> 
> -    enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
> +        u64 type, reg;
> +        int ovs = i;
> +
>          if (!(val & BIT(i)))
>              continue;
> -        type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
> -               & ARMV8_PMU_EVTYPE_EVENT;
> -        if ((type == ARMV8_PMUV3_PERFCTR_SW_INCR)
> -            && (enable & BIT(i))) {
> -            reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
> -            reg = lower_32_bits(reg);
> -            __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
> -            if (!reg)
> -                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
> +
> +        /* PMSWINC only applies to ... SW_INC! */
> +        type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i);
> +        type &= ARMV8_PMU_EVTYPE_EVENT;
> +        if (type != ARMV8_PMUV3_PERFCTR_SW_INCR)
> +            continue;
> +
> +        /* Potential 64bit value */
> +        reg = kvm_pmu_get_counter_value(vcpu, i) + 1;
> +
> +        /* Start by writing back the low 32bits */
> +        __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = lower_32_bits(reg);
> +
> +        /*
> +         * 64bit counter? Write back the upper bits and target
> +         * the overflow bit at the next counter
> +         */
> +        if (kvm_pmu_pmc_is_chained(&pmu->pmc[i])) {
> +            reg = upper_32_bits(reg);
> +            __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
> +            ovs++;
>          }
> +
> +        if (!lower_32_bits(reg))
> +            __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(ovs);
>      }
>  }
> 
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
@ 2019-12-05 14:06       ` Auger Eric
  0 siblings, 0 replies; 38+ messages in thread
From: Auger Eric @ 2019-12-05 14:06 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-kernel, kvmarm, eric.auger.pro

Hi Marc,

On 12/5/19 10:43 AM, Marc Zyngier wrote:
> Hi Eric,
> 
> On 2019-12-04 20:44, Eric Auger wrote:
>> At the moment a SW_INCR counter always overflows on 32-bit
>> boundary, independently on whether the n+1th counter is
>> programmed as CHAIN.
>>
>> Check whether the SW_INCR counter is a 64b counter and if so,
>> implement the 64b logic.
>>
>> Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters")
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>>  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>> index c3f8b059881e..7ab477db2f75 100644
>> --- a/virt/kvm/arm/pmu.c
>> +++ b/virt/kvm/arm/pmu.c
>> @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu
>> *vcpu, u64 val)
>>
>>      enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>>      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
>> +        bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
>> +
> 
> I'd rather you use kvm_pmu_pmc_is_chained() rather than open-coding
> this. But see below:
> 
>>          if (!(val & BIT(i)))
>>              continue;
>>          type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
>> @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct kvm_vcpu
>> *vcpu, u64 val)
>>              reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
>>              reg = lower_32_bits(reg);
>>              __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
>> -            if (!reg)
>> +            if (reg) /* no overflow */
>> +                continue;
>> +            if (chained) {
>> +                reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) + 1;
>> +                reg = lower_32_bits(reg);
>> +                __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
>> +                if (reg)
>> +                    continue;
>> +                /* mark an overflow on high counter */
>> +                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
>> +            } else {
>> +                /* mark an overflow */
>>                  __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
>> +            }
>>          }
>>      }
>>  }
> 
> I think the whole function is a bit of a mess, and could be better
> structured to treat 64bit counters as a first class citizen.
> 
> I'm suggesting something along those lines, which tries to
> streamline things a bit and keep the flow uniform between the
> two word sizes. IMHO, it helps reasonning about it and gives
> scope to the ARMv8.5 full 64bit counters... It is of course
> completely untested.

Looks OK to me as well. One remark though, don't we need to test if the
n+1th reg is enabled before incrementing it?

Thanks

Eric
> 
> Thoughts?
> 
>         M.
> 
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> index 8731dfeced8b..cf371f643ade 100644
> --- a/virt/kvm/arm/pmu.c
> +++ b/virt/kvm/arm/pmu.c
> @@ -480,26 +480,43 @@ static void kvm_pmu_perf_overflow(struct
> perf_event *perf_event,
>   */
>  void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val)
>  {
> +    struct kvm_pmu *pmu = &vcpu->arch.pmu;
>      int i;
> -    u64 type, enable, reg;
> 
> -    if (val == 0)
> -        return;
> +    /* Weed out disabled counters */
> +    val &= __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
> 
> -    enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
> +        u64 type, reg;
> +        int ovs = i;
> +
>          if (!(val & BIT(i)))
>              continue;
> -        type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
> -               & ARMV8_PMU_EVTYPE_EVENT;
> -        if ((type == ARMV8_PMUV3_PERFCTR_SW_INCR)
> -            && (enable & BIT(i))) {
> -            reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
> -            reg = lower_32_bits(reg);
> -            __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
> -            if (!reg)
> -                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
> +
> +        /* PMSWINC only applies to ... SW_INC! */
> +        type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i);
> +        type &= ARMV8_PMU_EVTYPE_EVENT;
> +        if (type != ARMV8_PMUV3_PERFCTR_SW_INCR)
> +            continue;
> +
> +        /* Potential 64bit value */
> +        reg = kvm_pmu_get_counter_value(vcpu, i) + 1;
> +
> +        /* Start by writing back the low 32bits */
> +        __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = lower_32_bits(reg);
> +
> +        /*
> +         * 64bit counter? Write back the upper bits and target
> +         * the overflow bit at the next counter
> +         */
> +        if (kvm_pmu_pmc_is_chained(&pmu->pmc[i])) {
> +            reg = upper_32_bits(reg);
> +            __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
> +            ovs++;
>          }
> +
> +        if (!lower_32_bits(reg))
> +            __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(ovs);
>      }
>  }
> 
> 

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
  2019-12-05 14:06       ` Auger Eric
@ 2019-12-05 14:52         ` Marc Zyngier
  -1 siblings, 0 replies; 38+ messages in thread
From: Marc Zyngier @ 2019-12-05 14:52 UTC (permalink / raw)
  To: Auger Eric
  Cc: eric.auger.pro, linux-kernel, kvmarm, james.morse, andrew.murray,
	suzuki.poulose, drjones

On 2019-12-05 14:06, Auger Eric wrote:
> Hi Marc,
>
> On 12/5/19 10:43 AM, Marc Zyngier wrote:
>> Hi Eric,
>>
>> On 2019-12-04 20:44, Eric Auger wrote:
>>> At the moment a SW_INCR counter always overflows on 32-bit
>>> boundary, independently on whether the n+1th counter is
>>> programmed as CHAIN.
>>>
>>> Check whether the SW_INCR counter is a 64b counter and if so,
>>> implement the 64b logic.
>>>
>>> Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU 
>>> counters")
>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>> ---
>>>  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
>>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>>> index c3f8b059881e..7ab477db2f75 100644
>>> --- a/virt/kvm/arm/pmu.c
>>> +++ b/virt/kvm/arm/pmu.c
>>> @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu
>>> *vcpu, u64 val)
>>>
>>>      enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>>>      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
>>> +        bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
>>> +
>>
>> I'd rather you use kvm_pmu_pmc_is_chained() rather than open-coding
>> this. But see below:
>>
>>>          if (!(val & BIT(i)))
>>>              continue;
>>>          type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
>>> @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct 
>>> kvm_vcpu
>>> *vcpu, u64 val)
>>>              reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
>>>              reg = lower_32_bits(reg);
>>>              __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
>>> -            if (!reg)
>>> +            if (reg) /* no overflow */
>>> +                continue;
>>> +            if (chained) {
>>> +                reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) 
>>> + 1;
>>> +                reg = lower_32_bits(reg);
>>> +                __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
>>> +                if (reg)
>>> +                    continue;
>>> +                /* mark an overflow on high counter */
>>> +                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
>>> +            } else {
>>> +                /* mark an overflow */
>>>                  __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
>>> +            }
>>>          }
>>>      }
>>>  }
>>
>> I think the whole function is a bit of a mess, and could be better
>> structured to treat 64bit counters as a first class citizen.
>>
>> I'm suggesting something along those lines, which tries to
>> streamline things a bit and keep the flow uniform between the
>> two word sizes. IMHO, it helps reasonning about it and gives
>> scope to the ARMv8.5 full 64bit counters... It is of course
>> completely untested.
>
> Looks OK to me as well. One remark though, don't we need to test if 
> the
> n+1th reg is enabled before incrementing it?

Hmmm. I'm not sure. I think we should make sure that we don't flag
a counter as being chained if the odd counter is disabled, rather
than checking it here. As long as the odd counter is not chained
*and* enabled, we shouldn't touch it.

Again, untested:

diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
index cf371f643ade..47366817cd2a 100644
--- a/virt/kvm/arm/pmu.c
+++ b/virt/kvm/arm/pmu.c
@@ -15,6 +15,7 @@
  #include <kvm/arm_vgic.h>

  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 
select_idx);
+static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64 
select_idx);

  #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1

@@ -298,6 +299,7 @@ void kvm_pmu_enable_counter_mask(struct kvm_vcpu 
*vcpu, u64 val)
  		 * For high counters of chained events we must recreate the
  		 * perf event with the long (64bit) attribute set.
  		 */
+		kvm_pmu_update_pmc_chained(vcpu, i);
  		if (kvm_pmu_pmc_is_chained(pmc) &&
  		    kvm_pmu_idx_is_high_counter(i)) {
  			kvm_pmu_create_perf_event(vcpu, i);
@@ -645,7 +647,8 @@ static void kvm_pmu_update_pmc_chained(struct 
kvm_vcpu *vcpu, u64 select_idx)
  	struct kvm_pmu *pmu = &vcpu->arch.pmu;
  	struct kvm_pmc *pmc = &pmu->pmc[select_idx];

-	if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
+	if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
+	    kvm_pmu_counter_is_enabled(vcpu, pmc->idx)) {
  		/*
  		 * During promotion from !chained to chained we must ensure
  		 * the adjacent counter is stopped and its event destroyed

What do you think?

         M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
@ 2019-12-05 14:52         ` Marc Zyngier
  0 siblings, 0 replies; 38+ messages in thread
From: Marc Zyngier @ 2019-12-05 14:52 UTC (permalink / raw)
  To: Auger Eric; +Cc: linux-kernel, kvmarm, eric.auger.pro

On 2019-12-05 14:06, Auger Eric wrote:
> Hi Marc,
>
> On 12/5/19 10:43 AM, Marc Zyngier wrote:
>> Hi Eric,
>>
>> On 2019-12-04 20:44, Eric Auger wrote:
>>> At the moment a SW_INCR counter always overflows on 32-bit
>>> boundary, independently on whether the n+1th counter is
>>> programmed as CHAIN.
>>>
>>> Check whether the SW_INCR counter is a 64b counter and if so,
>>> implement the 64b logic.
>>>
>>> Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU 
>>> counters")
>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>> ---
>>>  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
>>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>>> index c3f8b059881e..7ab477db2f75 100644
>>> --- a/virt/kvm/arm/pmu.c
>>> +++ b/virt/kvm/arm/pmu.c
>>> @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu
>>> *vcpu, u64 val)
>>>
>>>      enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>>>      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
>>> +        bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
>>> +
>>
>> I'd rather you use kvm_pmu_pmc_is_chained() rather than open-coding
>> this. But see below:
>>
>>>          if (!(val & BIT(i)))
>>>              continue;
>>>          type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
>>> @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct 
>>> kvm_vcpu
>>> *vcpu, u64 val)
>>>              reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
>>>              reg = lower_32_bits(reg);
>>>              __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
>>> -            if (!reg)
>>> +            if (reg) /* no overflow */
>>> +                continue;
>>> +            if (chained) {
>>> +                reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) 
>>> + 1;
>>> +                reg = lower_32_bits(reg);
>>> +                __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
>>> +                if (reg)
>>> +                    continue;
>>> +                /* mark an overflow on high counter */
>>> +                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
>>> +            } else {
>>> +                /* mark an overflow */
>>>                  __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
>>> +            }
>>>          }
>>>      }
>>>  }
>>
>> I think the whole function is a bit of a mess, and could be better
>> structured to treat 64bit counters as a first class citizen.
>>
>> I'm suggesting something along those lines, which tries to
>> streamline things a bit and keep the flow uniform between the
>> two word sizes. IMHO, it helps reasonning about it and gives
>> scope to the ARMv8.5 full 64bit counters... It is of course
>> completely untested.
>
> Looks OK to me as well. One remark though, don't we need to test if 
> the
> n+1th reg is enabled before incrementing it?

Hmmm. I'm not sure. I think we should make sure that we don't flag
a counter as being chained if the odd counter is disabled, rather
than checking it here. As long as the odd counter is not chained
*and* enabled, we shouldn't touch it.

Again, untested:

diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
index cf371f643ade..47366817cd2a 100644
--- a/virt/kvm/arm/pmu.c
+++ b/virt/kvm/arm/pmu.c
@@ -15,6 +15,7 @@
  #include <kvm/arm_vgic.h>

  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 
select_idx);
+static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64 
select_idx);

  #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1

@@ -298,6 +299,7 @@ void kvm_pmu_enable_counter_mask(struct kvm_vcpu 
*vcpu, u64 val)
  		 * For high counters of chained events we must recreate the
  		 * perf event with the long (64bit) attribute set.
  		 */
+		kvm_pmu_update_pmc_chained(vcpu, i);
  		if (kvm_pmu_pmc_is_chained(pmc) &&
  		    kvm_pmu_idx_is_high_counter(i)) {
  			kvm_pmu_create_perf_event(vcpu, i);
@@ -645,7 +647,8 @@ static void kvm_pmu_update_pmc_chained(struct 
kvm_vcpu *vcpu, u64 select_idx)
  	struct kvm_pmu *pmu = &vcpu->arch.pmu;
  	struct kvm_pmc *pmc = &pmu->pmc[select_idx];

-	if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
+	if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
+	    kvm_pmu_counter_is_enabled(vcpu, pmc->idx)) {
  		/*
  		 * During promotion from !chained to chained we must ensure
  		 * the adjacent counter is stopped and its event destroyed

What do you think?

         M.
-- 
Jazz is not dead. It just smells funny...
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
  2019-12-05 14:52         ` Marc Zyngier
@ 2019-12-05 19:01           ` Auger Eric
  -1 siblings, 0 replies; 38+ messages in thread
From: Auger Eric @ 2019-12-05 19:01 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: eric.auger.pro, linux-kernel, kvmarm, james.morse, andrew.murray,
	suzuki.poulose, drjones

Hi Marc,

On 12/5/19 3:52 PM, Marc Zyngier wrote:
> On 2019-12-05 14:06, Auger Eric wrote:
>> Hi Marc,
>>
>> On 12/5/19 10:43 AM, Marc Zyngier wrote:
>>> Hi Eric,
>>>
>>> On 2019-12-04 20:44, Eric Auger wrote:
>>>> At the moment a SW_INCR counter always overflows on 32-bit
>>>> boundary, independently on whether the n+1th counter is
>>>> programmed as CHAIN.
>>>>
>>>> Check whether the SW_INCR counter is a 64b counter and if so,
>>>> implement the 64b logic.
>>>>
>>>> Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters")
>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>> ---
>>>>  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
>>>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>>>> index c3f8b059881e..7ab477db2f75 100644
>>>> --- a/virt/kvm/arm/pmu.c
>>>> +++ b/virt/kvm/arm/pmu.c
>>>> @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu
>>>> *vcpu, u64 val)
>>>>
>>>>      enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>>>>      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
>>>> +        bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
>>>> +
>>>
>>> I'd rather you use kvm_pmu_pmc_is_chained() rather than open-coding
>>> this. But see below:
>>>
>>>>          if (!(val & BIT(i)))
>>>>              continue;
>>>>          type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
>>>> @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct kvm_vcpu
>>>> *vcpu, u64 val)
>>>>              reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
>>>>              reg = lower_32_bits(reg);
>>>>              __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
>>>> -            if (!reg)
>>>> +            if (reg) /* no overflow */
>>>> +                continue;
>>>> +            if (chained) {
>>>> +                reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) + 1;
>>>> +                reg = lower_32_bits(reg);
>>>> +                __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
>>>> +                if (reg)
>>>> +                    continue;
>>>> +                /* mark an overflow on high counter */
>>>> +                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
>>>> +            } else {
>>>> +                /* mark an overflow */
>>>>                  __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
>>>> +            }
>>>>          }
>>>>      }
>>>>  }
>>>
>>> I think the whole function is a bit of a mess, and could be better
>>> structured to treat 64bit counters as a first class citizen.
>>>
>>> I'm suggesting something along those lines, which tries to
>>> streamline things a bit and keep the flow uniform between the
>>> two word sizes. IMHO, it helps reasonning about it and gives
>>> scope to the ARMv8.5 full 64bit counters... It is of course
>>> completely untested.
>>
>> Looks OK to me as well. One remark though, don't we need to test if the
>> n+1th reg is enabled before incrementing it?
> 
> Hmmm. I'm not sure. I think we should make sure that we don't flag
> a counter as being chained if the odd counter is disabled, rather
> than checking it here. As long as the odd counter is not chained
> *and* enabled, we shouldn't touch it.>
> Again, untested:
> 
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> index cf371f643ade..47366817cd2a 100644
> --- a/virt/kvm/arm/pmu.c
> +++ b/virt/kvm/arm/pmu.c
> @@ -15,6 +15,7 @@
>  #include <kvm/arm_vgic.h>
> 
>  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64
> select_idx);
> +static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64
> select_idx);
> 
>  #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
> 
> @@ -298,6 +299,7 @@ void kvm_pmu_enable_counter_mask(struct kvm_vcpu
> *vcpu, u64 val)
>           * For high counters of chained events we must recreate the
>           * perf event with the long (64bit) attribute set.
>           */
> +        kvm_pmu_update_pmc_chained(vcpu, i);
>          if (kvm_pmu_pmc_is_chained(pmc) &&
>              kvm_pmu_idx_is_high_counter(i)) {
>              kvm_pmu_create_perf_event(vcpu, i);
> @@ -645,7 +647,8 @@ static void kvm_pmu_update_pmc_chained(struct
> kvm_vcpu *vcpu, u64 select_idx)
>      struct kvm_pmu *pmu = &vcpu->arch.pmu;
>      struct kvm_pmc *pmc = &pmu->pmc[select_idx];
> 
> -    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
> +    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
> +        kvm_pmu_counter_is_enabled(vcpu, pmc->idx)) {

In create_perf_event(), has_chain_evtype() is used and a 64b sample
period would be chosen even if the counters are disjoined (since the odd
is disabled). We would need to use pmc_is_chained() instead.

With perf_events, the check of whether the odd register is enabled is
properly done (create_perf_event). Then I understand whenever there is a
change in enable state or type we delete the previous perf event and
re-create a new one. Enable state check just is missing for SW_INCR.

Some other questions:
- do we need a perf event to be created even if the counter is not
enabled? For instance on counter resets, create_perf_events get called.
- also actions are made for counters which are not implemented. loop
until ARMV8_PMU_MAX_COUNTERS. Do you think it is valuable to have a
bitmask of supported counters stored before pmu readiness?
I can propose such changes if you think they are valuable.

Thanks

Eric

>          /*
>           * During promotion from !chained to chained we must ensure
>           * the adjacent counter is stopped and its event destroyed
> 
> What do you think?
> 
>         M.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
@ 2019-12-05 19:01           ` Auger Eric
  0 siblings, 0 replies; 38+ messages in thread
From: Auger Eric @ 2019-12-05 19:01 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-kernel, kvmarm, eric.auger.pro

Hi Marc,

On 12/5/19 3:52 PM, Marc Zyngier wrote:
> On 2019-12-05 14:06, Auger Eric wrote:
>> Hi Marc,
>>
>> On 12/5/19 10:43 AM, Marc Zyngier wrote:
>>> Hi Eric,
>>>
>>> On 2019-12-04 20:44, Eric Auger wrote:
>>>> At the moment a SW_INCR counter always overflows on 32-bit
>>>> boundary, independently on whether the n+1th counter is
>>>> programmed as CHAIN.
>>>>
>>>> Check whether the SW_INCR counter is a 64b counter and if so,
>>>> implement the 64b logic.
>>>>
>>>> Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters")
>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>> ---
>>>>  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
>>>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>>>> index c3f8b059881e..7ab477db2f75 100644
>>>> --- a/virt/kvm/arm/pmu.c
>>>> +++ b/virt/kvm/arm/pmu.c
>>>> @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu
>>>> *vcpu, u64 val)
>>>>
>>>>      enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>>>>      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
>>>> +        bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
>>>> +
>>>
>>> I'd rather you use kvm_pmu_pmc_is_chained() rather than open-coding
>>> this. But see below:
>>>
>>>>          if (!(val & BIT(i)))
>>>>              continue;
>>>>          type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
>>>> @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct kvm_vcpu
>>>> *vcpu, u64 val)
>>>>              reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
>>>>              reg = lower_32_bits(reg);
>>>>              __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
>>>> -            if (!reg)
>>>> +            if (reg) /* no overflow */
>>>> +                continue;
>>>> +            if (chained) {
>>>> +                reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) + 1;
>>>> +                reg = lower_32_bits(reg);
>>>> +                __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
>>>> +                if (reg)
>>>> +                    continue;
>>>> +                /* mark an overflow on high counter */
>>>> +                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
>>>> +            } else {
>>>> +                /* mark an overflow */
>>>>                  __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
>>>> +            }
>>>>          }
>>>>      }
>>>>  }
>>>
>>> I think the whole function is a bit of a mess, and could be better
>>> structured to treat 64bit counters as a first class citizen.
>>>
>>> I'm suggesting something along those lines, which tries to
>>> streamline things a bit and keep the flow uniform between the
>>> two word sizes. IMHO, it helps reasonning about it and gives
>>> scope to the ARMv8.5 full 64bit counters... It is of course
>>> completely untested.
>>
>> Looks OK to me as well. One remark though, don't we need to test if the
>> n+1th reg is enabled before incrementing it?
> 
> Hmmm. I'm not sure. I think we should make sure that we don't flag
> a counter as being chained if the odd counter is disabled, rather
> than checking it here. As long as the odd counter is not chained
> *and* enabled, we shouldn't touch it.>
> Again, untested:
> 
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> index cf371f643ade..47366817cd2a 100644
> --- a/virt/kvm/arm/pmu.c
> +++ b/virt/kvm/arm/pmu.c
> @@ -15,6 +15,7 @@
>  #include <kvm/arm_vgic.h>
> 
>  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64
> select_idx);
> +static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64
> select_idx);
> 
>  #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
> 
> @@ -298,6 +299,7 @@ void kvm_pmu_enable_counter_mask(struct kvm_vcpu
> *vcpu, u64 val)
>           * For high counters of chained events we must recreate the
>           * perf event with the long (64bit) attribute set.
>           */
> +        kvm_pmu_update_pmc_chained(vcpu, i);
>          if (kvm_pmu_pmc_is_chained(pmc) &&
>              kvm_pmu_idx_is_high_counter(i)) {
>              kvm_pmu_create_perf_event(vcpu, i);
> @@ -645,7 +647,8 @@ static void kvm_pmu_update_pmc_chained(struct
> kvm_vcpu *vcpu, u64 select_idx)
>      struct kvm_pmu *pmu = &vcpu->arch.pmu;
>      struct kvm_pmc *pmc = &pmu->pmc[select_idx];
> 
> -    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
> +    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
> +        kvm_pmu_counter_is_enabled(vcpu, pmc->idx)) {

In create_perf_event(), has_chain_evtype() is used and a 64b sample
period would be chosen even if the counters are disjoined (since the odd
is disabled). We would need to use pmc_is_chained() instead.

With perf_events, the check of whether the odd register is enabled is
properly done (create_perf_event). Then I understand whenever there is a
change in enable state or type we delete the previous perf event and
re-create a new one. Enable state check just is missing for SW_INCR.

Some other questions:
- do we need a perf event to be created even if the counter is not
enabled? For instance on counter resets, create_perf_events get called.
- also actions are made for counters which are not implemented. loop
until ARMV8_PMU_MAX_COUNTERS. Do you think it is valuable to have a
bitmask of supported counters stored before pmu readiness?
I can propose such changes if you think they are valuable.

Thanks

Eric

>          /*
>           * During promotion from !chained to chained we must ensure
>           * the adjacent counter is stopped and its event destroyed
> 
> What do you think?
> 
>         M.

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
  2019-12-05 19:01           ` Auger Eric
@ 2019-12-06  9:56             ` Auger Eric
  -1 siblings, 0 replies; 38+ messages in thread
From: Auger Eric @ 2019-12-06  9:56 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: eric.auger.pro, linux-kernel, kvmarm, james.morse, andrew.murray,
	suzuki.poulose, drjones

Hi,

On 12/5/19 8:01 PM, Auger Eric wrote:
> Hi Marc,
> 
> On 12/5/19 3:52 PM, Marc Zyngier wrote:
>> On 2019-12-05 14:06, Auger Eric wrote:
>>> Hi Marc,
>>>
>>> On 12/5/19 10:43 AM, Marc Zyngier wrote:
>>>> Hi Eric,
>>>>
>>>> On 2019-12-04 20:44, Eric Auger wrote:
>>>>> At the moment a SW_INCR counter always overflows on 32-bit
>>>>> boundary, independently on whether the n+1th counter is
>>>>> programmed as CHAIN.
>>>>>
>>>>> Check whether the SW_INCR counter is a 64b counter and if so,
>>>>> implement the 64b logic.
>>>>>
>>>>> Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters")
>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>> ---
>>>>>  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
>>>>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>>>>> index c3f8b059881e..7ab477db2f75 100644
>>>>> --- a/virt/kvm/arm/pmu.c
>>>>> +++ b/virt/kvm/arm/pmu.c
>>>>> @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu
>>>>> *vcpu, u64 val)
>>>>>
>>>>>      enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>>>>>      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
>>>>> +        bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
>>>>> +
>>>>
>>>> I'd rather you use kvm_pmu_pmc_is_chained() rather than open-coding
>>>> this. But see below:
>>>>
>>>>>          if (!(val & BIT(i)))
>>>>>              continue;
>>>>>          type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
>>>>> @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct kvm_vcpu
>>>>> *vcpu, u64 val)
>>>>>              reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
>>>>>              reg = lower_32_bits(reg);
>>>>>              __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
>>>>> -            if (!reg)
>>>>> +            if (reg) /* no overflow */
>>>>> +                continue;
>>>>> +            if (chained) {
>>>>> +                reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) + 1;
>>>>> +                reg = lower_32_bits(reg);
>>>>> +                __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
>>>>> +                if (reg)
>>>>> +                    continue;
>>>>> +                /* mark an overflow on high counter */
>>>>> +                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
>>>>> +            } else {
>>>>> +                /* mark an overflow */
>>>>>                  __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
>>>>> +            }
>>>>>          }
>>>>>      }
>>>>>  }
>>>>
>>>> I think the whole function is a bit of a mess, and could be better
>>>> structured to treat 64bit counters as a first class citizen.
>>>>
>>>> I'm suggesting something along those lines, which tries to
>>>> streamline things a bit and keep the flow uniform between the
>>>> two word sizes. IMHO, it helps reasonning about it and gives
>>>> scope to the ARMv8.5 full 64bit counters... It is of course
>>>> completely untested.
>>>
>>> Looks OK to me as well. One remark though, don't we need to test if the
>>> n+1th reg is enabled before incrementing it?
>>
>> Hmmm. I'm not sure. I think we should make sure that we don't flag
>> a counter as being chained if the odd counter is disabled, rather
>> than checking it here. As long as the odd counter is not chained
>> *and* enabled, we shouldn't touch it.>
>> Again, untested:
>>
>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>> index cf371f643ade..47366817cd2a 100644
>> --- a/virt/kvm/arm/pmu.c
>> +++ b/virt/kvm/arm/pmu.c
>> @@ -15,6 +15,7 @@
>>  #include <kvm/arm_vgic.h>
>>
>>  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64
>> select_idx);
>> +static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64
>> select_idx);
>>
>>  #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
>>
>> @@ -298,6 +299,7 @@ void kvm_pmu_enable_counter_mask(struct kvm_vcpu
>> *vcpu, u64 val)
>>           * For high counters of chained events we must recreate the
>>           * perf event with the long (64bit) attribute set.
>>           */
>> +        kvm_pmu_update_pmc_chained(vcpu, i);
>>          if (kvm_pmu_pmc_is_chained(pmc) &&
>>              kvm_pmu_idx_is_high_counter(i)) {
>>              kvm_pmu_create_perf_event(vcpu, i);
>> @@ -645,7 +647,8 @@ static void kvm_pmu_update_pmc_chained(struct
>> kvm_vcpu *vcpu, u64 select_idx)
>>      struct kvm_pmu *pmu = &vcpu->arch.pmu;
>>      struct kvm_pmc *pmc = &pmu->pmc[select_idx];
>>
>> -    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
>> +    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
>> +        kvm_pmu_counter_is_enabled(vcpu, pmc->idx)) {
> 
> In create_perf_event(), has_chain_evtype() is used and a 64b sample
> period would be chosen even if the counters are disjoined (since the odd
> is disabled). We would need to use pmc_is_chained() instead.
> 
> With perf_events, the check of whether the odd register is enabled is
> properly done (create_perf_event).

Hum that's not fully true. If we do not enable the CHAIN odd one but
only the event one, the correct 32b perf counter is created. But when
reading the odd reg after overflow we get the incremented value
(get_counter_value).

Thanks

Eric

 Then I understand whenever there is a
> change in enable state or type we delete the previous perf event and
> re-create a new one. Enable state check just is missing for SW_INCR.
> 
> Some other questions:
> - do we need a perf event to be created even if the counter is not
> enabled? For instance on counter resets, create_perf_events get called.
> - also actions are made for counters which are not implemented. loop
> until ARMV8_PMU_MAX_COUNTERS. Do you think it is valuable to have a
> bitmask of supported counters stored before pmu readiness?
> I can propose such changes if you think they are valuable.
> 
> Thanks
> 
> Eric
> 
>>          /*
>>           * During promotion from !chained to chained we must ensure
>>           * the adjacent counter is stopped and its event destroyed
>>
>> What do you think?
>>
>>         M.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
@ 2019-12-06  9:56             ` Auger Eric
  0 siblings, 0 replies; 38+ messages in thread
From: Auger Eric @ 2019-12-06  9:56 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-kernel, kvmarm, eric.auger.pro

Hi,

On 12/5/19 8:01 PM, Auger Eric wrote:
> Hi Marc,
> 
> On 12/5/19 3:52 PM, Marc Zyngier wrote:
>> On 2019-12-05 14:06, Auger Eric wrote:
>>> Hi Marc,
>>>
>>> On 12/5/19 10:43 AM, Marc Zyngier wrote:
>>>> Hi Eric,
>>>>
>>>> On 2019-12-04 20:44, Eric Auger wrote:
>>>>> At the moment a SW_INCR counter always overflows on 32-bit
>>>>> boundary, independently on whether the n+1th counter is
>>>>> programmed as CHAIN.
>>>>>
>>>>> Check whether the SW_INCR counter is a 64b counter and if so,
>>>>> implement the 64b logic.
>>>>>
>>>>> Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters")
>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>> ---
>>>>>  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
>>>>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>>>>> index c3f8b059881e..7ab477db2f75 100644
>>>>> --- a/virt/kvm/arm/pmu.c
>>>>> +++ b/virt/kvm/arm/pmu.c
>>>>> @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu
>>>>> *vcpu, u64 val)
>>>>>
>>>>>      enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>>>>>      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
>>>>> +        bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
>>>>> +
>>>>
>>>> I'd rather you use kvm_pmu_pmc_is_chained() rather than open-coding
>>>> this. But see below:
>>>>
>>>>>          if (!(val & BIT(i)))
>>>>>              continue;
>>>>>          type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
>>>>> @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct kvm_vcpu
>>>>> *vcpu, u64 val)
>>>>>              reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
>>>>>              reg = lower_32_bits(reg);
>>>>>              __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
>>>>> -            if (!reg)
>>>>> +            if (reg) /* no overflow */
>>>>> +                continue;
>>>>> +            if (chained) {
>>>>> +                reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) + 1;
>>>>> +                reg = lower_32_bits(reg);
>>>>> +                __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
>>>>> +                if (reg)
>>>>> +                    continue;
>>>>> +                /* mark an overflow on high counter */
>>>>> +                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
>>>>> +            } else {
>>>>> +                /* mark an overflow */
>>>>>                  __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
>>>>> +            }
>>>>>          }
>>>>>      }
>>>>>  }
>>>>
>>>> I think the whole function is a bit of a mess, and could be better
>>>> structured to treat 64bit counters as a first class citizen.
>>>>
>>>> I'm suggesting something along those lines, which tries to
>>>> streamline things a bit and keep the flow uniform between the
>>>> two word sizes. IMHO, it helps reasonning about it and gives
>>>> scope to the ARMv8.5 full 64bit counters... It is of course
>>>> completely untested.
>>>
>>> Looks OK to me as well. One remark though, don't we need to test if the
>>> n+1th reg is enabled before incrementing it?
>>
>> Hmmm. I'm not sure. I think we should make sure that we don't flag
>> a counter as being chained if the odd counter is disabled, rather
>> than checking it here. As long as the odd counter is not chained
>> *and* enabled, we shouldn't touch it.>
>> Again, untested:
>>
>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>> index cf371f643ade..47366817cd2a 100644
>> --- a/virt/kvm/arm/pmu.c
>> +++ b/virt/kvm/arm/pmu.c
>> @@ -15,6 +15,7 @@
>>  #include <kvm/arm_vgic.h>
>>
>>  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64
>> select_idx);
>> +static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64
>> select_idx);
>>
>>  #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
>>
>> @@ -298,6 +299,7 @@ void kvm_pmu_enable_counter_mask(struct kvm_vcpu
>> *vcpu, u64 val)
>>           * For high counters of chained events we must recreate the
>>           * perf event with the long (64bit) attribute set.
>>           */
>> +        kvm_pmu_update_pmc_chained(vcpu, i);
>>          if (kvm_pmu_pmc_is_chained(pmc) &&
>>              kvm_pmu_idx_is_high_counter(i)) {
>>              kvm_pmu_create_perf_event(vcpu, i);
>> @@ -645,7 +647,8 @@ static void kvm_pmu_update_pmc_chained(struct
>> kvm_vcpu *vcpu, u64 select_idx)
>>      struct kvm_pmu *pmu = &vcpu->arch.pmu;
>>      struct kvm_pmc *pmc = &pmu->pmc[select_idx];
>>
>> -    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
>> +    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
>> +        kvm_pmu_counter_is_enabled(vcpu, pmc->idx)) {
> 
> In create_perf_event(), has_chain_evtype() is used and a 64b sample
> period would be chosen even if the counters are disjoined (since the odd
> is disabled). We would need to use pmc_is_chained() instead.
> 
> With perf_events, the check of whether the odd register is enabled is
> properly done (create_perf_event).

Hum that's not fully true. If we do not enable the CHAIN odd one but
only the event one, the correct 32b perf counter is created. But when
reading the odd reg after overflow we get the incremented value
(get_counter_value).

Thanks

Eric

 Then I understand whenever there is a
> change in enable state or type we delete the previous perf event and
> re-create a new one. Enable state check just is missing for SW_INCR.
> 
> Some other questions:
> - do we need a perf event to be created even if the counter is not
> enabled? For instance on counter resets, create_perf_events get called.
> - also actions are made for counters which are not implemented. loop
> until ARMV8_PMU_MAX_COUNTERS. Do you think it is valuable to have a
> bitmask of supported counters stored before pmu readiness?
> I can propose such changes if you think they are valuable.
> 
> Thanks
> 
> Eric
> 
>>          /*
>>           * During promotion from !chained to chained we must ensure
>>           * the adjacent counter is stopped and its event destroyed
>>
>> What do you think?
>>
>>         M.

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 1/3] KVM: arm64: pmu: Don't increment SW_INCR if PMCR.E is unset
  2019-12-04 20:44   ` Eric Auger
@ 2019-12-06 10:35     ` Andrew Murray
  -1 siblings, 0 replies; 38+ messages in thread
From: Andrew Murray @ 2019-12-06 10:35 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, maz, linux-kernel, kvmarm, james.morse,
	suzuki.poulose, drjones

On Wed, Dec 04, 2019 at 09:44:24PM +0100, Eric Auger wrote:
> The specification says PMSWINC increments PMEVCNTR<n>_EL1 by 1
> if PMEVCNTR<n>_EL0 is enabled and configured to count SW_INCR.
> 
> For PMEVCNTR<n>_EL0 to be enabled, we need both PMCNTENSET to
> be set for the corresponding event counter but we also need
> the PMCR.E bit to be set.
> 
> Fixes: 7a0adc7064b8 ("arm64: KVM: Add access handler for PMSWINC register")
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  virt/kvm/arm/pmu.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> index 8731dfeced8b..c3f8b059881e 100644
> --- a/virt/kvm/arm/pmu.c
> +++ b/virt/kvm/arm/pmu.c
> @@ -486,6 +486,9 @@ void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val)
>  	if (val == 0)
>  		return;
>  
> +	if (!(__vcpu_sys_reg(vcpu, PMCR_EL0) & ARMV8_PMU_PMCR_E))
> +		return;
> +

Reviewed-by: Andrew Murray <andrew.murray@arm.com>


>  	enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>  	for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
>  		if (!(val & BIT(i)))
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 1/3] KVM: arm64: pmu: Don't increment SW_INCR if PMCR.E is unset
@ 2019-12-06 10:35     ` Andrew Murray
  0 siblings, 0 replies; 38+ messages in thread
From: Andrew Murray @ 2019-12-06 10:35 UTC (permalink / raw)
  To: Eric Auger; +Cc: maz, linux-kernel, kvmarm, eric.auger.pro

On Wed, Dec 04, 2019 at 09:44:24PM +0100, Eric Auger wrote:
> The specification says PMSWINC increments PMEVCNTR<n>_EL1 by 1
> if PMEVCNTR<n>_EL0 is enabled and configured to count SW_INCR.
> 
> For PMEVCNTR<n>_EL0 to be enabled, we need both PMCNTENSET to
> be set for the corresponding event counter but we also need
> the PMCR.E bit to be set.
> 
> Fixes: 7a0adc7064b8 ("arm64: KVM: Add access handler for PMSWINC register")
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  virt/kvm/arm/pmu.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> index 8731dfeced8b..c3f8b059881e 100644
> --- a/virt/kvm/arm/pmu.c
> +++ b/virt/kvm/arm/pmu.c
> @@ -486,6 +486,9 @@ void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val)
>  	if (val == 0)
>  		return;
>  
> +	if (!(__vcpu_sys_reg(vcpu, PMCR_EL0) & ARMV8_PMU_PMCR_E))
> +		return;
> +

Reviewed-by: Andrew Murray <andrew.murray@arm.com>


>  	enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>  	for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
>  		if (!(val & BIT(i)))
> -- 
> 2.20.1
> 
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
  2019-12-05 14:52         ` Marc Zyngier
@ 2019-12-06 15:21           ` Andrew Murray
  -1 siblings, 0 replies; 38+ messages in thread
From: Andrew Murray @ 2019-12-06 15:21 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Auger Eric, eric.auger.pro, linux-kernel, kvmarm, james.morse,
	suzuki.poulose, drjones

On Thu, Dec 05, 2019 at 02:52:26PM +0000, Marc Zyngier wrote:
> On 2019-12-05 14:06, Auger Eric wrote:
> > Hi Marc,
> > 
> > On 12/5/19 10:43 AM, Marc Zyngier wrote:
> > > Hi Eric,
> > > 
> > > On 2019-12-04 20:44, Eric Auger wrote:
> > > > At the moment a SW_INCR counter always overflows on 32-bit
> > > > boundary, independently on whether the n+1th counter is
> > > > programmed as CHAIN.
> > > > 
> > > > Check whether the SW_INCR counter is a 64b counter and if so,
> > > > implement the 64b logic.
> > > > 
> > > > Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU
> > > > counters")
> > > > Signed-off-by: Eric Auger <eric.auger@redhat.com>
> > > > ---
> > > >  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
> > > >  1 file changed, 15 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> > > > index c3f8b059881e..7ab477db2f75 100644
> > > > --- a/virt/kvm/arm/pmu.c
> > > > +++ b/virt/kvm/arm/pmu.c
> > > > @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu
> > > > *vcpu, u64 val)
> > > > 
> > > >      enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
> > > >      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
> > > > +        bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
> > > > +
> > > 
> > > I'd rather you use kvm_pmu_pmc_is_chained() rather than open-coding
> > > this. But see below:
> > > 
> > > >          if (!(val & BIT(i)))
> > > >              continue;
> > > >          type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
> > > > @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct
> > > > kvm_vcpu
> > > > *vcpu, u64 val)
> > > >              reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
> > > >              reg = lower_32_bits(reg);
> > > >              __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
> > > > -            if (!reg)
> > > > +            if (reg) /* no overflow */
> > > > +                continue;
> > > > +            if (chained) {
> > > > +                reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i +
> > > > 1) + 1;
> > > > +                reg = lower_32_bits(reg);
> > > > +                __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
> > > > +                if (reg)
> > > > +                    continue;
> > > > +                /* mark an overflow on high counter */
> > > > +                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
> > > > +            } else {
> > > > +                /* mark an overflow */
> > > >                  __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
> > > > +            }
> > > >          }
> > > >      }
> > > >  }
> > > 
> > > I think the whole function is a bit of a mess, and could be better
> > > structured to treat 64bit counters as a first class citizen.
> > > 
> > > I'm suggesting something along those lines, which tries to
> > > streamline things a bit and keep the flow uniform between the
> > > two word sizes. IMHO, it helps reasonning about it and gives
> > > scope to the ARMv8.5 full 64bit counters... It is of course
> > > completely untested.
> > 
> > Looks OK to me as well. One remark though, don't we need to test if the
> > n+1th reg is enabled before incrementing it?

Indeed - we don't want to indicate an overflow on a disabled counter.


> 
> Hmmm. I'm not sure. I think we should make sure that we don't flag
> a counter as being chained if the odd counter is disabled, rather
> than checking it here. As long as the odd counter is not chained
> *and* enabled, we shouldn't touch it.

Does this mean that we don't care if the low counter is enabled or not
when deciding if the pair is chained?

I would find the code easier to follow if we had an explicit 'is the
high counter enabled here' check (at the point of deciding where to
put the overflow).


> 
> Again, untested:
> 
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> index cf371f643ade..47366817cd2a 100644
> --- a/virt/kvm/arm/pmu.c
> +++ b/virt/kvm/arm/pmu.c
> @@ -15,6 +15,7 @@
>  #include <kvm/arm_vgic.h>
> 
>  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64
> select_idx);
> +static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64
> select_idx);
> 
>  #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
> 
> @@ -298,6 +299,7 @@ void kvm_pmu_enable_counter_mask(struct kvm_vcpu *vcpu,
> u64 val)
>  		 * For high counters of chained events we must recreate the
>  		 * perf event with the long (64bit) attribute set.
>  		 */
> +		kvm_pmu_update_pmc_chained(vcpu, i);
>  		if (kvm_pmu_pmc_is_chained(pmc) &&
>  		    kvm_pmu_idx_is_high_counter(i)) {
>  			kvm_pmu_create_perf_event(vcpu, i);
> @@ -645,7 +647,8 @@ static void kvm_pmu_update_pmc_chained(struct kvm_vcpu
> *vcpu, u64 select_idx)
>  	struct kvm_pmu *pmu = &vcpu->arch.pmu;
>  	struct kvm_pmc *pmc = &pmu->pmc[select_idx];
> 
> -	if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
> +	if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
> +	    kvm_pmu_counter_is_enabled(vcpu, pmc->idx)) {

I.e. here we don't care what the state of enablement is for the low counter.

Also at present, this may break the following use-case

 - User creates and uses a pair of chained counters
 - User disables odd/high counter
 - User reads values of both counters
 - User rewrites CHAIN event to odd/high counter OR user re-enables just the even/low counter
 - User reads value of both counters <- this may now different to the last read

Thanks,

Andrew Murray

>  		/*
>  		 * During promotion from !chained to chained we must ensure
>  		 * the adjacent counter is stopped and its event destroyed
> 
> What do you think?
> 
>         M.
> -- 
> Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
@ 2019-12-06 15:21           ` Andrew Murray
  0 siblings, 0 replies; 38+ messages in thread
From: Andrew Murray @ 2019-12-06 15:21 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-kernel, kvmarm, eric.auger.pro

On Thu, Dec 05, 2019 at 02:52:26PM +0000, Marc Zyngier wrote:
> On 2019-12-05 14:06, Auger Eric wrote:
> > Hi Marc,
> > 
> > On 12/5/19 10:43 AM, Marc Zyngier wrote:
> > > Hi Eric,
> > > 
> > > On 2019-12-04 20:44, Eric Auger wrote:
> > > > At the moment a SW_INCR counter always overflows on 32-bit
> > > > boundary, independently on whether the n+1th counter is
> > > > programmed as CHAIN.
> > > > 
> > > > Check whether the SW_INCR counter is a 64b counter and if so,
> > > > implement the 64b logic.
> > > > 
> > > > Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU
> > > > counters")
> > > > Signed-off-by: Eric Auger <eric.auger@redhat.com>
> > > > ---
> > > >  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
> > > >  1 file changed, 15 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> > > > index c3f8b059881e..7ab477db2f75 100644
> > > > --- a/virt/kvm/arm/pmu.c
> > > > +++ b/virt/kvm/arm/pmu.c
> > > > @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu
> > > > *vcpu, u64 val)
> > > > 
> > > >      enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
> > > >      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
> > > > +        bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
> > > > +
> > > 
> > > I'd rather you use kvm_pmu_pmc_is_chained() rather than open-coding
> > > this. But see below:
> > > 
> > > >          if (!(val & BIT(i)))
> > > >              continue;
> > > >          type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
> > > > @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct
> > > > kvm_vcpu
> > > > *vcpu, u64 val)
> > > >              reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
> > > >              reg = lower_32_bits(reg);
> > > >              __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
> > > > -            if (!reg)
> > > > +            if (reg) /* no overflow */
> > > > +                continue;
> > > > +            if (chained) {
> > > > +                reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i +
> > > > 1) + 1;
> > > > +                reg = lower_32_bits(reg);
> > > > +                __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
> > > > +                if (reg)
> > > > +                    continue;
> > > > +                /* mark an overflow on high counter */
> > > > +                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
> > > > +            } else {
> > > > +                /* mark an overflow */
> > > >                  __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
> > > > +            }
> > > >          }
> > > >      }
> > > >  }
> > > 
> > > I think the whole function is a bit of a mess, and could be better
> > > structured to treat 64bit counters as a first class citizen.
> > > 
> > > I'm suggesting something along those lines, which tries to
> > > streamline things a bit and keep the flow uniform between the
> > > two word sizes. IMHO, it helps reasonning about it and gives
> > > scope to the ARMv8.5 full 64bit counters... It is of course
> > > completely untested.
> > 
> > Looks OK to me as well. One remark though, don't we need to test if the
> > n+1th reg is enabled before incrementing it?

Indeed - we don't want to indicate an overflow on a disabled counter.


> 
> Hmmm. I'm not sure. I think we should make sure that we don't flag
> a counter as being chained if the odd counter is disabled, rather
> than checking it here. As long as the odd counter is not chained
> *and* enabled, we shouldn't touch it.

Does this mean that we don't care if the low counter is enabled or not
when deciding if the pair is chained?

I would find the code easier to follow if we had an explicit 'is the
high counter enabled here' check (at the point of deciding where to
put the overflow).


> 
> Again, untested:
> 
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> index cf371f643ade..47366817cd2a 100644
> --- a/virt/kvm/arm/pmu.c
> +++ b/virt/kvm/arm/pmu.c
> @@ -15,6 +15,7 @@
>  #include <kvm/arm_vgic.h>
> 
>  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64
> select_idx);
> +static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64
> select_idx);
> 
>  #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
> 
> @@ -298,6 +299,7 @@ void kvm_pmu_enable_counter_mask(struct kvm_vcpu *vcpu,
> u64 val)
>  		 * For high counters of chained events we must recreate the
>  		 * perf event with the long (64bit) attribute set.
>  		 */
> +		kvm_pmu_update_pmc_chained(vcpu, i);
>  		if (kvm_pmu_pmc_is_chained(pmc) &&
>  		    kvm_pmu_idx_is_high_counter(i)) {
>  			kvm_pmu_create_perf_event(vcpu, i);
> @@ -645,7 +647,8 @@ static void kvm_pmu_update_pmc_chained(struct kvm_vcpu
> *vcpu, u64 select_idx)
>  	struct kvm_pmu *pmu = &vcpu->arch.pmu;
>  	struct kvm_pmc *pmc = &pmu->pmc[select_idx];
> 
> -	if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
> +	if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
> +	    kvm_pmu_counter_is_enabled(vcpu, pmc->idx)) {

I.e. here we don't care what the state of enablement is for the low counter.

Also at present, this may break the following use-case

 - User creates and uses a pair of chained counters
 - User disables odd/high counter
 - User reads values of both counters
 - User rewrites CHAIN event to odd/high counter OR user re-enables just the even/low counter
 - User reads value of both counters <- this may now different to the last read

Thanks,

Andrew Murray

>  		/*
>  		 * During promotion from !chained to chained we must ensure
>  		 * the adjacent counter is stopped and its event destroyed
> 
> What do you think?
> 
>         M.
> -- 
> Jazz is not dead. It just smells funny...
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
  2019-12-06 15:21           ` Andrew Murray
@ 2019-12-06 15:35             ` Marc Zyngier
  -1 siblings, 0 replies; 38+ messages in thread
From: Marc Zyngier @ 2019-12-06 15:35 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Auger Eric, eric.auger.pro, linux-kernel, kvmarm, james.morse,
	suzuki.poulose, drjones

On 2019-12-06 15:21, Andrew Murray wrote:
> On Thu, Dec 05, 2019 at 02:52:26PM +0000, Marc Zyngier wrote:
>> On 2019-12-05 14:06, Auger Eric wrote:
>> > Hi Marc,
>> >
>> > On 12/5/19 10:43 AM, Marc Zyngier wrote:
>> > > Hi Eric,
>> > >
>> > > On 2019-12-04 20:44, Eric Auger wrote:
>> > > > At the moment a SW_INCR counter always overflows on 32-bit
>> > > > boundary, independently on whether the n+1th counter is
>> > > > programmed as CHAIN.
>> > > >
>> > > > Check whether the SW_INCR counter is a 64b counter and if so,
>> > > > implement the 64b logic.
>> > > >
>> > > > Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU
>> > > > counters")
>> > > > Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> > > > ---
>> > > >  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
>> > > >  1 file changed, 15 insertions(+), 1 deletion(-)
>> > > >
>> > > > diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>> > > > index c3f8b059881e..7ab477db2f75 100644
>> > > > --- a/virt/kvm/arm/pmu.c
>> > > > +++ b/virt/kvm/arm/pmu.c
>> > > > @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct 
>> kvm_vcpu
>> > > > *vcpu, u64 val)
>> > > >
>> > > >      enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>> > > >      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
>> > > > +        bool chained = test_bit(i >> 1, 
>> vcpu->arch.pmu.chained);
>> > > > +
>> > >
>> > > I'd rather you use kvm_pmu_pmc_is_chained() rather than 
>> open-coding
>> > > this. But see below:
>> > >
>> > > >          if (!(val & BIT(i)))
>> > > >              continue;
>> > > >          type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
>> > > > @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct
>> > > > kvm_vcpu
>> > > > *vcpu, u64 val)
>> > > >              reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 
>> 1;
>> > > >              reg = lower_32_bits(reg);
>> > > >              __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
>> > > > -            if (!reg)
>> > > > +            if (reg) /* no overflow */
>> > > > +                continue;
>> > > > +            if (chained) {
>> > > > +                reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i 
>> +
>> > > > 1) + 1;
>> > > > +                reg = lower_32_bits(reg);
>> > > > +                __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = 
>> reg;
>> > > > +                if (reg)
>> > > > +                    continue;
>> > > > +                /* mark an overflow on high counter */
>> > > > +                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 
>> 1);
>> > > > +            } else {
>> > > > +                /* mark an overflow */
>> > > >                  __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
>> > > > +            }
>> > > >          }
>> > > >      }
>> > > >  }
>> > >
>> > > I think the whole function is a bit of a mess, and could be 
>> better
>> > > structured to treat 64bit counters as a first class citizen.
>> > >
>> > > I'm suggesting something along those lines, which tries to
>> > > streamline things a bit and keep the flow uniform between the
>> > > two word sizes. IMHO, it helps reasonning about it and gives
>> > > scope to the ARMv8.5 full 64bit counters... It is of course
>> > > completely untested.
>> >
>> > Looks OK to me as well. One remark though, don't we need to test 
>> if the
>> > n+1th reg is enabled before incrementing it?
>
> Indeed - we don't want to indicate an overflow on a disabled counter.
>
>
>>
>> Hmmm. I'm not sure. I think we should make sure that we don't flag
>> a counter as being chained if the odd counter is disabled, rather
>> than checking it here. As long as the odd counter is not chained
>> *and* enabled, we shouldn't touch it.
>
> Does this mean that we don't care if the low counter is enabled or 
> not
> when deciding if the pair is chained?
>
> I would find the code easier to follow if we had an explicit 'is the
> high counter enabled here' check (at the point of deciding where to
> put the overflow).

Sure. But the point is that we're spreading that kind of checks all 
over
the map, and that we don't have a way to even reason about the state of
a 64bit counter. Doesn't it strike you as being mildly broken?

>
>
>>
>> Again, untested:
>>
>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>> index cf371f643ade..47366817cd2a 100644
>> --- a/virt/kvm/arm/pmu.c
>> +++ b/virt/kvm/arm/pmu.c
>> @@ -15,6 +15,7 @@
>>  #include <kvm/arm_vgic.h>
>>
>>  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64
>> select_idx);
>> +static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64
>> select_idx);
>>
>>  #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
>>
>> @@ -298,6 +299,7 @@ void kvm_pmu_enable_counter_mask(struct kvm_vcpu 
>> *vcpu,
>> u64 val)
>>  		 * For high counters of chained events we must recreate the
>>  		 * perf event with the long (64bit) attribute set.
>>  		 */
>> +		kvm_pmu_update_pmc_chained(vcpu, i);
>>  		if (kvm_pmu_pmc_is_chained(pmc) &&
>>  		    kvm_pmu_idx_is_high_counter(i)) {
>>  			kvm_pmu_create_perf_event(vcpu, i);
>> @@ -645,7 +647,8 @@ static void kvm_pmu_update_pmc_chained(struct 
>> kvm_vcpu
>> *vcpu, u64 select_idx)
>>  	struct kvm_pmu *pmu = &vcpu->arch.pmu;
>>  	struct kvm_pmc *pmc = &pmu->pmc[select_idx];
>>
>> -	if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
>> +	if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
>> +	    kvm_pmu_counter_is_enabled(vcpu, pmc->idx)) {
>
> I.e. here we don't care what the state of enablement is for the low 
> counter.
>
> Also at present, this may break the following use-case
>
>  - User creates and uses a pair of chained counters
>  - User disables odd/high counter
>  - User reads values of both counters
>  - User rewrites CHAIN event to odd/high counter OR user re-enables
> just the even/low counter
>  - User reads value of both counters <- this may now different to the
> last read

Hey, I didn't say it was perfect ;-). But for sure we can't let the
PMU bitrot more than it already has, and I'm not sure this is heading
the right way.

I'm certainly going to push back on new PMU features until we can 
properly
reason about 64bit counters as a top-level entity (as opposed to a 
bunch
of discrete counters).

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
@ 2019-12-06 15:35             ` Marc Zyngier
  0 siblings, 0 replies; 38+ messages in thread
From: Marc Zyngier @ 2019-12-06 15:35 UTC (permalink / raw)
  To: Andrew Murray; +Cc: linux-kernel, kvmarm, eric.auger.pro

On 2019-12-06 15:21, Andrew Murray wrote:
> On Thu, Dec 05, 2019 at 02:52:26PM +0000, Marc Zyngier wrote:
>> On 2019-12-05 14:06, Auger Eric wrote:
>> > Hi Marc,
>> >
>> > On 12/5/19 10:43 AM, Marc Zyngier wrote:
>> > > Hi Eric,
>> > >
>> > > On 2019-12-04 20:44, Eric Auger wrote:
>> > > > At the moment a SW_INCR counter always overflows on 32-bit
>> > > > boundary, independently on whether the n+1th counter is
>> > > > programmed as CHAIN.
>> > > >
>> > > > Check whether the SW_INCR counter is a 64b counter and if so,
>> > > > implement the 64b logic.
>> > > >
>> > > > Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU
>> > > > counters")
>> > > > Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> > > > ---
>> > > >  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
>> > > >  1 file changed, 15 insertions(+), 1 deletion(-)
>> > > >
>> > > > diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>> > > > index c3f8b059881e..7ab477db2f75 100644
>> > > > --- a/virt/kvm/arm/pmu.c
>> > > > +++ b/virt/kvm/arm/pmu.c
>> > > > @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct 
>> kvm_vcpu
>> > > > *vcpu, u64 val)
>> > > >
>> > > >      enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>> > > >      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
>> > > > +        bool chained = test_bit(i >> 1, 
>> vcpu->arch.pmu.chained);
>> > > > +
>> > >
>> > > I'd rather you use kvm_pmu_pmc_is_chained() rather than 
>> open-coding
>> > > this. But see below:
>> > >
>> > > >          if (!(val & BIT(i)))
>> > > >              continue;
>> > > >          type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
>> > > > @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct
>> > > > kvm_vcpu
>> > > > *vcpu, u64 val)
>> > > >              reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 
>> 1;
>> > > >              reg = lower_32_bits(reg);
>> > > >              __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
>> > > > -            if (!reg)
>> > > > +            if (reg) /* no overflow */
>> > > > +                continue;
>> > > > +            if (chained) {
>> > > > +                reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i 
>> +
>> > > > 1) + 1;
>> > > > +                reg = lower_32_bits(reg);
>> > > > +                __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = 
>> reg;
>> > > > +                if (reg)
>> > > > +                    continue;
>> > > > +                /* mark an overflow on high counter */
>> > > > +                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 
>> 1);
>> > > > +            } else {
>> > > > +                /* mark an overflow */
>> > > >                  __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
>> > > > +            }
>> > > >          }
>> > > >      }
>> > > >  }
>> > >
>> > > I think the whole function is a bit of a mess, and could be 
>> better
>> > > structured to treat 64bit counters as a first class citizen.
>> > >
>> > > I'm suggesting something along those lines, which tries to
>> > > streamline things a bit and keep the flow uniform between the
>> > > two word sizes. IMHO, it helps reasonning about it and gives
>> > > scope to the ARMv8.5 full 64bit counters... It is of course
>> > > completely untested.
>> >
>> > Looks OK to me as well. One remark though, don't we need to test 
>> if the
>> > n+1th reg is enabled before incrementing it?
>
> Indeed - we don't want to indicate an overflow on a disabled counter.
>
>
>>
>> Hmmm. I'm not sure. I think we should make sure that we don't flag
>> a counter as being chained if the odd counter is disabled, rather
>> than checking it here. As long as the odd counter is not chained
>> *and* enabled, we shouldn't touch it.
>
> Does this mean that we don't care if the low counter is enabled or 
> not
> when deciding if the pair is chained?
>
> I would find the code easier to follow if we had an explicit 'is the
> high counter enabled here' check (at the point of deciding where to
> put the overflow).

Sure. But the point is that we're spreading that kind of checks all 
over
the map, and that we don't have a way to even reason about the state of
a 64bit counter. Doesn't it strike you as being mildly broken?

>
>
>>
>> Again, untested:
>>
>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>> index cf371f643ade..47366817cd2a 100644
>> --- a/virt/kvm/arm/pmu.c
>> +++ b/virt/kvm/arm/pmu.c
>> @@ -15,6 +15,7 @@
>>  #include <kvm/arm_vgic.h>
>>
>>  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64
>> select_idx);
>> +static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64
>> select_idx);
>>
>>  #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
>>
>> @@ -298,6 +299,7 @@ void kvm_pmu_enable_counter_mask(struct kvm_vcpu 
>> *vcpu,
>> u64 val)
>>  		 * For high counters of chained events we must recreate the
>>  		 * perf event with the long (64bit) attribute set.
>>  		 */
>> +		kvm_pmu_update_pmc_chained(vcpu, i);
>>  		if (kvm_pmu_pmc_is_chained(pmc) &&
>>  		    kvm_pmu_idx_is_high_counter(i)) {
>>  			kvm_pmu_create_perf_event(vcpu, i);
>> @@ -645,7 +647,8 @@ static void kvm_pmu_update_pmc_chained(struct 
>> kvm_vcpu
>> *vcpu, u64 select_idx)
>>  	struct kvm_pmu *pmu = &vcpu->arch.pmu;
>>  	struct kvm_pmc *pmc = &pmu->pmc[select_idx];
>>
>> -	if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
>> +	if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
>> +	    kvm_pmu_counter_is_enabled(vcpu, pmc->idx)) {
>
> I.e. here we don't care what the state of enablement is for the low 
> counter.
>
> Also at present, this may break the following use-case
>
>  - User creates and uses a pair of chained counters
>  - User disables odd/high counter
>  - User reads values of both counters
>  - User rewrites CHAIN event to odd/high counter OR user re-enables
> just the even/low counter
>  - User reads value of both counters <- this may now different to the
> last read

Hey, I didn't say it was perfect ;-). But for sure we can't let the
PMU bitrot more than it already has, and I'm not sure this is heading
the right way.

I'm certainly going to push back on new PMU features until we can 
properly
reason about 64bit counters as a top-level entity (as opposed to a 
bunch
of discrete counters).

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
  2019-12-05 19:01           ` Auger Eric
@ 2019-12-06 15:48             ` Andrew Murray
  -1 siblings, 0 replies; 38+ messages in thread
From: Andrew Murray @ 2019-12-06 15:48 UTC (permalink / raw)
  To: Auger Eric
  Cc: Marc Zyngier, eric.auger.pro, linux-kernel, kvmarm, james.morse,
	suzuki.poulose, drjones

On Thu, Dec 05, 2019 at 08:01:42PM +0100, Auger Eric wrote:
> Hi Marc,
> 
> On 12/5/19 3:52 PM, Marc Zyngier wrote:
> > On 2019-12-05 14:06, Auger Eric wrote:
> >> Hi Marc,
> >>
> >> On 12/5/19 10:43 AM, Marc Zyngier wrote:
> >>> Hi Eric,
> >>>
> >>> On 2019-12-04 20:44, Eric Auger wrote:
> >>>> At the moment a SW_INCR counter always overflows on 32-bit
> >>>> boundary, independently on whether the n+1th counter is
> >>>> programmed as CHAIN.
> >>>>
> >>>> Check whether the SW_INCR counter is a 64b counter and if so,
> >>>> implement the 64b logic.
> >>>>
> >>>> Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters")
> >>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >>>> ---
> >>>>  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
> >>>>  1 file changed, 15 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> >>>> index c3f8b059881e..7ab477db2f75 100644
> >>>> --- a/virt/kvm/arm/pmu.c
> >>>> +++ b/virt/kvm/arm/pmu.c
> >>>> @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu
> >>>> *vcpu, u64 val)
> >>>>
> >>>>      enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
> >>>>      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
> >>>> +        bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
> >>>> +
> >>>
> >>> I'd rather you use kvm_pmu_pmc_is_chained() rather than open-coding
> >>> this. But see below:
> >>>
> >>>>          if (!(val & BIT(i)))
> >>>>              continue;
> >>>>          type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
> >>>> @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct kvm_vcpu
> >>>> *vcpu, u64 val)
> >>>>              reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
> >>>>              reg = lower_32_bits(reg);
> >>>>              __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
> >>>> -            if (!reg)
> >>>> +            if (reg) /* no overflow */
> >>>> +                continue;
> >>>> +            if (chained) {
> >>>> +                reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) + 1;
> >>>> +                reg = lower_32_bits(reg);
> >>>> +                __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
> >>>> +                if (reg)
> >>>> +                    continue;
> >>>> +                /* mark an overflow on high counter */
> >>>> +                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
> >>>> +            } else {
> >>>> +                /* mark an overflow */
> >>>>                  __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
> >>>> +            }
> >>>>          }
> >>>>      }
> >>>>  }
> >>>
> >>> I think the whole function is a bit of a mess, and could be better
> >>> structured to treat 64bit counters as a first class citizen.
> >>>
> >>> I'm suggesting something along those lines, which tries to
> >>> streamline things a bit and keep the flow uniform between the
> >>> two word sizes. IMHO, it helps reasonning about it and gives
> >>> scope to the ARMv8.5 full 64bit counters... It is of course
> >>> completely untested.
> >>
> >> Looks OK to me as well. One remark though, don't we need to test if the
> >> n+1th reg is enabled before incrementing it?
> > 
> > Hmmm. I'm not sure. I think we should make sure that we don't flag
> > a counter as being chained if the odd counter is disabled, rather
> > than checking it here. As long as the odd counter is not chained
> > *and* enabled, we shouldn't touch it.>
> > Again, untested:
> > 
> > diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> > index cf371f643ade..47366817cd2a 100644
> > --- a/virt/kvm/arm/pmu.c
> > +++ b/virt/kvm/arm/pmu.c
> > @@ -15,6 +15,7 @@
> >  #include <kvm/arm_vgic.h>
> > 
> >  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64
> > select_idx);
> > +static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64
> > select_idx);
> > 
> >  #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
> > 
> > @@ -298,6 +299,7 @@ void kvm_pmu_enable_counter_mask(struct kvm_vcpu
> > *vcpu, u64 val)
> >           * For high counters of chained events we must recreate the
> >           * perf event with the long (64bit) attribute set.
> >           */
> > +        kvm_pmu_update_pmc_chained(vcpu, i);
> >          if (kvm_pmu_pmc_is_chained(pmc) &&
> >              kvm_pmu_idx_is_high_counter(i)) {
> >              kvm_pmu_create_perf_event(vcpu, i);
> > @@ -645,7 +647,8 @@ static void kvm_pmu_update_pmc_chained(struct
> > kvm_vcpu *vcpu, u64 select_idx)
> >      struct kvm_pmu *pmu = &vcpu->arch.pmu;
> >      struct kvm_pmc *pmc = &pmu->pmc[select_idx];
> > 
> > -    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
> > +    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
> > +        kvm_pmu_counter_is_enabled(vcpu, pmc->idx)) {
> 
> In create_perf_event(), has_chain_evtype() is used and a 64b sample
> period would be chosen even if the counters are disjoined (since the odd
> is disabled). We would need to use pmc_is_chained() instead.

So in this use-case, someone has configured a pair of chained counters
but only enabled the lower half. In this case we only create a 32bit backing
event (no PERF_ATTR_CFG1_KVM_PMU_CHAINED flag) - I guess this means the
perf event will trigger on 64bit period(?) despite the high counter being
disabled. The guest will see an interrupt in their disabled high counter.

This is a know limitation - see the comment "For
chained counters we only support overflow interrupts on the high counter"

Though upon looking at this it seems a little more broken. I guess where
both counters are enabled we want to overflow at 64bits and raise the
overflow to the high counter. When the high counter is disabled we want to
overflow on 32bits and raise the overflow to the low counter.

Perhaps something like the following would help:

--- a/virt/kvm/arm/pmu.c
+++ b/virt/kvm/arm/pmu.c
@@ -585,15 +585,16 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
 
        counter = kvm_pmu_get_pair_counter_value(vcpu, pmc);
 
-       if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
+       if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
+           kvm_pmu_counter_is_enabled(vcpu, pmc->idx + 1))
+
                /**
                 * The initial sample period (overflow count) of an event. For
                 * chained counters we only support overflow interrupts on the
                 * high counter.
                 */
                attr.sample_period = (-counter) & GENMASK(63, 0);
-               if (kvm_pmu_counter_is_enabled(vcpu, pmc->idx + 1))
-                       attr.config1 |= PERF_ATTR_CFG1_KVM_PMU_CHAINED;
+               attr.config1 |= PERF_ATTR_CFG1_KVM_PMU_CHAINED;
 
                event = perf_event_create_kernel_counter(&attr, -1, current,
                                                         kvm_pmu_perf_overflow,


It's not clear to me what is supposed to happen with overflow handling on the
low counter on chained counters (where the high counter is disabled).


> 
> With perf_events, the check of whether the odd register is enabled is
> properly done (create_perf_event). Then I understand whenever there is a
> change in enable state or type we delete the previous perf event and
> re-create a new one. Enable state check just is missing for SW_INCR.
> 
> Some other questions:
> - do we need a perf event to be created even if the counter is not
> enabled? For instance on counter resets, create_perf_events get called.

That would suggest we create and destroy them each time the guest enables
and disables the counters - I would expect them to do that a lot (every
context switch) - my assumption would be that the current approach has
less overhead for a running guest.


> - also actions are made for counters which are not implemented. loop
> until ARMV8_PMU_MAX_COUNTERS. Do you think it is valuable to have a
> bitmask of supported counters stored before pmu readiness?
> I can propose such changes if you think they are valuable.

Are they? Many of the calls into this file come from
arch/arm64/kvm/sys_regs.c where we apply a mask (value from
kvm_pmu_valid_counter_mask) to ignore unsupported counters.

Thanks,

Andrew Murray


> 
> Thanks
> 
> Eric
> 
> >          /*
> >           * During promotion from !chained to chained we must ensure
> >           * the adjacent counter is stopped and its event destroyed
> > 
> > What do you think?
> > 
> >         M.
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
@ 2019-12-06 15:48             ` Andrew Murray
  0 siblings, 0 replies; 38+ messages in thread
From: Andrew Murray @ 2019-12-06 15:48 UTC (permalink / raw)
  To: Auger Eric; +Cc: Marc Zyngier, linux-kernel, kvmarm, eric.auger.pro

On Thu, Dec 05, 2019 at 08:01:42PM +0100, Auger Eric wrote:
> Hi Marc,
> 
> On 12/5/19 3:52 PM, Marc Zyngier wrote:
> > On 2019-12-05 14:06, Auger Eric wrote:
> >> Hi Marc,
> >>
> >> On 12/5/19 10:43 AM, Marc Zyngier wrote:
> >>> Hi Eric,
> >>>
> >>> On 2019-12-04 20:44, Eric Auger wrote:
> >>>> At the moment a SW_INCR counter always overflows on 32-bit
> >>>> boundary, independently on whether the n+1th counter is
> >>>> programmed as CHAIN.
> >>>>
> >>>> Check whether the SW_INCR counter is a 64b counter and if so,
> >>>> implement the 64b logic.
> >>>>
> >>>> Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters")
> >>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >>>> ---
> >>>>  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
> >>>>  1 file changed, 15 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> >>>> index c3f8b059881e..7ab477db2f75 100644
> >>>> --- a/virt/kvm/arm/pmu.c
> >>>> +++ b/virt/kvm/arm/pmu.c
> >>>> @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu
> >>>> *vcpu, u64 val)
> >>>>
> >>>>      enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
> >>>>      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
> >>>> +        bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
> >>>> +
> >>>
> >>> I'd rather you use kvm_pmu_pmc_is_chained() rather than open-coding
> >>> this. But see below:
> >>>
> >>>>          if (!(val & BIT(i)))
> >>>>              continue;
> >>>>          type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
> >>>> @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct kvm_vcpu
> >>>> *vcpu, u64 val)
> >>>>              reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
> >>>>              reg = lower_32_bits(reg);
> >>>>              __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
> >>>> -            if (!reg)
> >>>> +            if (reg) /* no overflow */
> >>>> +                continue;
> >>>> +            if (chained) {
> >>>> +                reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) + 1;
> >>>> +                reg = lower_32_bits(reg);
> >>>> +                __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
> >>>> +                if (reg)
> >>>> +                    continue;
> >>>> +                /* mark an overflow on high counter */
> >>>> +                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
> >>>> +            } else {
> >>>> +                /* mark an overflow */
> >>>>                  __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
> >>>> +            }
> >>>>          }
> >>>>      }
> >>>>  }
> >>>
> >>> I think the whole function is a bit of a mess, and could be better
> >>> structured to treat 64bit counters as a first class citizen.
> >>>
> >>> I'm suggesting something along those lines, which tries to
> >>> streamline things a bit and keep the flow uniform between the
> >>> two word sizes. IMHO, it helps reasonning about it and gives
> >>> scope to the ARMv8.5 full 64bit counters... It is of course
> >>> completely untested.
> >>
> >> Looks OK to me as well. One remark though, don't we need to test if the
> >> n+1th reg is enabled before incrementing it?
> > 
> > Hmmm. I'm not sure. I think we should make sure that we don't flag
> > a counter as being chained if the odd counter is disabled, rather
> > than checking it here. As long as the odd counter is not chained
> > *and* enabled, we shouldn't touch it.>
> > Again, untested:
> > 
> > diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> > index cf371f643ade..47366817cd2a 100644
> > --- a/virt/kvm/arm/pmu.c
> > +++ b/virt/kvm/arm/pmu.c
> > @@ -15,6 +15,7 @@
> >  #include <kvm/arm_vgic.h>
> > 
> >  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64
> > select_idx);
> > +static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64
> > select_idx);
> > 
> >  #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
> > 
> > @@ -298,6 +299,7 @@ void kvm_pmu_enable_counter_mask(struct kvm_vcpu
> > *vcpu, u64 val)
> >           * For high counters of chained events we must recreate the
> >           * perf event with the long (64bit) attribute set.
> >           */
> > +        kvm_pmu_update_pmc_chained(vcpu, i);
> >          if (kvm_pmu_pmc_is_chained(pmc) &&
> >              kvm_pmu_idx_is_high_counter(i)) {
> >              kvm_pmu_create_perf_event(vcpu, i);
> > @@ -645,7 +647,8 @@ static void kvm_pmu_update_pmc_chained(struct
> > kvm_vcpu *vcpu, u64 select_idx)
> >      struct kvm_pmu *pmu = &vcpu->arch.pmu;
> >      struct kvm_pmc *pmc = &pmu->pmc[select_idx];
> > 
> > -    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
> > +    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
> > +        kvm_pmu_counter_is_enabled(vcpu, pmc->idx)) {
> 
> In create_perf_event(), has_chain_evtype() is used and a 64b sample
> period would be chosen even if the counters are disjoined (since the odd
> is disabled). We would need to use pmc_is_chained() instead.

So in this use-case, someone has configured a pair of chained counters
but only enabled the lower half. In this case we only create a 32bit backing
event (no PERF_ATTR_CFG1_KVM_PMU_CHAINED flag) - I guess this means the
perf event will trigger on 64bit period(?) despite the high counter being
disabled. The guest will see an interrupt in their disabled high counter.

This is a know limitation - see the comment "For
chained counters we only support overflow interrupts on the high counter"

Though upon looking at this it seems a little more broken. I guess where
both counters are enabled we want to overflow at 64bits and raise the
overflow to the high counter. When the high counter is disabled we want to
overflow on 32bits and raise the overflow to the low counter.

Perhaps something like the following would help:

--- a/virt/kvm/arm/pmu.c
+++ b/virt/kvm/arm/pmu.c
@@ -585,15 +585,16 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
 
        counter = kvm_pmu_get_pair_counter_value(vcpu, pmc);
 
-       if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
+       if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
+           kvm_pmu_counter_is_enabled(vcpu, pmc->idx + 1))
+
                /**
                 * The initial sample period (overflow count) of an event. For
                 * chained counters we only support overflow interrupts on the
                 * high counter.
                 */
                attr.sample_period = (-counter) & GENMASK(63, 0);
-               if (kvm_pmu_counter_is_enabled(vcpu, pmc->idx + 1))
-                       attr.config1 |= PERF_ATTR_CFG1_KVM_PMU_CHAINED;
+               attr.config1 |= PERF_ATTR_CFG1_KVM_PMU_CHAINED;
 
                event = perf_event_create_kernel_counter(&attr, -1, current,
                                                         kvm_pmu_perf_overflow,


It's not clear to me what is supposed to happen with overflow handling on the
low counter on chained counters (where the high counter is disabled).


> 
> With perf_events, the check of whether the odd register is enabled is
> properly done (create_perf_event). Then I understand whenever there is a
> change in enable state or type we delete the previous perf event and
> re-create a new one. Enable state check just is missing for SW_INCR.
> 
> Some other questions:
> - do we need a perf event to be created even if the counter is not
> enabled? For instance on counter resets, create_perf_events get called.

That would suggest we create and destroy them each time the guest enables
and disables the counters - I would expect them to do that a lot (every
context switch) - my assumption would be that the current approach has
less overhead for a running guest.


> - also actions are made for counters which are not implemented. loop
> until ARMV8_PMU_MAX_COUNTERS. Do you think it is valuable to have a
> bitmask of supported counters stored before pmu readiness?
> I can propose such changes if you think they are valuable.

Are they? Many of the calls into this file come from
arch/arm64/kvm/sys_regs.c where we apply a mask (value from
kvm_pmu_valid_counter_mask) to ignore unsupported counters.

Thanks,

Andrew Murray


> 
> Thanks
> 
> Eric
> 
> >          /*
> >           * During promotion from !chained to chained we must ensure
> >           * the adjacent counter is stopped and its event destroyed
> > 
> > What do you think?
> > 
> >         M.
> 
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
  2019-12-06 15:35             ` Marc Zyngier
@ 2019-12-06 16:02               ` Andrew Murray
  -1 siblings, 0 replies; 38+ messages in thread
From: Andrew Murray @ 2019-12-06 16:02 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Auger Eric, eric.auger.pro, linux-kernel, kvmarm, james.morse,
	suzuki.poulose, drjones

On Fri, Dec 06, 2019 at 03:35:03PM +0000, Marc Zyngier wrote:
> On 2019-12-06 15:21, Andrew Murray wrote:
> > On Thu, Dec 05, 2019 at 02:52:26PM +0000, Marc Zyngier wrote:
> > > On 2019-12-05 14:06, Auger Eric wrote:
> > > > Hi Marc,


> > > > >
> > > > > I think the whole function is a bit of a mess, and could be
> > > better
> > > > > structured to treat 64bit counters as a first class citizen.
> > > > >
> > > > > I'm suggesting something along those lines, which tries to
> > > > > streamline things a bit and keep the flow uniform between the
> > > > > two word sizes. IMHO, it helps reasonning about it and gives
> > > > > scope to the ARMv8.5 full 64bit counters... It is of course
> > > > > completely untested.
> > > >
> > > > Looks OK to me as well. One remark though, don't we need to test
> > > if the
> > > > n+1th reg is enabled before incrementing it?
> > 
> > Indeed - we don't want to indicate an overflow on a disabled counter.
> > 
> > 
> > > 
> > > Hmmm. I'm not sure. I think we should make sure that we don't flag
> > > a counter as being chained if the odd counter is disabled, rather
> > > than checking it here. As long as the odd counter is not chained
> > > *and* enabled, we shouldn't touch it.
> > 
> > Does this mean that we don't care if the low counter is enabled or not
> > when deciding if the pair is chained?
> > 
> > I would find the code easier to follow if we had an explicit 'is the
> > high counter enabled here' check (at the point of deciding where to
> > put the overflow).
> 
> Sure. But the point is that we're spreading that kind of checks all over
> the map, and that we don't have a way to even reason about the state of
> a 64bit counter. Doesn't it strike you as being mildly broken?
> 

Yup! To the point where I can't trust the function names and have to look
at what the code does...


> > > @@ -645,7 +647,8 @@ static void kvm_pmu_update_pmc_chained(struct
> > > kvm_vcpu
> > > *vcpu, u64 select_idx)
> > >  	struct kvm_pmu *pmu = &vcpu->arch.pmu;
> > >  	struct kvm_pmc *pmc = &pmu->pmc[select_idx];
> > > 
> > > -	if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
> > > +	if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
> > > +	    kvm_pmu_counter_is_enabled(vcpu, pmc->idx)) {
> > 
> > I.e. here we don't care what the state of enablement is for the low
> > counter.
> > 
> > Also at present, this may break the following use-case
> > 
> >  - User creates and uses a pair of chained counters
> >  - User disables odd/high counter
> >  - User reads values of both counters
> >  - User rewrites CHAIN event to odd/high counter OR user re-enables
> > just the even/low counter
> >  - User reads value of both counters <- this may now different to the
> > last read
> 
> Hey, I didn't say it was perfect ;-). But for sure we can't let the
> PMU bitrot more than it already has, and I'm not sure this is heading
> the right way.

I think we're aligned here. To me this code is becoming very fragile, difficult
for me to make sense of and is stretching the abstractions we've made. This is
why it is difficult to enhance it without breaking something. It's why I felt it
was safer to add 'an extra check' in the SWINCR than to risk touching something
that I didn't have the confidence I could be sure was correct. 


> 
> I'm certainly going to push back on new PMU features until we can properly
> reason about 64bit counters as a top-level entity (as opposed to a bunch
> of discrete counters).

Thanks,

Andrew Murray

> 
> Thanks,
> 
>         M.
> -- 
> Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
@ 2019-12-06 16:02               ` Andrew Murray
  0 siblings, 0 replies; 38+ messages in thread
From: Andrew Murray @ 2019-12-06 16:02 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-kernel, kvmarm, eric.auger.pro

On Fri, Dec 06, 2019 at 03:35:03PM +0000, Marc Zyngier wrote:
> On 2019-12-06 15:21, Andrew Murray wrote:
> > On Thu, Dec 05, 2019 at 02:52:26PM +0000, Marc Zyngier wrote:
> > > On 2019-12-05 14:06, Auger Eric wrote:
> > > > Hi Marc,


> > > > >
> > > > > I think the whole function is a bit of a mess, and could be
> > > better
> > > > > structured to treat 64bit counters as a first class citizen.
> > > > >
> > > > > I'm suggesting something along those lines, which tries to
> > > > > streamline things a bit and keep the flow uniform between the
> > > > > two word sizes. IMHO, it helps reasonning about it and gives
> > > > > scope to the ARMv8.5 full 64bit counters... It is of course
> > > > > completely untested.
> > > >
> > > > Looks OK to me as well. One remark though, don't we need to test
> > > if the
> > > > n+1th reg is enabled before incrementing it?
> > 
> > Indeed - we don't want to indicate an overflow on a disabled counter.
> > 
> > 
> > > 
> > > Hmmm. I'm not sure. I think we should make sure that we don't flag
> > > a counter as being chained if the odd counter is disabled, rather
> > > than checking it here. As long as the odd counter is not chained
> > > *and* enabled, we shouldn't touch it.
> > 
> > Does this mean that we don't care if the low counter is enabled or not
> > when deciding if the pair is chained?
> > 
> > I would find the code easier to follow if we had an explicit 'is the
> > high counter enabled here' check (at the point of deciding where to
> > put the overflow).
> 
> Sure. But the point is that we're spreading that kind of checks all over
> the map, and that we don't have a way to even reason about the state of
> a 64bit counter. Doesn't it strike you as being mildly broken?
> 

Yup! To the point where I can't trust the function names and have to look
at what the code does...


> > > @@ -645,7 +647,8 @@ static void kvm_pmu_update_pmc_chained(struct
> > > kvm_vcpu
> > > *vcpu, u64 select_idx)
> > >  	struct kvm_pmu *pmu = &vcpu->arch.pmu;
> > >  	struct kvm_pmc *pmc = &pmu->pmc[select_idx];
> > > 
> > > -	if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
> > > +	if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
> > > +	    kvm_pmu_counter_is_enabled(vcpu, pmc->idx)) {
> > 
> > I.e. here we don't care what the state of enablement is for the low
> > counter.
> > 
> > Also at present, this may break the following use-case
> > 
> >  - User creates and uses a pair of chained counters
> >  - User disables odd/high counter
> >  - User reads values of both counters
> >  - User rewrites CHAIN event to odd/high counter OR user re-enables
> > just the even/low counter
> >  - User reads value of both counters <- this may now different to the
> > last read
> 
> Hey, I didn't say it was perfect ;-). But for sure we can't let the
> PMU bitrot more than it already has, and I'm not sure this is heading
> the right way.

I think we're aligned here. To me this code is becoming very fragile, difficult
for me to make sense of and is stretching the abstractions we've made. This is
why it is difficult to enhance it without breaking something. It's why I felt it
was safer to add 'an extra check' in the SWINCR than to risk touching something
that I didn't have the confidence I could be sure was correct. 


> 
> I'm certainly going to push back on new PMU features until we can properly
> reason about 64bit counters as a top-level entity (as opposed to a bunch
> of discrete counters).

Thanks,

Andrew Murray

> 
> Thanks,
> 
>         M.
> -- 
> Jazz is not dead. It just smells funny...
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
  2019-12-05 19:01           ` Auger Eric
@ 2020-01-19 17:58             ` Marc Zyngier
  -1 siblings, 0 replies; 38+ messages in thread
From: Marc Zyngier @ 2020-01-19 17:58 UTC (permalink / raw)
  To: Auger Eric
  Cc: eric.auger.pro, linux-kernel, kvmarm, james.morse, andrew.murray,
	suzuki.poulose, drjones

On Thu, 5 Dec 2019 20:01:42 +0100
Auger Eric <eric.auger@redhat.com> wrote:

Hi Eric,

> Hi Marc,
> 
> On 12/5/19 3:52 PM, Marc Zyngier wrote:
> > On 2019-12-05 14:06, Auger Eric wrote:  
> >> Hi Marc,
> >>
> >> On 12/5/19 10:43 AM, Marc Zyngier wrote:  
> >>> Hi Eric,
> >>>
> >>> On 2019-12-04 20:44, Eric Auger wrote:  
> >>>> At the moment a SW_INCR counter always overflows on 32-bit
> >>>> boundary, independently on whether the n+1th counter is
> >>>> programmed as CHAIN.
> >>>>
> >>>> Check whether the SW_INCR counter is a 64b counter and if so,
> >>>> implement the 64b logic.
> >>>>
> >>>> Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters")
> >>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >>>> ---
> >>>>  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
> >>>>  1 file changed, 15 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> >>>> index c3f8b059881e..7ab477db2f75 100644
> >>>> --- a/virt/kvm/arm/pmu.c
> >>>> +++ b/virt/kvm/arm/pmu.c
> >>>> @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu
> >>>> *vcpu, u64 val)
> >>>>
> >>>>      enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
> >>>>      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
> >>>> +        bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
> >>>> +  
> >>>
> >>> I'd rather you use kvm_pmu_pmc_is_chained() rather than open-coding
> >>> this. But see below:
> >>>  
> >>>>          if (!(val & BIT(i)))
> >>>>              continue;
> >>>>          type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
> >>>> @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct kvm_vcpu
> >>>> *vcpu, u64 val)
> >>>>              reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
> >>>>              reg = lower_32_bits(reg);
> >>>>              __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
> >>>> -            if (!reg)
> >>>> +            if (reg) /* no overflow */
> >>>> +                continue;
> >>>> +            if (chained) {
> >>>> +                reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) + 1;
> >>>> +                reg = lower_32_bits(reg);
> >>>> +                __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
> >>>> +                if (reg)
> >>>> +                    continue;
> >>>> +                /* mark an overflow on high counter */
> >>>> +                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
> >>>> +            } else {
> >>>> +                /* mark an overflow */
> >>>>                  __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
> >>>> +            }
> >>>>          }
> >>>>      }
> >>>>  }  
> >>>
> >>> I think the whole function is a bit of a mess, and could be better
> >>> structured to treat 64bit counters as a first class citizen.
> >>>
> >>> I'm suggesting something along those lines, which tries to
> >>> streamline things a bit and keep the flow uniform between the
> >>> two word sizes. IMHO, it helps reasonning about it and gives
> >>> scope to the ARMv8.5 full 64bit counters... It is of course
> >>> completely untested.  
> >>
> >> Looks OK to me as well. One remark though, don't we need to test if the
> >> n+1th reg is enabled before incrementing it?  
> > 
> > Hmmm. I'm not sure. I think we should make sure that we don't flag
> > a counter as being chained if the odd counter is disabled, rather
> > than checking it here. As long as the odd counter is not chained
> > *and* enabled, we shouldn't touch it.>
> > Again, untested:
> > 
> > diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> > index cf371f643ade..47366817cd2a 100644
> > --- a/virt/kvm/arm/pmu.c
> > +++ b/virt/kvm/arm/pmu.c
> > @@ -15,6 +15,7 @@
> >  #include <kvm/arm_vgic.h>
> > 
> >  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64
> > select_idx);
> > +static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64
> > select_idx);
> > 
> >  #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
> > 
> > @@ -298,6 +299,7 @@ void kvm_pmu_enable_counter_mask(struct kvm_vcpu
> > *vcpu, u64 val)
> >           * For high counters of chained events we must recreate the
> >           * perf event with the long (64bit) attribute set.
> >           */
> > +        kvm_pmu_update_pmc_chained(vcpu, i);
> >          if (kvm_pmu_pmc_is_chained(pmc) &&
> >              kvm_pmu_idx_is_high_counter(i)) {
> >              kvm_pmu_create_perf_event(vcpu, i);
> > @@ -645,7 +647,8 @@ static void kvm_pmu_update_pmc_chained(struct
> > kvm_vcpu *vcpu, u64 select_idx)
> >      struct kvm_pmu *pmu = &vcpu->arch.pmu;
> >      struct kvm_pmc *pmc = &pmu->pmc[select_idx];
> > 
> > -    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
> > +    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
> > +        kvm_pmu_counter_is_enabled(vcpu, pmc->idx)) {  
> 
> In create_perf_event(), has_chain_evtype() is used and a 64b sample
> period would be chosen even if the counters are disjoined (since the odd
> is disabled). We would need to use pmc_is_chained() instead.
> 
> With perf_events, the check of whether the odd register is enabled is
> properly done (create_perf_event). Then I understand whenever there is a
> change in enable state or type we delete the previous perf event and
> re-create a new one. Enable state check just is missing for SW_INCR.

Can you please respin this? I'd like to have it queued quickly, if at
all possible.

> 
> Some other questions:
> - do we need a perf event to be created even if the counter is not
> enabled? For instance on counter resets, create_perf_events get called.

It shouldn't be necessary.

> - also actions are made for counters which are not implemented. loop
> until ARMV8_PMU_MAX_COUNTERS. Do you think it is valuable to have a
> bitmask of supported counters stored before pmu readiness?
> I can propose such changes if you think they are valuable.

That would certainly be a performance optimization.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
@ 2020-01-19 17:58             ` Marc Zyngier
  0 siblings, 0 replies; 38+ messages in thread
From: Marc Zyngier @ 2020-01-19 17:58 UTC (permalink / raw)
  To: Auger Eric; +Cc: linux-kernel, kvmarm, eric.auger.pro

On Thu, 5 Dec 2019 20:01:42 +0100
Auger Eric <eric.auger@redhat.com> wrote:

Hi Eric,

> Hi Marc,
> 
> On 12/5/19 3:52 PM, Marc Zyngier wrote:
> > On 2019-12-05 14:06, Auger Eric wrote:  
> >> Hi Marc,
> >>
> >> On 12/5/19 10:43 AM, Marc Zyngier wrote:  
> >>> Hi Eric,
> >>>
> >>> On 2019-12-04 20:44, Eric Auger wrote:  
> >>>> At the moment a SW_INCR counter always overflows on 32-bit
> >>>> boundary, independently on whether the n+1th counter is
> >>>> programmed as CHAIN.
> >>>>
> >>>> Check whether the SW_INCR counter is a 64b counter and if so,
> >>>> implement the 64b logic.
> >>>>
> >>>> Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters")
> >>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >>>> ---
> >>>>  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
> >>>>  1 file changed, 15 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> >>>> index c3f8b059881e..7ab477db2f75 100644
> >>>> --- a/virt/kvm/arm/pmu.c
> >>>> +++ b/virt/kvm/arm/pmu.c
> >>>> @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu
> >>>> *vcpu, u64 val)
> >>>>
> >>>>      enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
> >>>>      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
> >>>> +        bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
> >>>> +  
> >>>
> >>> I'd rather you use kvm_pmu_pmc_is_chained() rather than open-coding
> >>> this. But see below:
> >>>  
> >>>>          if (!(val & BIT(i)))
> >>>>              continue;
> >>>>          type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
> >>>> @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct kvm_vcpu
> >>>> *vcpu, u64 val)
> >>>>              reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
> >>>>              reg = lower_32_bits(reg);
> >>>>              __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
> >>>> -            if (!reg)
> >>>> +            if (reg) /* no overflow */
> >>>> +                continue;
> >>>> +            if (chained) {
> >>>> +                reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) + 1;
> >>>> +                reg = lower_32_bits(reg);
> >>>> +                __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
> >>>> +                if (reg)
> >>>> +                    continue;
> >>>> +                /* mark an overflow on high counter */
> >>>> +                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
> >>>> +            } else {
> >>>> +                /* mark an overflow */
> >>>>                  __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
> >>>> +            }
> >>>>          }
> >>>>      }
> >>>>  }  
> >>>
> >>> I think the whole function is a bit of a mess, and could be better
> >>> structured to treat 64bit counters as a first class citizen.
> >>>
> >>> I'm suggesting something along those lines, which tries to
> >>> streamline things a bit and keep the flow uniform between the
> >>> two word sizes. IMHO, it helps reasonning about it and gives
> >>> scope to the ARMv8.5 full 64bit counters... It is of course
> >>> completely untested.  
> >>
> >> Looks OK to me as well. One remark though, don't we need to test if the
> >> n+1th reg is enabled before incrementing it?  
> > 
> > Hmmm. I'm not sure. I think we should make sure that we don't flag
> > a counter as being chained if the odd counter is disabled, rather
> > than checking it here. As long as the odd counter is not chained
> > *and* enabled, we shouldn't touch it.>
> > Again, untested:
> > 
> > diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> > index cf371f643ade..47366817cd2a 100644
> > --- a/virt/kvm/arm/pmu.c
> > +++ b/virt/kvm/arm/pmu.c
> > @@ -15,6 +15,7 @@
> >  #include <kvm/arm_vgic.h>
> > 
> >  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64
> > select_idx);
> > +static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64
> > select_idx);
> > 
> >  #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
> > 
> > @@ -298,6 +299,7 @@ void kvm_pmu_enable_counter_mask(struct kvm_vcpu
> > *vcpu, u64 val)
> >           * For high counters of chained events we must recreate the
> >           * perf event with the long (64bit) attribute set.
> >           */
> > +        kvm_pmu_update_pmc_chained(vcpu, i);
> >          if (kvm_pmu_pmc_is_chained(pmc) &&
> >              kvm_pmu_idx_is_high_counter(i)) {
> >              kvm_pmu_create_perf_event(vcpu, i);
> > @@ -645,7 +647,8 @@ static void kvm_pmu_update_pmc_chained(struct
> > kvm_vcpu *vcpu, u64 select_idx)
> >      struct kvm_pmu *pmu = &vcpu->arch.pmu;
> >      struct kvm_pmc *pmc = &pmu->pmc[select_idx];
> > 
> > -    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
> > +    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
> > +        kvm_pmu_counter_is_enabled(vcpu, pmc->idx)) {  
> 
> In create_perf_event(), has_chain_evtype() is used and a 64b sample
> period would be chosen even if the counters are disjoined (since the odd
> is disabled). We would need to use pmc_is_chained() instead.
> 
> With perf_events, the check of whether the odd register is enabled is
> properly done (create_perf_event). Then I understand whenever there is a
> change in enable state or type we delete the previous perf event and
> re-create a new one. Enable state check just is missing for SW_INCR.

Can you please respin this? I'd like to have it queued quickly, if at
all possible.

> 
> Some other questions:
> - do we need a perf event to be created even if the counter is not
> enabled? For instance on counter resets, create_perf_events get called.

It shouldn't be necessary.

> - also actions are made for counters which are not implemented. loop
> until ARMV8_PMU_MAX_COUNTERS. Do you think it is valuable to have a
> bitmask of supported counters stored before pmu readiness?
> I can propose such changes if you think they are valuable.

That would certainly be a performance optimization.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
  2020-01-19 17:58             ` Marc Zyngier
@ 2020-01-20 13:30               ` Auger Eric
  -1 siblings, 0 replies; 38+ messages in thread
From: Auger Eric @ 2020-01-20 13:30 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: eric.auger.pro, linux-kernel, kvmarm, james.morse, andrew.murray,
	suzuki.poulose, drjones

Hi Marc,
On 1/19/20 6:58 PM, Marc Zyngier wrote:
> On Thu, 5 Dec 2019 20:01:42 +0100
> Auger Eric <eric.auger@redhat.com> wrote:
> 
> Hi Eric,
> 
>> Hi Marc,
>>
>> On 12/5/19 3:52 PM, Marc Zyngier wrote:
>>> On 2019-12-05 14:06, Auger Eric wrote:  
>>>> Hi Marc,
>>>>
>>>> On 12/5/19 10:43 AM, Marc Zyngier wrote:  
>>>>> Hi Eric,
>>>>>
>>>>> On 2019-12-04 20:44, Eric Auger wrote:  
>>>>>> At the moment a SW_INCR counter always overflows on 32-bit
>>>>>> boundary, independently on whether the n+1th counter is
>>>>>> programmed as CHAIN.
>>>>>>
>>>>>> Check whether the SW_INCR counter is a 64b counter and if so,
>>>>>> implement the 64b logic.
>>>>>>
>>>>>> Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters")
>>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>>> ---
>>>>>>  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
>>>>>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>>>>>> index c3f8b059881e..7ab477db2f75 100644
>>>>>> --- a/virt/kvm/arm/pmu.c
>>>>>> +++ b/virt/kvm/arm/pmu.c
>>>>>> @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu
>>>>>> *vcpu, u64 val)
>>>>>>
>>>>>>      enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>>>>>>      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
>>>>>> +        bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
>>>>>> +  
>>>>>
>>>>> I'd rather you use kvm_pmu_pmc_is_chained() rather than open-coding
>>>>> this. But see below:
>>>>>  
>>>>>>          if (!(val & BIT(i)))
>>>>>>              continue;
>>>>>>          type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
>>>>>> @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct kvm_vcpu
>>>>>> *vcpu, u64 val)
>>>>>>              reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
>>>>>>              reg = lower_32_bits(reg);
>>>>>>              __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
>>>>>> -            if (!reg)
>>>>>> +            if (reg) /* no overflow */
>>>>>> +                continue;
>>>>>> +            if (chained) {
>>>>>> +                reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) + 1;
>>>>>> +                reg = lower_32_bits(reg);
>>>>>> +                __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
>>>>>> +                if (reg)
>>>>>> +                    continue;
>>>>>> +                /* mark an overflow on high counter */
>>>>>> +                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
>>>>>> +            } else {
>>>>>> +                /* mark an overflow */
>>>>>>                  __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
>>>>>> +            }
>>>>>>          }
>>>>>>      }
>>>>>>  }  
>>>>>
>>>>> I think the whole function is a bit of a mess, and could be better
>>>>> structured to treat 64bit counters as a first class citizen.
>>>>>
>>>>> I'm suggesting something along those lines, which tries to
>>>>> streamline things a bit and keep the flow uniform between the
>>>>> two word sizes. IMHO, it helps reasonning about it and gives
>>>>> scope to the ARMv8.5 full 64bit counters... It is of course
>>>>> completely untested.  
>>>>
>>>> Looks OK to me as well. One remark though, don't we need to test if the
>>>> n+1th reg is enabled before incrementing it?  
>>>
>>> Hmmm. I'm not sure. I think we should make sure that we don't flag
>>> a counter as being chained if the odd counter is disabled, rather
>>> than checking it here. As long as the odd counter is not chained
>>> *and* enabled, we shouldn't touch it.>
>>> Again, untested:
>>>
>>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>>> index cf371f643ade..47366817cd2a 100644
>>> --- a/virt/kvm/arm/pmu.c
>>> +++ b/virt/kvm/arm/pmu.c
>>> @@ -15,6 +15,7 @@
>>>  #include <kvm/arm_vgic.h>
>>>
>>>  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64
>>> select_idx);
>>> +static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64
>>> select_idx);
>>>
>>>  #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
>>>
>>> @@ -298,6 +299,7 @@ void kvm_pmu_enable_counter_mask(struct kvm_vcpu
>>> *vcpu, u64 val)
>>>           * For high counters of chained events we must recreate the
>>>           * perf event with the long (64bit) attribute set.
>>>           */
>>> +        kvm_pmu_update_pmc_chained(vcpu, i);
>>>          if (kvm_pmu_pmc_is_chained(pmc) &&
>>>              kvm_pmu_idx_is_high_counter(i)) {
>>>              kvm_pmu_create_perf_event(vcpu, i);
>>> @@ -645,7 +647,8 @@ static void kvm_pmu_update_pmc_chained(struct
>>> kvm_vcpu *vcpu, u64 select_idx)
>>>      struct kvm_pmu *pmu = &vcpu->arch.pmu;
>>>      struct kvm_pmc *pmc = &pmu->pmc[select_idx];
>>>
>>> -    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
>>> +    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
>>> +        kvm_pmu_counter_is_enabled(vcpu, pmc->idx)) {  
>>
>> In create_perf_event(), has_chain_evtype() is used and a 64b sample
>> period would be chosen even if the counters are disjoined (since the odd
>> is disabled). We would need to use pmc_is_chained() instead.
>>
>> With perf_events, the check of whether the odd register is enabled is
>> properly done (create_perf_event). Then I understand whenever there is a
>> change in enable state or type we delete the previous perf event and
>> re-create a new one. Enable state check just is missing for SW_INCR.
> 
> Can you please respin this? I'd like to have it queued quickly, if at
> all possible.

Yes I am going to respin quickly.

Thanks

Eric
> 
>>
>> Some other questions:
>> - do we need a perf event to be created even if the counter is not
>> enabled? For instance on counter resets, create_perf_events get called.
> 
> It shouldn't be necessary.
> 
>> - also actions are made for counters which are not implemented. loop
>> until ARMV8_PMU_MAX_COUNTERS. Do you think it is valuable to have a
>> bitmask of supported counters stored before pmu readiness?
>> I can propose such changes if you think they are valuable.
> 
> That would certainly be a performance optimization.
> 
> Thanks,
> 
> 	M.
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters
@ 2020-01-20 13:30               ` Auger Eric
  0 siblings, 0 replies; 38+ messages in thread
From: Auger Eric @ 2020-01-20 13:30 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-kernel, kvmarm, eric.auger.pro

Hi Marc,
On 1/19/20 6:58 PM, Marc Zyngier wrote:
> On Thu, 5 Dec 2019 20:01:42 +0100
> Auger Eric <eric.auger@redhat.com> wrote:
> 
> Hi Eric,
> 
>> Hi Marc,
>>
>> On 12/5/19 3:52 PM, Marc Zyngier wrote:
>>> On 2019-12-05 14:06, Auger Eric wrote:  
>>>> Hi Marc,
>>>>
>>>> On 12/5/19 10:43 AM, Marc Zyngier wrote:  
>>>>> Hi Eric,
>>>>>
>>>>> On 2019-12-04 20:44, Eric Auger wrote:  
>>>>>> At the moment a SW_INCR counter always overflows on 32-bit
>>>>>> boundary, independently on whether the n+1th counter is
>>>>>> programmed as CHAIN.
>>>>>>
>>>>>> Check whether the SW_INCR counter is a 64b counter and if so,
>>>>>> implement the 64b logic.
>>>>>>
>>>>>> Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters")
>>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>>> ---
>>>>>>  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
>>>>>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>>>>>> index c3f8b059881e..7ab477db2f75 100644
>>>>>> --- a/virt/kvm/arm/pmu.c
>>>>>> +++ b/virt/kvm/arm/pmu.c
>>>>>> @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu
>>>>>> *vcpu, u64 val)
>>>>>>
>>>>>>      enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>>>>>>      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
>>>>>> +        bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
>>>>>> +  
>>>>>
>>>>> I'd rather you use kvm_pmu_pmc_is_chained() rather than open-coding
>>>>> this. But see below:
>>>>>  
>>>>>>          if (!(val & BIT(i)))
>>>>>>              continue;
>>>>>>          type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
>>>>>> @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct kvm_vcpu
>>>>>> *vcpu, u64 val)
>>>>>>              reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
>>>>>>              reg = lower_32_bits(reg);
>>>>>>              __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
>>>>>> -            if (!reg)
>>>>>> +            if (reg) /* no overflow */
>>>>>> +                continue;
>>>>>> +            if (chained) {
>>>>>> +                reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) + 1;
>>>>>> +                reg = lower_32_bits(reg);
>>>>>> +                __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
>>>>>> +                if (reg)
>>>>>> +                    continue;
>>>>>> +                /* mark an overflow on high counter */
>>>>>> +                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
>>>>>> +            } else {
>>>>>> +                /* mark an overflow */
>>>>>>                  __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
>>>>>> +            }
>>>>>>          }
>>>>>>      }
>>>>>>  }  
>>>>>
>>>>> I think the whole function is a bit of a mess, and could be better
>>>>> structured to treat 64bit counters as a first class citizen.
>>>>>
>>>>> I'm suggesting something along those lines, which tries to
>>>>> streamline things a bit and keep the flow uniform between the
>>>>> two word sizes. IMHO, it helps reasonning about it and gives
>>>>> scope to the ARMv8.5 full 64bit counters... It is of course
>>>>> completely untested.  
>>>>
>>>> Looks OK to me as well. One remark though, don't we need to test if the
>>>> n+1th reg is enabled before incrementing it?  
>>>
>>> Hmmm. I'm not sure. I think we should make sure that we don't flag
>>> a counter as being chained if the odd counter is disabled, rather
>>> than checking it here. As long as the odd counter is not chained
>>> *and* enabled, we shouldn't touch it.>
>>> Again, untested:
>>>
>>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>>> index cf371f643ade..47366817cd2a 100644
>>> --- a/virt/kvm/arm/pmu.c
>>> +++ b/virt/kvm/arm/pmu.c
>>> @@ -15,6 +15,7 @@
>>>  #include <kvm/arm_vgic.h>
>>>
>>>  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64
>>> select_idx);
>>> +static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64
>>> select_idx);
>>>
>>>  #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
>>>
>>> @@ -298,6 +299,7 @@ void kvm_pmu_enable_counter_mask(struct kvm_vcpu
>>> *vcpu, u64 val)
>>>           * For high counters of chained events we must recreate the
>>>           * perf event with the long (64bit) attribute set.
>>>           */
>>> +        kvm_pmu_update_pmc_chained(vcpu, i);
>>>          if (kvm_pmu_pmc_is_chained(pmc) &&
>>>              kvm_pmu_idx_is_high_counter(i)) {
>>>              kvm_pmu_create_perf_event(vcpu, i);
>>> @@ -645,7 +647,8 @@ static void kvm_pmu_update_pmc_chained(struct
>>> kvm_vcpu *vcpu, u64 select_idx)
>>>      struct kvm_pmu *pmu = &vcpu->arch.pmu;
>>>      struct kvm_pmc *pmc = &pmu->pmc[select_idx];
>>>
>>> -    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
>>> +    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
>>> +        kvm_pmu_counter_is_enabled(vcpu, pmc->idx)) {  
>>
>> In create_perf_event(), has_chain_evtype() is used and a 64b sample
>> period would be chosen even if the counters are disjoined (since the odd
>> is disabled). We would need to use pmc_is_chained() instead.
>>
>> With perf_events, the check of whether the odd register is enabled is
>> properly done (create_perf_event). Then I understand whenever there is a
>> change in enable state or type we delete the previous perf event and
>> re-create a new one. Enable state check just is missing for SW_INCR.
> 
> Can you please respin this? I'd like to have it queued quickly, if at
> all possible.

Yes I am going to respin quickly.

Thanks

Eric
> 
>>
>> Some other questions:
>> - do we need a perf event to be created even if the counter is not
>> enabled? For instance on counter resets, create_perf_events get called.
> 
> It shouldn't be necessary.
> 
>> - also actions are made for counters which are not implemented. loop
>> until ARMV8_PMU_MAX_COUNTERS. Do you think it is valuable to have a
>> bitmask of supported counters stored before pmu readiness?
>> I can propose such changes if you think they are valuable.
> 
> That would certainly be a performance optimization.
> 
> Thanks,
> 
> 	M.
> 

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2020-01-20 13:30 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-04 20:44 [RFC 0/3] KVM/ARM: Misc PMU fixes Eric Auger
2019-12-04 20:44 ` Eric Auger
2019-12-04 20:44 ` [RFC 1/3] KVM: arm64: pmu: Don't increment SW_INCR if PMCR.E is unset Eric Auger
2019-12-04 20:44   ` Eric Auger
2019-12-05  9:54   ` Marc Zyngier
2019-12-05  9:54     ` Marc Zyngier
2019-12-06 10:35   ` Andrew Murray
2019-12-06 10:35     ` Andrew Murray
2019-12-04 20:44 ` [RFC 2/3] KVM: arm64: pmu: Fix chained SW_INCR counters Eric Auger
2019-12-04 20:44   ` Eric Auger
2019-12-05  9:43   ` Marc Zyngier
2019-12-05  9:43     ` Marc Zyngier
2019-12-05 14:06     ` Auger Eric
2019-12-05 14:06       ` Auger Eric
2019-12-05 14:52       ` Marc Zyngier
2019-12-05 14:52         ` Marc Zyngier
2019-12-05 19:01         ` Auger Eric
2019-12-05 19:01           ` Auger Eric
2019-12-06  9:56           ` Auger Eric
2019-12-06  9:56             ` Auger Eric
2019-12-06 15:48           ` Andrew Murray
2019-12-06 15:48             ` Andrew Murray
2020-01-19 17:58           ` Marc Zyngier
2020-01-19 17:58             ` Marc Zyngier
2020-01-20 13:30             ` Auger Eric
2020-01-20 13:30               ` Auger Eric
2019-12-06 15:21         ` Andrew Murray
2019-12-06 15:21           ` Andrew Murray
2019-12-06 15:35           ` Marc Zyngier
2019-12-06 15:35             ` Marc Zyngier
2019-12-06 16:02             ` Andrew Murray
2019-12-06 16:02               ` Andrew Murray
2019-12-04 20:44 ` [RFC 3/3] KVM: arm64: pmu: Enforce PMEVTYPER evtCount size Eric Auger
2019-12-04 20:44   ` Eric Auger
2019-12-05  9:02   ` Will Deacon
2019-12-05  9:02     ` Will Deacon
2019-12-05  9:37     ` Auger Eric
2019-12-05  9:37       ` Auger Eric

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.