* [PATCH v1 0/2] KVM: arm64: PMU: Correct the handling of PMUSERENR_EL0
@ 2023-03-29  0:21 ` Reiji Watanabe
  0 siblings, 0 replies; 20+ messages in thread
From: Reiji Watanabe @ 2023-03-29  0:21 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, kvmarm
  Cc: kvm, linux-arm-kernel, James Morse, Alexandru Elisei, Zenghui Yu,
	Suzuki K Poulose, Paolo Bonzini, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, Will Deacon, Reiji Watanabe

This series fixes bugs in KVM's handling of PMUSERENR_EL0.

With EL0 PMU access support [1], the perf subsystem sets the CR
and ER bits of PMUSERENR_EL0 as needed to allow EL0 direct access
to the PMU counters.  However, KVM assumes that the register value
is always zero for the host EL0, which leads to the following two
problems in its handling of the register.

[A] The host EL0 might lose its direct access to the PMU counters,
    as KVM always clears PMUSERENR_EL0 before returning to userspace.

[B] With VHE, the guest EL0 access to PMU counters might be trapped
    to EL1 instead of to EL2 (even when PMUSERENR_EL0 for the guest
    indicates that the guest EL0 has access to the counters).
    This is because, with VHE, KVM sets the ER, CR, SW and EN bits of
    PMUSERENR_EL0 to 1 on vcpu_load() to ensure that PMU accesses
    from the guest EL0 are trapped to EL2, but those bits might be
    cleared by the perf subsystem after vcpu_load() (when PMU
    counters are programmed for the vPMU emulation).
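
To make [B] concrete, the problematic sequence is roughly the
following (a sketch pieced together from the code this series
touches; not an exact call chain):

  vcpu_load()                                /* VHE */
    __activate_traps_common()
      write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);

  /* perf programs PMU counters for the vPMU emulation */
  armv8pmu_start()
    armv8pmu_disable_user_access()           /* no user-access events */
      write_sysreg(0, pmuserenr_el0);        /* clobbers what KVM set */

  /* guest entry: PMUSERENR_EL0.EN is now 0, so a guest EL0 PMU
     access traps to the guest EL1 instead of to EL2 */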

Patch 1 fixes [A], and Patch 2 fixes [B].
The series is based on v6.3-rc4.

[1] https://github.com/torvalds/linux/commit/83a7a4d643d33a8b74a42229346b7ed7139fcef9

Reiji Watanabe (2):
  KVM: arm64: PMU: Restore the host's PMUSERENR_EL0
  KVM: arm64: PMU: Ensure to trap PMU access from EL0 to EL2

 arch/arm64/include/asm/kvm_host.h       |  3 +++
 arch/arm64/kvm/hyp/include/hyp/switch.h | 28 +++++++++++++------------
 2 files changed, 18 insertions(+), 13 deletions(-)


base-commit: 197b6b60ae7bc51dd0814953c562833143b292aa
-- 
2.40.0.348.gf938b09366-goog


* [PATCH v1 1/2] KVM: arm64: PMU: Restore the host's PMUSERENR_EL0
  2023-03-29  0:21 ` Reiji Watanabe
@ 2023-03-29  0:21   ` Reiji Watanabe
  -1 siblings, 0 replies; 20+ messages in thread
From: Reiji Watanabe @ 2023-03-29  0:21 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, kvmarm
  Cc: kvm, linux-arm-kernel, James Morse, Alexandru Elisei, Zenghui Yu,
	Suzuki K Poulose, Paolo Bonzini, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, Will Deacon, Reiji Watanabe

Restore the host's PMUSERENR_EL0 value instead of clearing it
before returning to userspace, as the host's EL0 might have
direct access to PMU registers (i.e. some bits of PMUSERENR_EL0
might not be zero).

Fixes: 83a7a4d643d3 ("arm64: perf: Enable PMU counter userspace access for perf event")
Signed-off-by: Reiji Watanabe <reijiw@google.com>
---
 arch/arm64/include/asm/kvm_host.h       | 3 +++
 arch/arm64/kvm/hyp/include/hyp/switch.h | 3 ++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index bcd774d74f34..82220ecec10e 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -544,6 +544,9 @@ struct kvm_vcpu_arch {
 
 	/* Per-vcpu CCSIDR override or NULL */
 	u32 *ccsidr;
+
+	/* the value of host's pmuserenr_el0 before guest entry */
+	u64	host_pmuserenr_el0;
 };
 
 /*
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 07d37ff88a3f..44b84fbdde0d 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -82,6 +82,7 @@ static inline void __activate_traps_common(struct kvm_vcpu *vcpu)
 	 */
 	if (kvm_arm_support_pmu_v3()) {
 		write_sysreg(0, pmselr_el0);
+		vcpu->arch.host_pmuserenr_el0 = read_sysreg(pmuserenr_el0);
 		write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);
 	}
 
@@ -106,7 +107,7 @@ static inline void __deactivate_traps_common(struct kvm_vcpu *vcpu)
 
 	write_sysreg(0, hstr_el2);
 	if (kvm_arm_support_pmu_v3())
-		write_sysreg(0, pmuserenr_el0);
+		write_sysreg(vcpu->arch.host_pmuserenr_el0, pmuserenr_el0);
 
 	if (cpus_have_final_cap(ARM64_SME)) {
 		sysreg_clear_set_s(SYS_HFGRTR_EL2, 0,
-- 
2.40.0.348.gf938b09366-goog


* [PATCH v1 2/2] KVM: arm64: PMU: Ensure to trap PMU access from EL0 to EL2
  2023-03-29  0:21 ` Reiji Watanabe
@ 2023-03-29  0:21   ` Reiji Watanabe
  -1 siblings, 0 replies; 20+ messages in thread
From: Reiji Watanabe @ 2023-03-29  0:21 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, kvmarm
  Cc: kvm, linux-arm-kernel, James Morse, Alexandru Elisei, Zenghui Yu,
	Suzuki K Poulose, Paolo Bonzini, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, Will Deacon, Reiji Watanabe

Currently, with VHE, KVM sets the ER, CR, SW and EN bits of
PMUSERENR_EL0 to 1 on vcpu_load().  So, if those bits are cleared
after vcpu_load() (as the perf subsystem does when PMU counters are
programmed for the guest), PMU access from the guest EL0 might be
trapped to the guest EL1 directly, regardless of the vCPU's current
PMUSERENR_EL0 value.

With VHE, fix this by setting those bits of the register on every
guest entry (as with nVHE).  Also, opportunistically make a similar
change for PMSELR_EL0, which is currently cleared by vcpu_load(), to
ensure it is always set to zero on guest entry (a PMXEVCNTR_EL0
access might UNDEF at EL1 instead of being trapped to EL2, depending
on the value of PMSELR_EL0).  I think this is more robust, although
I couldn't find any kernel code that writes to PMSELR_EL0.

Fixes: 83a7a4d643d3 ("arm64: perf: Enable PMU counter userspace access for perf event")
Signed-off-by: Reiji Watanabe <reijiw@google.com>
---
 arch/arm64/kvm/hyp/include/hyp/switch.h | 29 +++++++++++++------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 44b84fbdde0d..7d39882d8a73 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -74,18 +74,6 @@ static inline void __activate_traps_common(struct kvm_vcpu *vcpu)
 	/* Trap on AArch32 cp15 c15 (impdef sysregs) accesses (EL1 or EL0) */
 	write_sysreg(1 << 15, hstr_el2);
 
-	/*
-	 * Make sure we trap PMU access from EL0 to EL2. Also sanitize
-	 * PMSELR_EL0 to make sure it never contains the cycle
-	 * counter, which could make a PMXEVCNTR_EL0 access UNDEF at
-	 * EL1 instead of being trapped to EL2.
-	 */
-	if (kvm_arm_support_pmu_v3()) {
-		write_sysreg(0, pmselr_el0);
-		vcpu->arch.host_pmuserenr_el0 = read_sysreg(pmuserenr_el0);
-		write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);
-	}
-
 	vcpu->arch.mdcr_el2_host = read_sysreg(mdcr_el2);
 	write_sysreg(vcpu->arch.mdcr_el2, mdcr_el2);
 
@@ -106,8 +94,6 @@ static inline void __deactivate_traps_common(struct kvm_vcpu *vcpu)
 	write_sysreg(vcpu->arch.mdcr_el2_host, mdcr_el2);
 
 	write_sysreg(0, hstr_el2);
-	if (kvm_arm_support_pmu_v3())
-		write_sysreg(vcpu->arch.host_pmuserenr_el0, pmuserenr_el0);
 
 	if (cpus_have_final_cap(ARM64_SME)) {
 		sysreg_clear_set_s(SYS_HFGRTR_EL2, 0,
@@ -130,6 +116,18 @@ static inline void ___activate_traps(struct kvm_vcpu *vcpu)
 
 	if (cpus_have_final_cap(ARM64_HAS_RAS_EXTN) && (hcr & HCR_VSE))
 		write_sysreg_s(vcpu->arch.vsesr_el2, SYS_VSESR_EL2);
+
+	/*
+	 * Make sure we trap PMU access from EL0 to EL2. Also sanitize
+	 * PMSELR_EL0 to make sure it never contains the cycle
+	 * counter, which could make a PMXEVCNTR_EL0 access UNDEF at
+	 * EL1 instead of being trapped to EL2.
+	 */
+	if (kvm_arm_support_pmu_v3()) {
+		write_sysreg(0, pmselr_el0);
+		vcpu->arch.host_pmuserenr_el0 = read_sysreg(pmuserenr_el0);
+		write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);
+	}
 }
 
 static inline void ___deactivate_traps(struct kvm_vcpu *vcpu)
@@ -144,6 +142,9 @@ static inline void ___deactivate_traps(struct kvm_vcpu *vcpu)
 		vcpu->arch.hcr_el2 &= ~HCR_VSE;
 		vcpu->arch.hcr_el2 |= read_sysreg(hcr_el2) & HCR_VSE;
 	}
+
+	if (kvm_arm_support_pmu_v3())
+		write_sysreg(vcpu->arch.host_pmuserenr_el0, pmuserenr_el0);
 }
 
 static inline bool __populate_fault_info(struct kvm_vcpu *vcpu)
-- 
2.40.0.348.gf938b09366-goog


* Re: [PATCH v1 1/2] KVM: arm64: PMU: Restore the host's PMUSERENR_EL0
  2023-03-29  0:21   ` Reiji Watanabe
@ 2023-03-29  7:31     ` Marc Zyngier
  -1 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2023-03-29  7:31 UTC (permalink / raw)
  To: Reiji Watanabe
  Cc: Oliver Upton, kvmarm, kvm, linux-arm-kernel, James Morse,
	Alexandru Elisei, Zenghui Yu, Suzuki K Poulose, Paolo Bonzini,
	Ricardo Koller, Jing Zhang, Raghavendra Rao Anata, Will Deacon

On Wed, 29 Mar 2023 01:21:35 +0100,
Reiji Watanabe <reijiw@google.com> wrote:
> 
> Restore the host's PMUSERENR_EL0 value instead of clearing it
> before returning to userspace, as the host's EL0 might have
> direct access to PMU registers (i.e. some bits of PMUSERENR_EL0
> might not be zero).
> 
> Fixes: 83a7a4d643d3 ("arm64: perf: Enable PMU counter userspace access for perf event")
> Signed-off-by: Reiji Watanabe <reijiw@google.com>
> ---
>  arch/arm64/include/asm/kvm_host.h       | 3 +++
>  arch/arm64/kvm/hyp/include/hyp/switch.h | 3 ++-
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index bcd774d74f34..82220ecec10e 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -544,6 +544,9 @@ struct kvm_vcpu_arch {
>  
>  	/* Per-vcpu CCSIDR override or NULL */
>  	u32 *ccsidr;
> +
> +	/* the value of host's pmuserenr_el0 before guest entry */
> +	u64	host_pmuserenr_el0;

I don't think we need this in each and every vcpu. Why can't this be
placed in struct kvm_host_data and accessed via the per-cpu pointer?
Maybe even use the PMUSERENR_EL0 field in the sysreg array?

There are probably a number of things that we could move there, but
let's start by not adding more unnecessary stuff to the vcpu
structure.
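
For reference, a minimal sketch of what that could look like
(untested; it assumes the existing per-CPU kvm_host_data variable
and the ctxt_sys_reg() accessor, stashing the host's value in the
host context's sysreg array):

	if (kvm_arm_support_pmu_v3()) {
		struct kvm_cpu_context *hctxt =
			&this_cpu_ptr(&kvm_host_data)->host_ctxt;

		write_sysreg(0, pmselr_el0);
		/* save the host's value in the per-CPU host context */
		ctxt_sys_reg(hctxt, PMUSERENR_EL0) = read_sysreg(pmuserenr_el0);
		write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);
	}

with __deactivate_traps_common() writing the same field back via
write_sysreg(ctxt_sys_reg(hctxt, PMUSERENR_EL0), pmuserenr_el0).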

>  };
>  
>  /*
> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> index 07d37ff88a3f..44b84fbdde0d 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> @@ -82,6 +82,7 @@ static inline void __activate_traps_common(struct kvm_vcpu *vcpu)
>  	 */
>  	if (kvm_arm_support_pmu_v3()) {
>  		write_sysreg(0, pmselr_el0);
> +		vcpu->arch.host_pmuserenr_el0 = read_sysreg(pmuserenr_el0);
>  		write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);
>  	}
>  
> @@ -106,7 +107,7 @@ static inline void __deactivate_traps_common(struct kvm_vcpu *vcpu)
>  
>  	write_sysreg(0, hstr_el2);
>  	if (kvm_arm_support_pmu_v3())
> -		write_sysreg(0, pmuserenr_el0);
> +		write_sysreg(vcpu->arch.host_pmuserenr_el0, pmuserenr_el0);
>  
>  	if (cpus_have_final_cap(ARM64_SME)) {
>  		sysreg_clear_set_s(SYS_HFGRTR_EL2, 0,

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

* Re: [PATCH v1 2/2] KVM: arm64: PMU: Ensure to trap PMU access from EL0 to EL2
  2023-03-29  0:21   ` Reiji Watanabe
@ 2023-03-29 12:03     ` Marc Zyngier
  -1 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2023-03-29 12:03 UTC (permalink / raw)
  To: Reiji Watanabe
  Cc: Oliver Upton, kvmarm, kvm, linux-arm-kernel, James Morse,
	Alexandru Elisei, Zenghui Yu, Suzuki K Poulose, Paolo Bonzini,
	Ricardo Koller, Jing Zhang, Raghavendra Rao Anata, Will Deacon,
	Rob Herring

On Wed, 29 Mar 2023 01:21:36 +0100,
Reiji Watanabe <reijiw@google.com> wrote:
> 
> Currently, with VHE, KVM sets the ER, CR, SW and EN bits of
> PMUSERENR_EL0 to 1 on vcpu_load().  So, if those bits are cleared
> after vcpu_load() (as the perf subsystem does when PMU counters are
> programmed for the guest), PMU access from the guest EL0 might be
> trapped to the guest EL1 directly, regardless of the vCPU's current
> PMUSERENR_EL0 value.

+ RobH.

Is that what is done when the event is created and armv8pmu_start()
called? This is... crap. The EL0 access thing breaks everything, and
nobody tested it with KVM, obviously.

I would be tempted to start mitigating it with the following:

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index dde06c0f97f3..8063525bf3dd 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -806,17 +806,19 @@ static void armv8pmu_disable_event(struct perf_event *event)
 
 static void armv8pmu_start(struct arm_pmu *cpu_pmu)
 {
-	struct perf_event_context *ctx;
-	int nr_user = 0;
+	if (sysctl_perf_user_access) {
+		struct perf_event_context *ctx;
+		int nr_user = 0;
 
-	ctx = perf_cpu_task_ctx();
-	if (ctx)
-		nr_user = ctx->nr_user;
+		ctx = perf_cpu_task_ctx();
+		if (ctx)
+			nr_user = ctx->nr_user;
 
-	if (sysctl_perf_user_access && nr_user)
-		armv8pmu_enable_user_access(cpu_pmu);
-	else
-		armv8pmu_disable_user_access();
+		if (nr_user)
+			armv8pmu_enable_user_access(cpu_pmu);
+		else
+			armv8pmu_disable_user_access();
+	}
 
 	/* Enable all counters */
 	armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);

but that's obviously not enough as we want it to work with EL0 access
enabled on the host as well.

What we miss is something that tells the PMU code "we're in a context
where host userspace isn't present", and this would be completely
skipped, relying on KVM to restore the appropriate state on
vcpu_put(). But then the IPI stuff that controls EL0 can always come
in and wreck things. Gahhh...

I'm a bit reluctant to use the "save/restore all the time" hammer,
because it only hides that the EL0 counter infrastructure is a bit
broken.

> With VHE, fix this by setting those bits of the register on every
> guest entry (as with nVHE).  Also, opportunistically make a similar
> change for PMSELR_EL0, which is currently cleared by vcpu_load(), to
> ensure it is always set to zero on guest entry (a PMXEVCNTR_EL0
> access might UNDEF at EL1 instead of being trapped to EL2, depending
> on the value of PMSELR_EL0).  I think this is more robust, although
> I couldn't find any kernel code that writes to PMSELR_EL0.

This was changed a while ago to avoid using the selection register,
see 0fdf1bb75953 ("arm64: perf: Avoid PMXEV* indirection"), and the
rationale behind the reset of PMSELR_EL0 in 21cbe3cc8a48 ("arm64: KVM:
pmu: Reset PMSELR_EL0.SEL to a sane value before entering the guest").

We *could* simply drop this zeroing of PMSELR_EL0 now that there is
nothing else host-side that writes to it. But we need to agree on how
to fix the above first.
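
For background, the indirection that 0fdf1bb75953 removed looked
roughly like this (paraphrased, not the literal code):

	/* old: select the counter, then access it indirectly;
	 * the access depends on whatever is in PMSELR_EL0 */
	write_sysreg(idx, pmselr_el0);
	val = read_sysreg(pmxevcntr_el0);

	/* new: access the per-counter register directly (one case
	 * per index), so PMSELR_EL0 no longer matters to the host */
	val = read_sysreg(pmevcntr3_el0);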

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

* Re: [PATCH v1 1/2] KVM: arm64: PMU: Restore the host's PMUSERENR_EL0
  2023-03-29  7:31     ` Marc Zyngier
@ 2023-03-29 16:28       ` Reiji Watanabe
  -1 siblings, 0 replies; 20+ messages in thread
From: Reiji Watanabe @ 2023-03-29 16:28 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Oliver Upton, kvmarm, kvm, linux-arm-kernel, James Morse,
	Alexandru Elisei, Zenghui Yu, Suzuki K Poulose, Paolo Bonzini,
	Ricardo Koller, Jing Zhang, Raghavendra Rao Anata, Will Deacon

Hi Marc,

On Wed, Mar 29, 2023 at 08:31:24AM +0100, Marc Zyngier wrote:
> On Wed, 29 Mar 2023 01:21:35 +0100,
> Reiji Watanabe <reijiw@google.com> wrote:
> >
> > Restore the host's PMUSERENR_EL0 value instead of clearing it
> > before returning to userspace, as the host's EL0 might have
> > direct access to PMU registers (i.e. some bits of PMUSERENR_EL0
> > might not be zero).
> >
> > Fixes: 83a7a4d643d3 ("arm64: perf: Enable PMU counter userspace access for perf event")
> > Signed-off-by: Reiji Watanabe <reijiw@google.com>
> > ---
> >  arch/arm64/include/asm/kvm_host.h       | 3 +++
> >  arch/arm64/kvm/hyp/include/hyp/switch.h | 3 ++-
> >  2 files changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index bcd774d74f34..82220ecec10e 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -544,6 +544,9 @@ struct kvm_vcpu_arch {
> >
> >     /* Per-vcpu CCSIDR override or NULL */
> >     u32 *ccsidr;
> > +
> > +   /* the value of host's pmuserenr_el0 before guest entry */
> > +   u64     host_pmuserenr_el0;
>
> I don't think we need this in each and every vcpu. Why can't this be
> placed in struct kvm_host_data and accessed via the per-cpu pointer?
> Maybe even use the PMUSERENR_EL0 field in the sysreg array?

Thank you for the nice suggestion.
Indeed, that would be better.  I will fix it in v2.

>
> There are probably a number of things that we could move there, but
> let's start by not adding more unnecessary stuff to the vcpu
> structure.

Yeah, I agree.

Thank you,
Reiji



>
> >  };
> >
> >  /*
> > diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> > index 07d37ff88a3f..44b84fbdde0d 100644
> > --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> > +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> > @@ -82,6 +82,7 @@ static inline void __activate_traps_common(struct kvm_vcpu *vcpu)
> >      */
> >     if (kvm_arm_support_pmu_v3()) {
> >             write_sysreg(0, pmselr_el0);
> > +           vcpu->arch.host_pmuserenr_el0 = read_sysreg(pmuserenr_el0);
> >             write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);
> >     }
> >
> > @@ -106,7 +107,7 @@ static inline void __deactivate_traps_common(struct kvm_vcpu *vcpu)
> >
> >     write_sysreg(0, hstr_el2);
> >     if (kvm_arm_support_pmu_v3())
> > -           write_sysreg(0, pmuserenr_el0);
> > +           write_sysreg(vcpu->arch.host_pmuserenr_el0, pmuserenr_el0);
> >
> >     if (cpus_have_final_cap(ARM64_SME)) {
> >             sysreg_clear_set_s(SYS_HFGRTR_EL2, 0,
>
> Thanks,
>
>       M.
>
> --
> Without deviation from the norm, progress is not possible.

* Re: [PATCH v1 2/2] KVM: arm64: PMU: Ensure to trap PMU access from EL0 to EL2
  2023-03-29 12:03     ` Marc Zyngier
@ 2023-03-30  3:55       ` Reiji Watanabe
  -1 siblings, 0 replies; 20+ messages in thread
From: Reiji Watanabe @ 2023-03-30  3:55 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Oliver Upton, kvmarm, kvm, linux-arm-kernel, James Morse,
	Alexandru Elisei, Zenghui Yu, Suzuki K Poulose, Paolo Bonzini,
	Ricardo Koller, Jing Zhang, Raghavendra Rao Anata, Will Deacon,
	Rob Herring

Hi Marc,

On Wed, Mar 29, 2023 at 01:03:18PM +0100, Marc Zyngier wrote:
> On Wed, 29 Mar 2023 01:21:36 +0100,
> Reiji Watanabe <reijiw@google.com> wrote:
> > 
> > Currently, with VHE, KVM sets the ER, CR, SW and EN bits of
> > PMUSERENR_EL0 to 1 on vcpu_load().  So, if those bits are cleared
> > after vcpu_load() (as the perf subsystem does when PMU counters are
> > programmed for the guest), PMU access from the guest EL0 might be
> > trapped to the guest EL1 directly, regardless of the vCPU's current
> > PMUSERENR_EL0 value.
> 
> + RobH.
> 
> Is that what is done when the event is created and armv8pmu_start()
> called? 

Yes, that is it.

> This is... crap. The EL0 access thing breaks everything, and
> nobody tested it with KVM, obviously.

It was a bit shocking, as we detected those EL0-related
issues with the very first EL0 PMU test we ran...

> 
> I would be tempted to start mitigating it with the following:
> 
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index dde06c0f97f3..8063525bf3dd 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -806,17 +806,19 @@ static void armv8pmu_disable_event(struct perf_event *event)
>  
>  static void armv8pmu_start(struct arm_pmu *cpu_pmu)
>  {
> -	struct perf_event_context *ctx;
> -	int nr_user = 0;
> +	if (sysctl_perf_user_access) {
> +		struct perf_event_context *ctx;
> +		int nr_user = 0;
>  
> -	ctx = perf_cpu_task_ctx();
> -	if (ctx)
> -		nr_user = ctx->nr_user;
> +		ctx = perf_cpu_task_ctx();
> +		if (ctx)
> +			nr_user = ctx->nr_user;
>  
> -	if (sysctl_perf_user_access && nr_user)
> -		armv8pmu_enable_user_access(cpu_pmu);
> -	else
> -		armv8pmu_disable_user_access();
> +		if (nr_user)
> +			armv8pmu_enable_user_access(cpu_pmu);
> +		else
> +			armv8pmu_disable_user_access();
> +	}
>  
>  	/* Enable all counters */
>  	armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
> 
> but that's obviously not enough as we want it to work with EL0 access
> enabled on the host as well.

Right. Also, with the change above, since PMUSERENR_EL0 isn't
explicitly cleared, a perf client (EL0) might have access to the
counters (whereas with the current code, a non-perf client might
have access to the counters instead).


> What we miss is something that tells the PMU code "we're in a context
> where host userspace isn't present", and this would be completely

Could you please elaborate?
I'm not sure I understand the above correctly.
Since the task actually has host userspace, which could be using
the PMU, and both the host EL0 and guest EL0 events are associated
with the task context of the perf_cpu_context, I think the
"something" we want to express would be subtle (I would assume it
is similar to what we mean by exclude_guest == 0 && exclude_host == 1
in the event attr for the guest, in terms of events?).
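
For reference, the vPMU emulation creates its perf events with
roughly the following attributes (illustrative, not a literal copy
of the pmu-emul.c code):

	struct perf_event_attr attr = {
		.exclude_host	= 1,	/* don't count host activity */
		.exclude_guest	= 0,	/* do count guest activity */
		/* ... */
	};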


> skipped, relying on KVM to restore the appropriate state on
> vcpu_put(). But then the IPI stuff that controls EL0 can always come
> in and wreck things. Gahhh...
> 
> I'm a bit reluctant to use the "save/restore all the time" hammer,
> because it only hides that the EL0 counter infrastructure is a bit
> broken.

Looking at the current code only: since KVM silently modifies the
PMU register (PMUSERENR_EL0) directly, even though KVM is in
general a client of perf, my original thought was that it made
sense to have KVM restore the register value.


> > With VHE, fix this by setting those bits of the register on every
> > guest entry (as with nVHE).  Also, opportunistically make a similar
> > change for PMSELR_EL0, which is currently cleared by vcpu_load(), to
> > ensure it is always set to zero on guest entry (a PMXEVCNTR_EL0
> > access might UNDEF at EL1 instead of being trapped to EL2, depending
> > on the value of PMSELR_EL0).  I think this is more robust, although
> > I couldn't find any kernel code that writes to PMSELR_EL0.
> 
> This was changed a while ago to avoid using the selection register,
> see 0fdf1bb75953 ("arm64: perf: Avoid PMXEV* indirection"), and the
> rationale behind the reset of PMSELR_EL0 in 21cbe3cc8a48 ("arm64: KVM:
> pmu: Reset PMSELR_EL0.SEL to a sane value before entering the guest").
> 
> We *could* simply drop this zeroing of PMSELR_EL0 now that there is
> nothing else host-side that writes to it. But we need to agree on how
> to fix the above first.

We don't have to clear PMSELR_EL0 on every guest entry, but I would
think we should still do that at least in vcpu_load(), since the
host EL0 can now have direct access to PMSELR_EL0 (it should be
fine with a sane EL0, though).

Thank you,
Reiji

* Re: [PATCH v1 0/2] KVM: arm64: PMU: Correct the handling of PMUSERENR_EL0
  2023-03-29  0:21 ` Reiji Watanabe
@ 2023-04-04 10:05   ` Marc Zyngier
  -1 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2023-04-04 10:05 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Reiji Watanabe, Oliver Upton, kvmarm, kvm, linux-arm-kernel,
	James Morse, Alexandru Elisei, Zenghui Yu, Suzuki K Poulose,
	Paolo Bonzini, Ricardo Koller, Jing Zhang, Raghavendra Rao Anata,
	Will Deacon

+ Mark, as we just discussed this.

On Wed, 29 Mar 2023 01:21:34 +0100,
Reiji Watanabe <reijiw@google.com> wrote:
> 
> This series fixes bugs in KVM's handling of PMUSERENR_EL0.
> 
> With EL0 PMU access support [1], the perf subsystem sets the CR
> and ER bits of PMUSERENR_EL0 as needed to allow EL0 direct access
> to the PMU counters.  However, KVM assumes that the register value
> is always zero for the host EL0, which leads to the following two
> problems in its handling of the register.
> 
> [A] The host EL0 might lose its direct access to the PMU counters,
>     as KVM always clears PMUSERENR_EL0 before returning to userspace.
> 
> [B] With VHE, the guest EL0 access to PMU counters might be trapped
>     to EL1 instead of to EL2 (even when PMUSERENR_EL0 for the guest
>     indicates that the guest EL0 has access to the counters).
>     This is because, with VHE, KVM sets the ER, CR, SW and EN bits of
>     PMUSERENR_EL0 to 1 on vcpu_load() to ensure that PMU accesses
>     from the guest EL0 are trapped to EL2, but those bits might be
>     cleared by the perf subsystem after vcpu_load() (when PMU
>     counters are programmed for the vPMU emulation).
> 
> Patch 1 fixes [A], and Patch 2 fixes [B].
> The series is based on v6.3-rc4.
> 
> [1] https://github.com/torvalds/linux/commit/83a7a4d643d33a8b74a42229346b7ed7139fcef9
> 
> Reiji Watanabe (2):
>   KVM: arm64: PMU: Restore the host's PMUSERENR_EL0
>   KVM: arm64: PMU: Ensure to trap PMU access from EL0 to EL2
> 
>  arch/arm64/include/asm/kvm_host.h       |  3 +++
>  arch/arm64/kvm/hyp/include/hyp/switch.h | 28 +++++++++++++------------
>  2 files changed, 18 insertions(+), 13 deletions(-)
> 
> 
> base-commit: 197b6b60ae7bc51dd0814953c562833143b292aa
> -- 
> 2.40.0.348.gf938b09366-goog
> 
> 

-- 
Without deviation from the norm, progress is not possible.

* Re: [PATCH v1 2/2] KVM: arm64: PMU: Ensure to trap PMU access from EL0 to EL2
  2023-03-29 12:03     ` Marc Zyngier
@ 2023-04-04 14:25       ` Mark Rutland
  -1 siblings, 0 replies; 20+ messages in thread
From: Mark Rutland @ 2023-04-04 14:25 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Reiji Watanabe, Oliver Upton, kvmarm, kvm, linux-arm-kernel,
	James Morse, Alexandru Elisei, Zenghui Yu, Suzuki K Poulose,
	Paolo Bonzini, Ricardo Koller, Jing Zhang, Raghavendra Rao Anata,
	Will Deacon, Rob Herring

On Wed, Mar 29, 2023 at 01:03:18PM +0100, Marc Zyngier wrote:
> On Wed, 29 Mar 2023 01:21:36 +0100,
> Reiji Watanabe <reijiw@google.com> wrote:
> > 
> > Currently, with VHE, KVM sets the ER, CR, SW and EN bits of
> > PMUSERENR_EL0 to 1 on vcpu_load().  So, if those bits are cleared
> > after vcpu_load() (as the perf subsystem does when PMU counters are
> > programmed for the guest), PMU access from the guest EL0 might be
> > trapped to the guest EL1 directly, regardless of the vCPU's current
> > PMUSERENR_EL0 value.
> 
> + RobH.
> 
> Is that what is done when the event is created and armv8pmu_start()
> called? This is... crap. The EL0 access thing breaks everything, and
> nobody tested it with KVM, obviously.
> 
> I would be tempted to start mitigating it with the following:
> 
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index dde06c0f97f3..8063525bf3dd 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -806,17 +806,19 @@ static void armv8pmu_disable_event(struct perf_event *event)
>  
>  static void armv8pmu_start(struct arm_pmu *cpu_pmu)
>  {
> -	struct perf_event_context *ctx;
> -	int nr_user = 0;
> +	if (sysctl_perf_user_access) {
> +		struct perf_event_context *ctx;
> +		int nr_user = 0;
>  
> -	ctx = perf_cpu_task_ctx();
> -	if (ctx)
> -		nr_user = ctx->nr_user;
> +		ctx = perf_cpu_task_ctx();
> +		if (ctx)
> +			nr_user = ctx->nr_user;
>  
> -	if (sysctl_perf_user_access && nr_user)
> -		armv8pmu_enable_user_access(cpu_pmu);
> -	else
> -		armv8pmu_disable_user_access();
> +		if (nr_user)
> +			armv8pmu_enable_user_access(cpu_pmu);
> +		else
> +			armv8pmu_disable_user_access();
> +	}
>  
>  	/* Enable all counters */
>  	armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
> 
> but that's obviously not enough as we want it to work with EL0 access
> enabled on the host as well.
> 
> What we miss is something that tells the PMU code "we're in a context
> where host userspace isn't present", and this would be completely
> skipped, relying on KVM to restore the appropriate state on
> vcpu_put(). But then the IPI stuff that controls EL0 can always come
> in and wreck things. Gahhh...

AFAICT the perf code only writes to PMUSERENR_EL0 in contexts where IRQs (and
hence preemption) are disabled, so as long as we have a shadow of the host
PMUSERENR value somewhere, I think we can update the perf code with something
like the below?

... unless the KVM code is interruptible before saving the host value, or after
restoring it?

Thanks,
Mark.

---->8----
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index dde06c0f97f3e..bdab3d5cbb5e3 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -741,11 +741,26 @@ static inline u32 armv8pmu_getreset_flags(void)
        return value;
 }
 
-static void armv8pmu_disable_user_access(void)
+static void update_pmuserenr(u64 val)
 {
+       lockdep_assert_irqs_disabled();
+
+       if (IS_ENABLED(CONFIG_KVM)) {
+               struct kvm_vcpu *vcpu = kvm_get_running_vcpu();
+               if (vcpu) {
+                       vcpu->arch.pmuserenr_host = val;
+                       return;
+               }
+       }
+
        write_sysreg(0, pmuserenr_el0);
 }
 
+static void armv8pmu_disable_user_access(void)
+{
+       update_pmuserenr(0);
+}
+
 static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
 {
        int i;
@@ -759,8 +774,7 @@ static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
                        armv8pmu_write_evcntr(i, 0);
        }
 
-       write_sysreg(0, pmuserenr_el0);
-       write_sysreg(ARMV8_PMU_USERENR_ER | ARMV8_PMU_USERENR_CR, pmuserenr_el0);
+       update_pmuserenr(ARMV8_PMU_USERENR_ER | ARMV8_PMU_USERENR_CR);
 }
 
 static void armv8pmu_enable_event(struct perf_event *event)
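
For Mark's shadow value to work end to end, KVM would also need to
save the host's PMUSERENR_EL0 into vcpu->arch.pmuserenr_host around
guest entry and write it back on exit. A minimal sketch under that
assumption (the function names are illustrative, not the eventual
patch):

	static void __pmu_switch_to_guest(struct kvm_vcpu *vcpu)
	{
		/* Snapshot whatever the host perf code last programmed. */
		vcpu->arch.pmuserenr_host = read_sysreg(pmuserenr_el0);

		/* Trap guest EL0 PMU accesses to EL2 (EN/SW/CR/ER set). */
		write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);
	}

	static void __pmu_switch_to_host(struct kvm_vcpu *vcpu)
	{
		/*
		 * Restore the host's view. While the vCPU was loaded,
		 * update_pmuserenr() redirected perf's writes into the
		 * shadow, so this also picks up any changes made in
		 * that window.
		 */
		write_sysreg(vcpu->arch.pmuserenr_host, pmuserenr_el0);
	}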


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 2/2] KVM: arm64: PMU: Ensure to trap PMU access from EL0 to EL2
  2023-04-04 14:25       ` Mark Rutland
@ 2023-04-06  2:28         ` Reiji Watanabe
  0 siblings, 0 replies; 20+ messages in thread
From: Reiji Watanabe @ 2023-04-06  2:28 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Marc Zyngier, Oliver Upton, kvmarm, kvm, linux-arm-kernel,
	James Morse, Alexandru Elisei, Zenghui Yu, Suzuki K Poulose,
	Paolo Bonzini, Ricardo Koller, Jing Zhang, Raghavendra Rao Anata,
	Will Deacon, Rob Herring

On Tue, Apr 04, 2023 at 03:25:27PM +0100, Mark Rutland wrote:
> On Wed, Mar 29, 2023 at 01:03:18PM +0100, Marc Zyngier wrote:
> > On Wed, 29 Mar 2023 01:21:36 +0100,
> > Reiji Watanabe <reijiw@google.com> wrote:
> > > 
> > > Currently, with VHE, KVM sets the ER, CR, SW and EN bits of
> > > PMUSERENR_EL0 to 1 on vcpu_load().  So, if those bits are cleared
> > > after vcpu_load() (which the perf subsystem does when PMU counters
> > > are programmed for the guest), PMU access from the guest EL0 might
> > > be trapped directly to the guest EL1, regardless of the vCPU's
> > > current PMUSERENR_EL0 value.
> > 
> > + RobH.
> > 
> > Is that what is done when the event is created and armv8pmu_start()
> > is called? This is... crap. The EL0 access thing breaks everything,
> > and nobody tested it with KVM, obviously.
> > 
> > I would be tempted to start mitigating it with the following:
> > 
> > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > index dde06c0f97f3..8063525bf3dd 100644
> > --- a/arch/arm64/kernel/perf_event.c
> > +++ b/arch/arm64/kernel/perf_event.c
> > @@ -806,17 +806,19 @@ static void armv8pmu_disable_event(struct perf_event *event)
> >  
> >  static void armv8pmu_start(struct arm_pmu *cpu_pmu)
> >  {
> > -	struct perf_event_context *ctx;
> > -	int nr_user = 0;
> > +	if (sysctl_perf_user_access) {
> > +		struct perf_event_context *ctx;
> > +		int nr_user = 0;
> >  
> > -	ctx = perf_cpu_task_ctx();
> > -	if (ctx)
> > -		nr_user = ctx->nr_user;
> > +		ctx = perf_cpu_task_ctx();
> > +		if (ctx)
> > +			nr_user = ctx->nr_user;
> >  
> > -	if (sysctl_perf_user_access && nr_user)
> > -		armv8pmu_enable_user_access(cpu_pmu);
> > -	else
> > -		armv8pmu_disable_user_access();
> > +		if (nr_user)
> > +			armv8pmu_enable_user_access(cpu_pmu);
> > +		else
> > +			armv8pmu_disable_user_access();
> > +	}
> >  
> >  	/* Enable all counters */
> >  	armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
> > 
> > but that's obviously not enough as we want it to work with EL0 access
> > enabled on the host as well.
> > 
> > What we're missing is something that tells the PMU code "we're in a
> > context where host userspace isn't present", so that this would be
> > completely skipped, relying on KVM to restore the appropriate state
> > on vcpu_put(). But then the IPI stuff that controls EL0 access can
> > always come in and wreck things. Gahhh...
> 
> AFAICT the perf code only writes to PMUSERENR_EL0 in contexts where IRQs (and
> hence preemption) are disabled, so as long as we have a shadow of the host
> PMUSERENR value somewhere, I think we can update the perf code with something
> like the below?
> 
> ... unless the KVM code is interruptible before saving the host value, or after
> restoring it?

Thank you for the suggestion.
I will update my patch accordingly.

Thank you,
Reiji


> 
> Thanks,
> Mark.
> 
> ---->8----
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index dde06c0f97f3e..bdab3d5cbb5e3 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -741,11 +741,26 @@ static inline u32 armv8pmu_getreset_flags(void)
>         return value;
>  }
>  
> -static void armv8pmu_disable_user_access(void)
> +static void update_pmuserenr(u64 val)
>  {
> +       lockdep_assert_irqs_disabled();
> +
> +       if (IS_ENABLED(CONFIG_KVM)) {
> +               struct kvm_vcpu *vcpu = kvm_get_running_vcpu();
> +               if (vcpu) {
> +                       vcpu->arch.pmuserenr_host = val;
> +                       return;
> +               }
> +       }
> +
> -       write_sysreg(0, pmuserenr_el0);
> +       write_sysreg(val, pmuserenr_el0);
>  }
>  
> +static void armv8pmu_disable_user_access(void)
> +{
> +       update_pmuserenr(0);
> +}
> +
>  static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
>  {
>         int i;
> @@ -759,8 +774,7 @@ static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
>                         armv8pmu_write_evcntr(i, 0);
>         }
>  
> -       write_sysreg(0, pmuserenr_el0);
> -       write_sysreg(ARMV8_PMU_USERENR_ER | ARMV8_PMU_USERENR_CR, pmuserenr_el0);
> +       update_pmuserenr(ARMV8_PMU_USERENR_ER | ARMV8_PMU_USERENR_CR);
>  }
>  
>  static void armv8pmu_enable_event(struct perf_event *event)
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2023-04-06  2:29 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-29  0:21 [PATCH v1 0/2] KVM: arm64: PMU: Correct the handling of PMUSERENR_EL0 Reiji Watanabe
2023-03-29  0:21 ` Reiji Watanabe
2023-03-29  0:21 ` [PATCH v1 1/2] KVM: arm64: PMU: Restore the host's PMUSERENR_EL0 Reiji Watanabe
2023-03-29  0:21   ` Reiji Watanabe
2023-03-29  7:31   ` Marc Zyngier
2023-03-29  7:31     ` Marc Zyngier
2023-03-29 16:28     ` Reiji Watanabe
2023-03-29 16:28       ` Reiji Watanabe
2023-03-29  0:21 ` [PATCH v1 2/2] KVM: arm64: PMU: Ensure to trap PMU access from EL0 to EL2 Reiji Watanabe
2023-03-29  0:21   ` Reiji Watanabe
2023-03-29 12:03   ` Marc Zyngier
2023-03-29 12:03     ` Marc Zyngier
2023-03-30  3:55     ` Reiji Watanabe
2023-03-30  3:55       ` Reiji Watanabe
2023-04-04 14:25     ` Mark Rutland
2023-04-04 14:25       ` Mark Rutland
2023-04-06  2:28       ` Reiji Watanabe
2023-04-06  2:28         ` Reiji Watanabe
2023-04-04 10:05 ` [PATCH v1 0/2] KVM: arm64: PMU: Correct the handling of PMUSERENR_EL0 Marc Zyngier
2023-04-04 10:05   ` Marc Zyngier
