kvm.vger.kernel.org archive mirror
* REGRESSION: Upgrading host kernel from 5.11 to 5.13 breaks QEMU guests - perf/fw_devlink/kvm
@ 2021-09-18 16:17 Russell King (Oracle)
  2021-09-19 13:36 ` Marc Zyngier
  0 siblings, 1 reply; 6+ messages in thread
From: Russell King (Oracle) @ 2021-09-18 16:17 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm
  Cc: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Will Deacon

Hi,

This morning, I upgraded my VM host from Debian Buster to Debian
Bullseye. This host (Armada 8040) runs libvirt. I placed all the VMs
into managedsave, which basically means their state is saved out by
QEMU ready to be resumed once the upgrade is complete.

Initially, I was running 5.11 on the host, which didn't have POSIX
ACLs enabled on tmpfs. Sadly, upgrading to Bullseye now requires
this to be enabled, so I upgraded the kernel to 5.13 to avoid this
problem - without POSIX ACLs on tmpfs, qemu refuses to even start.

Under Bullseye with 5.13, I could start new VMs, but I could not
resume the saved VMs - it would spit out:

2021-09-18T11:14:14.456227Z qemu-system-aarch64: Invalid value 236 expecting positive value <= 162
2021-09-18T11:14:14.456269Z qemu-system-aarch64: Failed to load cpu:cpreg_vmstate_array_len
2021-09-18T11:14:14.456279Z qemu-system-aarch64: error while loading state for instance 0x0 of device 'cpu'
2021-09-18T11:14:14.456887Z qemu-system-aarch64: load of migration failed: Invalid argument

Essentially, what this is complaining about is that the register
list returned by the KVM_GET_REG_LIST ioctl has reduced in size,
meaning that the saved VM has more registers that need to be set
(236 of them, after QEMU's filtering) than those which are actually
available under 5.13 (162 after QEMU's filtering).

After much debugging, and writing a program to create a VM and CPU,
and retrieve the register list etc, I have finally tracked down
exactly what is going on...

Under 5.13, KVM believes there is no PMU, so it doesn't advertise the
PMU registers, despite the hardware having a PMU and the kernel
having support.

kvm_arm_support_pmu_v3() determines whether the PMU_V3 feature is
available or not.

Under 5.11, this was determined via:

	perf_num_counters() > 0

which in turn was derived from (essentially):

	__oprofile_cpu_pmu && __oprofile_cpu_pmu->num_events > 0

__oprofile_cpu_pmu is only set once the PMU driver has initialised,
which, due to initialisation ordering, appears to happen much later
in the kernel boot.

However, under 5.13, oprofile has been removed, and this test is no
longer present. Instead, KVM attempts to determine the availability
of PMUv3 when it initialises, and fails because the PMU has not
yet been initialised. If there is no PMU at the point KVM initialises,
then KVM will never advertise a PMU.

This change of behaviour is what breaks KVM on Armada 8040, causing
a userspace regression.

What makes this more confusing is the PMU errors have gone:

5.13:
[    0.170514] PCI: CLS 0 bytes, default 64
[    0.171085] kvm [1]: IPA Size Limit: 44 bits
[    0.172532] kvm [1]: vgic interrupt IRQ9
[    0.172650] kvm: pmu event creation failed -2
[    0.172688] kvm [1]: Hyp mode initialized successfully
...
[    1.479833] hw perfevents: enabled with armv8_cortex_a72 PMU driver, 7 counters available

5.11:
[    0.138831] PCI: CLS 0 bytes, default 64
[    0.139245] hw perfevents: unable to count PMU IRQs
[    0.139251] hw perfevents: /ap806/config-space@f0000000/pmu: failed to register PMU devices!
[    0.139503] kvm [1]: IPA Size Limit: 44 bits
[    0.140735] kvm [1]: vgic interrupt IRQ9
[    0.140836] kvm [1]: Hyp mode initialized successfully
...
[    1.447789] hw perfevents: enabled with armv8_cortex_a72 PMU driver, 7 counters available

Overall, what this means is that the only way to safely upgrade from
5.11 to 5.13 (I don't know where 5.12 fits in this) is to completely
shut down all your VMs. You can't even migrate them to another
identical machine across these kernels. You have to completely shut
them down.

I did manage to eventually rescue my VMs after many hours, by booting
back into 5.11, and then screwing around with bind mounts to replace
/dev with a filesystem that did have POSIX ACLs enabled, so libvirt
and qemu could then resume the VMs and cleanly shut them all down.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


* Re: REGRESSION: Upgrading host kernel from 5.11 to 5.13 breaks QEMU guests - perf/fw_devlink/kvm
  2021-09-18 16:17 REGRESSION: Upgrading host kernel from 5.11 to 5.13 breaks QEMU guests - perf/fw_devlink/kvm Russell King (Oracle)
@ 2021-09-19 13:36 ` Marc Zyngier
  2021-09-20  9:45   ` Russell King (Oracle)
  2021-09-20  9:56   ` Will Deacon
  0 siblings, 2 replies; 6+ messages in thread
From: Marc Zyngier @ 2021-09-19 13:36 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: linux-arm-kernel, kvm, kvmarm, James Morse, Alexandru Elisei,
	Suzuki K Poulose, Will Deacon

Hi Russell,

Thanks for the heads-up.

On Sat, 18 Sep 2021 17:17:45 +0100,
"Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> 
> Hi,
> 
> This morning, I upgraded my VM host from Debian Buster to Debian
> Bullseye. This host (Armada 8040) runs libvirt. I placed all the VMs
> into managedsave, which basically means their state is saved out by
> QEMU ready to be resumed once the upgrade is complete.
> 
> Initially, I was running 5.11 on the host, which didn't have POSIX
> ACLs enabled on tmpfs. Sadly, upgrading to Bullseye now requires
> this to be enabled, so I upgraded the kernel to 5.13 to avoid this
> problem - without POSIX ACLs on tmpfs, qemu refuses to even start.
> 
> Under Bullseye with 5.13, I could start new VMs, but I could not
> resume the saved VMs - it would spit out:
> 
> 2021-09-18T11:14:14.456227Z qemu-system-aarch64: Invalid value 236 expecting positive value <= 162
> 2021-09-18T11:14:14.456269Z qemu-system-aarch64: Failed to load cpu:cpreg_vmstate_array_len
> 2021-09-18T11:14:14.456279Z qemu-system-aarch64: error while loading state for instance 0x0 of device 'cpu'
> 2021-09-18T11:14:14.456887Z qemu-system-aarch64: load of migration failed: Invalid argument
> 
> Essentially, what this is complaining about is that the register
> list returned by the KVM_GET_REG_LIST ioctl has reduced in size,
> meaning that the saved VM has more registers that need to be set
> (236 of them, after QEMU's filtering) than those which are actually
> available under 5.13 (162 after QEMU's filtering).
> 
> After much debugging, and writing a program to create a VM and CPU,
> and retrieve the register list etc, I have finally tracked down
> exactly what is going on...
> 
> Under 5.13, KVM believes there is no PMU, so it doesn't advertise the
> PMU registers, despite the hardware having a PMU and the kernel
> having support.
> 
> kvm_arm_support_pmu_v3() determines whether the PMU_V3 feature is
> available or not.
> 
> Under 5.11, this was determined via:
> 
> 	perf_num_counters() > 0
> 
> which in turn was derived from (essentially):
> 
> 	__oprofile_cpu_pmu && __oprofile_cpu_pmu->num_events > 0
> 
> __oprofile_cpu_pmu can be set at any time when the PMU has been
> initialised, and due to initialisation ordering, it appears to happen
> much later in the kernel boot.
> 
> However, under 5.13, oprofile has been removed, and this test is no
> longer present. Instead, kvm attempts to determine the availability
> of PMUv3 when it initialises, and fails to because the PMU has not
> yet been initialised. If there is no PMU at the point KVM initialises,
> then KVM will never advertise a PMU.
> 
> This change of behaviour is what breaks KVM on Armada 8040, causing
> a userspace regression.
> 
> What makes this more confusing is the PMU errors have gone:
> 
> 5.13:
> [    0.170514] PCI: CLS 0 bytes, default 64
> [    0.171085] kvm [1]: IPA Size Limit: 44 bits
> [    0.172532] kvm [1]: vgic interrupt IRQ9
> [    0.172650] kvm: pmu event creation failed -2
> [    0.172688] kvm [1]: Hyp mode initialized successfully
> ...
> [    1.479833] hw perfevents: enabled with armv8_cortex_a72 PMU driver, 7 counters available
>

[...]

Urgh. That's a bummer. The PMU driver only comes up once it has found
its interrupt controller, which on the Armada 8040 is not the GIC, but
some weird thing on the side that doesn't actually serve any real
purpose. On HW where the PMU is directly wired into the GIC, it all
works fine, though by luck rather than by design.

Anyway, rant over. This is a bug that needs addressing so that KVM can
initialise correctly irrespective of the probing order. This probably
means that the static key controlling KVM's behaviour wrt the PMU must
be controlled by the PMU infrastructure itself, rather than KVM trying
to probe for it.

Can you please give the following hack a go (on top of 5.15-rc1)? I've
briefly tested it on my McBin, and it did the trick. I've also tested
it on the M1 (which really doesn't have an architectural PMU) to
verify that it was correctly failing.

Thanks,

	M.

From 9c26e3e6bbcbc3a583b3974e7a9017029d31fe29 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz@kernel.org>
Date: Sun, 19 Sep 2021 14:09:49 +0100
Subject: [PATCH] KVM: arm64: Fix PMU probe ordering

Russell reported that since 5.13, KVM's probing of the PMU has
started to fail on his HW. As it turns out, there is an implicit
ordering dependency between the architectural PMU probing code and
KVM's own probing. If, due to probe ordering reasons, KVM probes
before the PMU driver, it will fail to detect the PMU and prevent it
from being advertised to guests as well as the VMM.

Obviously, this is one probing too many, and we should be able to
deal with any ordering.

Add a callback from the PMU code into KVM to advertise the registration
of a host CPU PMU, allowing for any probing order.

Fixes: 5421db1be3b1 ("KVM: arm64: Divorce the perf code from oprofile helpers")
Reported-by: "Russell King (Oracle)" <linux@armlinux.org.uk>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/YUYRKVflRtUytzy5@shell.armlinux.org.uk
Cc: stable@vger.kernel.org
---
 arch/arm64/kvm/perf.c        |  3 ---
 arch/arm64/kvm/pmu-emul.c    | 12 +++++++++++-
 drivers/perf/arm_pmu.c       |  2 ++
 include/kvm/arm_pmu.h        |  3 ---
 include/linux/perf/arm_pmu.h |  6 ++++++
 5 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kvm/perf.c b/arch/arm64/kvm/perf.c
index f9bb3b14130e..c84fe24b2ea1 100644
--- a/arch/arm64/kvm/perf.c
+++ b/arch/arm64/kvm/perf.c
@@ -50,9 +50,6 @@ static struct perf_guest_info_callbacks kvm_guest_cbs = {
 
 int kvm_perf_init(void)
 {
-	if (kvm_pmu_probe_pmuver() != ID_AA64DFR0_PMUVER_IMP_DEF && !is_protected_kvm_enabled())
-		static_branch_enable(&kvm_arm_pmu_available);
-
 	return perf_register_guest_info_callbacks(&kvm_guest_cbs);
 }
 
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index f5065f23b413..588100c52f34 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -740,7 +740,17 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
 	kvm_pmu_create_perf_event(vcpu, select_idx);
 }
 
-int kvm_pmu_probe_pmuver(void)
+void kvm_host_pmu_init(struct arm_pmu *pmu)
+{
+	if (pmu->pmuver != 0 &&
+	    pmu->pmuver != ID_AA64DFR0_PMUVER_IMP_DEF &&
+	    !is_protected_kvm_enabled()) {
+		static_branch_enable(&kvm_arm_pmu_available);
+		kvm_info("PMU detected and enabled\n");
+	}
+}
+
+static int kvm_pmu_probe_pmuver(void)
 {
 	struct perf_event_attr attr = { };
 	struct perf_event *event;
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 3cbc3baf087f..295cc7952d0e 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -952,6 +952,8 @@ int armpmu_register(struct arm_pmu *pmu)
 		pmu->name, pmu->num_events,
 		has_nmi ? ", using NMIs" : "");
 
+	kvm_host_pmu_init(pmu);
+
 	return 0;
 
 out_destroy:
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index 864b9997efb2..90f21898aad8 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -61,7 +61,6 @@ int kvm_arm_pmu_v3_get_attr(struct kvm_vcpu *vcpu,
 int kvm_arm_pmu_v3_has_attr(struct kvm_vcpu *vcpu,
 			    struct kvm_device_attr *attr);
 int kvm_arm_pmu_v3_enable(struct kvm_vcpu *vcpu);
-int kvm_pmu_probe_pmuver(void);
 #else
 struct kvm_pmu {
 };
@@ -118,8 +117,6 @@ static inline u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
 	return 0;
 }
 
-static inline int kvm_pmu_probe_pmuver(void) { return 0xf; }
-
 #endif
 
 #endif
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 505480217cf1..2512e2f9cd4e 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -163,6 +163,12 @@ int arm_pmu_acpi_probe(armpmu_init_fn init_fn);
 static inline int arm_pmu_acpi_probe(armpmu_init_fn init_fn) { return 0; }
 #endif
 
+#ifdef CONFIG_KVM
+void kvm_host_pmu_init(struct arm_pmu *pmu);
+#else
+#define kvm_host_pmu_init(x)	do { } while(0)
+#endif
+
 /* Internal functions only for core arm_pmu code */
 struct arm_pmu *armpmu_alloc(void);
 struct arm_pmu *armpmu_alloc_atomic(void);
-- 
2.30.2


-- 
Without deviation from the norm, progress is not possible.


* Re: REGRESSION: Upgrading host kernel from 5.11 to 5.13 breaks QEMU guests - perf/fw_devlink/kvm
  2021-09-19 13:36 ` Marc Zyngier
@ 2021-09-20  9:45   ` Russell King (Oracle)
  2021-09-20 14:39     ` Andrew Jones
  2021-09-20  9:56   ` Will Deacon
  1 sibling, 1 reply; 6+ messages in thread
From: Russell King (Oracle) @ 2021-09-20  9:45 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvm, kvmarm, James Morse, Alexandru Elisei,
	Suzuki K Poulose, Will Deacon

On Sun, Sep 19, 2021 at 02:36:46PM +0100, Marc Zyngier wrote:
> Urgh. That's a bummer. The PMU driver only comes up once it has found
> its interrupt controller, which on the Armada 8040 is not the GIC, but
> some weird thing on the side that doesn't actually serve any real
> purpose. On HW where the PMU is directly wired into the GIC, it all
> works fine, though by luck rather than by design.
> 
> Anyway, rant over. This is a bug that needs addressing so that KVM can
> initialise correctly irrespective of the probing order. This probably
> means that the static key controlling KVM's behaviour wrt the PMU must
> be controlled by the PMU infrastructure itself, rather than KVM trying
> to probe for it.
> 
> Can you please give the following hack a go (on top of 5.15-rc1)? I've
> briefly tested it on my McBin, and it did the trick. I've also tested
> it on the M1 (which really doesn't have an architectural PMU) to
> verify that it was correctly failing.

My test program that derives the number of registers qemu uses now
reports 236 registers again and I see:

kvm [7]: PMU detected and enabled

in the kernel boot log.

Tested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


* Re: REGRESSION: Upgrading host kernel from 5.11 to 5.13 breaks QEMU guests - perf/fw_devlink/kvm
  2021-09-19 13:36 ` Marc Zyngier
  2021-09-20  9:45   ` Russell King (Oracle)
@ 2021-09-20  9:56   ` Will Deacon
  2021-09-20 10:23     ` Marc Zyngier
  1 sibling, 1 reply; 6+ messages in thread
From: Will Deacon @ 2021-09-20  9:56 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Russell King (Oracle),
	linux-arm-kernel, kvm, kvmarm, James Morse, Alexandru Elisei,
	Suzuki K Poulose

On Sun, Sep 19, 2021 at 02:36:46PM +0100, Marc Zyngier wrote:
> Hi Russell,
> 
> Thanks for the heads-up.
> 
> On Sat, 18 Sep 2021 17:17:45 +0100,
> "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> > 
> > Hi,
> > 
> > This morning, I upgraded my VM host from Debian Buster to Debian
> > Bullseye. This host (Armada 8040) runs libvirt. I placed all the VMs
> > into managedsave, which basically means their state is saved out by
> > QEMU ready to be resumed once the upgrade is complete.
> > 
> > Initially, I was running 5.11 on the host, which didn't have POSIX
> > ACLs enabled on tmpfs. Sadly, upgrading to Bullseye now requires
> > this to be enabled, so I upgraded the kernel to 5.13 to avoid this
> > problem - without POSIX ACLs on tmpfs, qemu refuses to even start.
> > 
> > Under Bullseye with 5.13, I could start new VMs, but I could not
> > resume the saved VMs - it would spit out:
> > 
> > 2021-09-18T11:14:14.456227Z qemu-system-aarch64: Invalid value 236 expecting positive value <= 162
> > 2021-09-18T11:14:14.456269Z qemu-system-aarch64: Failed to load cpu:cpreg_vmstate_array_len
> > 2021-09-18T11:14:14.456279Z qemu-system-aarch64: error while loading state for instance 0x0 of device 'cpu'
> > 2021-09-18T11:14:14.456887Z qemu-system-aarch64: load of migration failed: Invalid argument
> > 
> > Essentially, what this is complaining about is that the register
> > list returned by the KVM_GET_REG_LIST ioctl has reduced in size,
> > meaning that the saved VM has more registers that need to be set
> > (236 of them, after QEMU's filtering) than those which are actually
> > available under 5.13 (162 after QEMU's filtering).
> > 
> > After much debugging, and writing a program to create a VM and CPU,
> > and retrieve the register list etc, I have finally tracked down
> > exactly what is going on...
> > 
> > Under 5.13, KVM believes there is no PMU, so it doesn't advertise the
> > PMU registers, despite the hardware having a PMU and the kernel
> > having support.
> > 
> > kvm_arm_support_pmu_v3() determines whether the PMU_V3 feature is
> > available or not.
> > 
> > Under 5.11, this was determined via:
> > 
> > 	perf_num_counters() > 0
> > 
> > which in turn was derived from (essentially):
> > 
> > 	__oprofile_cpu_pmu && __oprofile_cpu_pmu->num_events > 0
> > 
> > __oprofile_cpu_pmu can be set at any time when the PMU has been
> > initialised, and due to initialisation ordering, it appears to happen
> > much later in the kernel boot.
> > 
> > However, under 5.13, oprofile has been removed, and this test is no
> > longer present. Instead, kvm attempts to determine the availability
> > of PMUv3 when it initialises, and fails to because the PMU has not
> > yet been initialised. If there is no PMU at the point KVM initialises,
> > then KVM will never advertise a PMU.
> > 
> > This change of behaviour is what breaks KVM on Armada 8040, causing
> > a userspace regression.
> > 
> > What makes this more confusing is the PMU errors have gone:
> > 
> > 5.13:
> > [    0.170514] PCI: CLS 0 bytes, default 64
> > [    0.171085] kvm [1]: IPA Size Limit: 44 bits
> > [    0.172532] kvm [1]: vgic interrupt IRQ9
> > [    0.172650] kvm: pmu event creation failed -2
> > [    0.172688] kvm [1]: Hyp mode initialized successfully
> > ...
> > [    1.479833] hw perfevents: enabled with armv8_cortex_a72 PMU driver, 7 counters available
> >
> 
> [...]
> 
> Urgh. That's a bummer. The PMU driver only comes up once it has found
> its interrupt controller, which on the Armada 8040 is not the GIC, but
> some weird thing on the side that doesn't actually serve any real
> purpose. On HW where the PMU is directly wired into the GIC, it all
> works fine, though by luck rather than by design.
> 
> Anyway, rant over. This is a bug that needs addressing so that KVM can
> initialise correctly irrespective of the probing order. This probably
> means that the static key controlling KVM's behaviour wrt the PMU must
> be controlled by the PMU infrastructure itself, rather than KVM trying
> to probe for it.
> 
> Can you please give the following hack a go (on top of 5.15-rc1)? I've
> briefly tested it on my McBin, and it did the trick. I've also tested
> it on the M1 (which really doesn't have an architectural PMU) to
> verify that it was correctly failing.
> 
> Thanks,
> 
> 	M.
> 
> From 9c26e3e6bbcbc3a583b3974e7a9017029d31fe29 Mon Sep 17 00:00:00 2001
> From: Marc Zyngier <maz@kernel.org>
> Date: Sun, 19 Sep 2021 14:09:49 +0100
> Subject: [PATCH] KVM: arm64: Fix PMU probe ordering
> 
> Russell reported that since 5.13, KVM's probing of the PMU has
> started to fail on his HW. As it turns out, there is an implicit
> ordering dependency between the architectural PMU probing code and
> KVM's own probing. If, due to probe ordering reasons, KVM probes
> before the PMU driver, it will fail to detect the PMU and prevent it
> from being advertised to guests as well as the VMM.
> 
> Obviously, this is one probing too many, and we should be able to
> deal with any ordering.
> 
> Add a callback from the PMU code into KVM to advertise the registration
> of a host CPU PMU, allowing for any probing order.
> 
> Fixes: 5421db1be3b1 ("KVM: arm64: Divorce the perf code from oprofile helpers")
> Reported-by: "Russell King (Oracle)" <linux@armlinux.org.uk>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Link: https://lore.kernel.org/r/YUYRKVflRtUytzy5@shell.armlinux.org.uk
> Cc: stable@vger.kernel.org
> ---
>  arch/arm64/kvm/perf.c        |  3 ---
>  arch/arm64/kvm/pmu-emul.c    | 12 +++++++++++-
>  drivers/perf/arm_pmu.c       |  2 ++
>  include/kvm/arm_pmu.h        |  3 ---
>  include/linux/perf/arm_pmu.h |  6 ++++++
>  5 files changed, 19 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm64/kvm/perf.c b/arch/arm64/kvm/perf.c
> index f9bb3b14130e..c84fe24b2ea1 100644
> --- a/arch/arm64/kvm/perf.c
> +++ b/arch/arm64/kvm/perf.c
> @@ -50,9 +50,6 @@ static struct perf_guest_info_callbacks kvm_guest_cbs = {
>  
>  int kvm_perf_init(void)
>  {
> -	if (kvm_pmu_probe_pmuver() != ID_AA64DFR0_PMUVER_IMP_DEF && !is_protected_kvm_enabled())
> -		static_branch_enable(&kvm_arm_pmu_available);
> -
>  	return perf_register_guest_info_callbacks(&kvm_guest_cbs);
>  }
>  
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index f5065f23b413..588100c52f34 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -740,7 +740,17 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
>  	kvm_pmu_create_perf_event(vcpu, select_idx);
>  }
>  
> -int kvm_pmu_probe_pmuver(void)
> +void kvm_host_pmu_init(struct arm_pmu *pmu)
> +{
> +	if (pmu->pmuver != 0 &&
> +	    pmu->pmuver != ID_AA64DFR0_PMUVER_IMP_DEF &&
> +	    !is_protected_kvm_enabled()) {
> +		static_branch_enable(&kvm_arm_pmu_available);
> +		kvm_info("PMU detected and enabled\n");
> +	}
> +}
> +
> +static int kvm_pmu_probe_pmuver(void)
>  {
>  	struct perf_event_attr attr = { };
>  	struct perf_event *event;
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 3cbc3baf087f..295cc7952d0e 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -952,6 +952,8 @@ int armpmu_register(struct arm_pmu *pmu)
>  		pmu->name, pmu->num_events,
>  		has_nmi ? ", using NMIs" : "");
>  
> +	kvm_host_pmu_init(pmu);

Just a nit, but I think this will get called for each PMU we probe on a
big.LITTLE system, which is probably harmless but possibly not what you want?

Will


* Re: REGRESSION: Upgrading host kernel from 5.11 to 5.13 breaks QEMU guests - perf/fw_devlink/kvm
  2021-09-20  9:56   ` Will Deacon
@ 2021-09-20 10:23     ` Marc Zyngier
  0 siblings, 0 replies; 6+ messages in thread
From: Marc Zyngier @ 2021-09-20 10:23 UTC (permalink / raw)
  To: Will Deacon
  Cc: Russell King (Oracle),
	linux-arm-kernel, kvm, kvmarm, James Morse, Alexandru Elisei,
	Suzuki K Poulose

On Mon, 20 Sep 2021 10:56:57 +0100,
Will Deacon <will@kernel.org> wrote:
> 
> On Sun, Sep 19, 2021 at 02:36:46PM +0100, Marc Zyngier wrote:
> > From 9c26e3e6bbcbc3a583b3974e7a9017029d31fe29 Mon Sep 17 00:00:00 2001
> > From: Marc Zyngier <maz@kernel.org>
> > Date: Sun, 19 Sep 2021 14:09:49 +0100
> > Subject: [PATCH] KVM: arm64: Fix PMU probe ordering
> > 
> > Russell reported that since 5.13, KVM's probing of the PMU has
> > started to fail on his HW. As it turns out, there is an implicit
> > ordering dependency between the architectural PMU probing code and
> > KVM's own probing. If, due to probe ordering reasons, KVM probes
> > before the PMU driver, it will fail to detect the PMU and prevent it
> > from being advertised to guests as well as the VMM.
> > 
> > Obviously, this is one probing too many, and we should be able to
> > deal with any ordering.
> > 
> > Add a callback from the PMU code into KVM to advertise the registration
> > of a host CPU PMU, allowing for any probing order.
> > 
> > Fixes: 5421db1be3b1 ("KVM: arm64: Divorce the perf code from oprofile helpers")
> > Reported-by: "Russell King (Oracle)" <linux@armlinux.org.uk>
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > Link: https://lore.kernel.org/r/YUYRKVflRtUytzy5@shell.armlinux.org.uk
> > Cc: stable@vger.kernel.org
> > ---
> >  arch/arm64/kvm/perf.c        |  3 ---
> >  arch/arm64/kvm/pmu-emul.c    | 12 +++++++++++-
> >  drivers/perf/arm_pmu.c       |  2 ++
> >  include/kvm/arm_pmu.h        |  3 ---
> >  include/linux/perf/arm_pmu.h |  6 ++++++
> >  5 files changed, 19 insertions(+), 7 deletions(-)
> > 
> > diff --git a/arch/arm64/kvm/perf.c b/arch/arm64/kvm/perf.c
> > index f9bb3b14130e..c84fe24b2ea1 100644
> > --- a/arch/arm64/kvm/perf.c
> > +++ b/arch/arm64/kvm/perf.c
> > @@ -50,9 +50,6 @@ static struct perf_guest_info_callbacks kvm_guest_cbs = {
> >  
> >  int kvm_perf_init(void)
> >  {
> > -	if (kvm_pmu_probe_pmuver() != ID_AA64DFR0_PMUVER_IMP_DEF && !is_protected_kvm_enabled())
> > -		static_branch_enable(&kvm_arm_pmu_available);
> > -
> >  	return perf_register_guest_info_callbacks(&kvm_guest_cbs);
> >  }
> >  
> > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> > index f5065f23b413..588100c52f34 100644
> > --- a/arch/arm64/kvm/pmu-emul.c
> > +++ b/arch/arm64/kvm/pmu-emul.c
> > @@ -740,7 +740,17 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
> >  	kvm_pmu_create_perf_event(vcpu, select_idx);
> >  }
> >  
> > -int kvm_pmu_probe_pmuver(void)
> > +void kvm_host_pmu_init(struct arm_pmu *pmu)
> > +{
> > +	if (pmu->pmuver != 0 &&
> > +	    pmu->pmuver != ID_AA64DFR0_PMUVER_IMP_DEF &&
> > +	    !is_protected_kvm_enabled()) {
> > +		static_branch_enable(&kvm_arm_pmu_available);
> > +		kvm_info("PMU detected and enabled\n");
> > +	}
> > +}
> > +
> > +static int kvm_pmu_probe_pmuver(void)
> >  {
> >  	struct perf_event_attr attr = { };
> >  	struct perf_event *event;
> > diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> > index 3cbc3baf087f..295cc7952d0e 100644
> > --- a/drivers/perf/arm_pmu.c
> > +++ b/drivers/perf/arm_pmu.c
> > @@ -952,6 +952,8 @@ int armpmu_register(struct arm_pmu *pmu)
> >  		pmu->name, pmu->num_events,
> >  		has_nmi ? ", using NMIs" : "");
> >  
> > +	kvm_host_pmu_init(pmu);
> 
> Just a nit, but I think this will get called for each PMU we probe
> on a big.LITTLE system which is probably harmless, but possibly not
> what you want?

Yeah, it is a bit ugly, but harmless. In the future, it would be
useful to track which PMU is used on which CPUs, and this will give us
a decent hook.

I'll tone the print down though.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: REGRESSION: Upgrading host kernel from 5.11 to 5.13 breaks QEMU guests - perf/fw_devlink/kvm
  2021-09-20  9:45   ` Russell King (Oracle)
@ 2021-09-20 14:39     ` Andrew Jones
  0 siblings, 0 replies; 6+ messages in thread
From: Andrew Jones @ 2021-09-20 14:39 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Marc Zyngier, kvm, Will Deacon, kvmarm, linux-arm-kernel

On Mon, Sep 20, 2021 at 10:45:26AM +0100, Russell King (Oracle) wrote:
> On Sun, Sep 19, 2021 at 02:36:46PM +0100, Marc Zyngier wrote:
> > Urgh. That's a bummer. The PMU driver only comes up once it has found
> > its interrupt controller, which on the Armada 8040 is not the GIC, but
> > some weird thing on the side that doesn't actually serve any real
> > purpose. On HW where the PMU is directly wired into the GIC, it all
> > works fine, though by luck rather than by design.
> > 
> > Anyway, rant over. This is a bug that needs addressing so that KVM can
> > initialise correctly irrespective of the probing order. This probably
> > means that the static key controlling KVM's behaviour wrt the PMU must
> > be controlled by the PMU infrastructure itself, rather than KVM trying
> > to probe for it.
> > 
> > Can you please give the following hack a go (on top of 5.15-rc1)? I've
> > briefly tested it on my McBin, and it did the trick. I've also tested
> > it on the M1 (which really doesn't have an architectural PMU) to
> > verify that it was correctly failing.
> 
> My test program that derives the number of registers qemu uses now
> reports 236 registers again and I see:

Hi Russell,

You may be interested in kvm selftests and this one[1] in particular.

[1] tools/testing/selftests/kvm/aarch64/get-reg-list.c

Thanks,
drew


> 
> kvm [7]: PMU detected and enabled
> 
> in the kernel boot log.
> 
> Tested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> 
> Thanks.
> 
> -- 
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
> 


