All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/4] KVM: arm64: Improve PMU support on heterogeneous systems
@ 2021-12-13 15:23 ` Alexandru Elisei
  0 siblings, 0 replies; 32+ messages in thread
From: Alexandru Elisei @ 2021-12-13 15:23 UTC (permalink / raw)
  To: maz, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm
  Cc: tglx, mingo

(CC'ing Peter Maydell in case this might be of interest to qemu)

The series can be found on a branch at [1], and the kvmtool support at [2].
The kvmtool patches are also on the mailing list [3] and haven't changed
since v1.

Detailed explanation of the issue and symptoms that the patches attempt to
correct can be found in the cover letter for v1 [4].

A summary of the problem is that on heterogeneous systems KVM will always
use the same PMU for creating the VCPU events for *all* VCPUs regardless of
the physical CPU on which the VCPU is running, leading to events suddenly
stopping and resuming in the guest as the VCPU thread gets migrated across
different CPUs.

This series proposes to fix this behaviour by allowing the user to specify
which physical PMU is used when creating the VCPU events needed for guest
PMU emulation. When the PMU is set, KVM will refuse to the VCPU on a
physical which is not part of the supported CPUs for the specified PMU. The
restriction is that all VCPUs must use the same PMU to avoid emulating an
asymmetric platform.

The default behaviour stays the same - without userspace setting the PMU,
events will stop counting if the VCPU is scheduled on the wrong CPU.

Tested with a hacked version of kvmtool that does the PMU initialization
from the VCPU thread as opposed to from the main thread. Tested on
rockpro64 by testing what happens when all VCPUs having the same PMU, one
random VCPU having a different PMU than the other VCPUs and one random VCPU
not having the PMU set (each test was run 1,000 times on the little cores
and 1,000 times on the big cores).

Also tested on an Altra by testing all VCPUs having the same PMU, all VCPUs
not having a PMU set, and one random VCPU not having the PMU set; the VM
had 64 threads in each of the tests and each test was run 10,000 times.

Changes since v2 [5]:

- Rebased on top of v5.16-rc5
- Check that all VCPUs have the same PMU set (or none at all).
- Use the VCPU's PMUVer value when calculating the event mask, if a PMU is
  set for that VCPU.
- Clear the unsupported CPU flag in vcpu_put().
- Move the handling of the unsupported CPU flag in kvm_vcpu_exit_request().
- Free the cpumask of supported CPUs if kvm_arch_vcpu_create() fails.

Changes since v1 [4]:

- Rebased on top of v5.16-rc4
- Implemented review comments: protect iterating through the list of PMUs
  with a mutex, documentation changes, initialize vcpu-arch.supported_cpus
  to cpu_possible_mask, changed vcpu->arch.cpu_not_supported to a VCPU
  flag, set exit reason to KVM_EXIT_FAIL_ENTRY and populate fail_entry when
  the VCPU is run on a CPU not in the PMU's supported cpumask. Many thanks
  for the review!

[1] https://gitlab.arm.com/linux-arm/linux-ae/-/tree/pmu-big-little-fix-v3
[2] https://gitlab.arm.com/linux-arm/kvmtool-ae/-/tree/pmu-big-little-fix-v1
[3] https://www.spinics.net/lists/arm-kernel/msg933584.html
[4] https://www.spinics.net/lists/arm-kernel/msg933579.html
[5] https://www.spinics.net/lists/kvm-arm/msg50944.html


Alexandru Elisei (4):
  perf: Fix wrong name in comment for struct perf_cpu_context
  KVM: arm64: Keep a list of probed PMUs
  KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
  KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical
    CPU

 Documentation/virt/kvm/devices/vcpu.rst |  34 ++++++-
 arch/arm64/include/asm/kvm_host.h       |  12 +++
 arch/arm64/include/uapi/asm/kvm.h       |   4 +
 arch/arm64/kvm/arm.c                    |  29 +++++-
 arch/arm64/kvm/pmu-emul.c               | 114 ++++++++++++++++++++----
 include/kvm/arm_pmu.h                   |   9 +-
 include/linux/perf_event.h              |   2 +-
 tools/arch/arm64/include/uapi/asm/kvm.h |   1 +
 8 files changed, 180 insertions(+), 25 deletions(-)

-- 
2.34.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v3 0/4] KVM: arm64: Improve PMU support on heterogeneous systems
@ 2021-12-13 15:23 ` Alexandru Elisei
  0 siblings, 0 replies; 32+ messages in thread
From: Alexandru Elisei @ 2021-12-13 15:23 UTC (permalink / raw)
  To: maz, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm
  Cc: tglx, mingo, peter.maydell

(CC'ing Peter Maydell in case this might be of interest to qemu)

The series can be found on a branch at [1], and the kvmtool support at [2].
The kvmtool patches are also on the mailing list [3] and haven't changed
since v1.

Detailed explanation of the issue and symptoms that the patches attempt to
correct can be found in the cover letter for v1 [4].

A summary of the problem is that on heterogeneous systems KVM will always
use the same PMU for creating the VCPU events for *all* VCPUs regardless of
the physical CPU on which the VCPU is running, leading to events suddenly
stopping and resuming in the guest as the VCPU thread gets migrated across
different CPUs.

This series proposes to fix this behaviour by allowing the user to specify
which physical PMU is used when creating the VCPU events needed for guest
PMU emulation. When the PMU is set, KVM will refuse to the VCPU on a
physical which is not part of the supported CPUs for the specified PMU. The
restriction is that all VCPUs must use the same PMU to avoid emulating an
asymmetric platform.

The default behaviour stays the same - without userspace setting the PMU,
events will stop counting if the VCPU is scheduled on the wrong CPU.

Tested with a hacked version of kvmtool that does the PMU initialization
from the VCPU thread as opposed to from the main thread. Tested on
rockpro64 by testing what happens when all VCPUs having the same PMU, one
random VCPU having a different PMU than the other VCPUs and one random VCPU
not having the PMU set (each test was run 1,000 times on the little cores
and 1,000 times on the big cores).

Also tested on an Altra by testing all VCPUs having the same PMU, all VCPUs
not having a PMU set, and one random VCPU not having the PMU set; the VM
had 64 threads in each of the tests and each test was run 10,000 times.

Changes since v2 [5]:

- Rebased on top of v5.16-rc5
- Check that all VCPUs have the same PMU set (or none at all).
- Use the VCPU's PMUVer value when calculating the event mask, if a PMU is
  set for that VCPU.
- Clear the unsupported CPU flag in vcpu_put().
- Move the handling of the unsupported CPU flag in kvm_vcpu_exit_request().
- Free the cpumask of supported CPUs if kvm_arch_vcpu_create() fails.

Changes since v1 [4]:

- Rebased on top of v5.16-rc4
- Implemented review comments: protect iterating through the list of PMUs
  with a mutex, documentation changes, initialize vcpu-arch.supported_cpus
  to cpu_possible_mask, changed vcpu->arch.cpu_not_supported to a VCPU
  flag, set exit reason to KVM_EXIT_FAIL_ENTRY and populate fail_entry when
  the VCPU is run on a CPU not in the PMU's supported cpumask. Many thanks
  for the review!

[1] https://gitlab.arm.com/linux-arm/linux-ae/-/tree/pmu-big-little-fix-v3
[2] https://gitlab.arm.com/linux-arm/kvmtool-ae/-/tree/pmu-big-little-fix-v1
[3] https://www.spinics.net/lists/arm-kernel/msg933584.html
[4] https://www.spinics.net/lists/arm-kernel/msg933579.html
[5] https://www.spinics.net/lists/kvm-arm/msg50944.html


Alexandru Elisei (4):
  perf: Fix wrong name in comment for struct perf_cpu_context
  KVM: arm64: Keep a list of probed PMUs
  KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
  KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical
    CPU

 Documentation/virt/kvm/devices/vcpu.rst |  34 ++++++-
 arch/arm64/include/asm/kvm_host.h       |  12 +++
 arch/arm64/include/uapi/asm/kvm.h       |   4 +
 arch/arm64/kvm/arm.c                    |  29 +++++-
 arch/arm64/kvm/pmu-emul.c               | 114 ++++++++++++++++++++----
 include/kvm/arm_pmu.h                   |   9 +-
 include/linux/perf_event.h              |   2 +-
 tools/arch/arm64/include/uapi/asm/kvm.h |   1 +
 8 files changed, 180 insertions(+), 25 deletions(-)

-- 
2.34.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v3 1/4] perf: Fix wrong name in comment for struct perf_cpu_context
  2021-12-13 15:23 ` Alexandru Elisei
@ 2021-12-13 15:23   ` Alexandru Elisei
  -1 siblings, 0 replies; 32+ messages in thread
From: Alexandru Elisei @ 2021-12-13 15:23 UTC (permalink / raw)
  To: maz, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm
  Cc: tglx, mingo

Commit 0793a61d4df8 ("performance counters: core code") added the perf
subsystem (then called Performance Counters) to Linux, creating the struct
perf_cpu_context. The comment for the struct referred to it as a "struct
perf_counter_cpu_context".

Commit cdd6c482c9ff ("perf: Do the big rename: Performance Counters ->
Performance Events") changed the comment to refer to a "struct
perf_event_cpu_context", which was still the wrong name for the struct.

Change the comment to say "struct perf_cpu_context".

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 include/linux/perf_event.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 0dcfd265beed..14132570ea5d 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -862,7 +862,7 @@ struct perf_event_context {
 #define PERF_NR_CONTEXTS	4
 
 /**
- * struct perf_event_cpu_context - per cpu event context structure
+ * struct perf_cpu_context - per cpu event context structure
  */
 struct perf_cpu_context {
 	struct perf_event_context	ctx;
-- 
2.34.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v3 1/4] perf: Fix wrong name in comment for struct perf_cpu_context
@ 2021-12-13 15:23   ` Alexandru Elisei
  0 siblings, 0 replies; 32+ messages in thread
From: Alexandru Elisei @ 2021-12-13 15:23 UTC (permalink / raw)
  To: maz, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm
  Cc: tglx, mingo, peter.maydell

Commit 0793a61d4df8 ("performance counters: core code") added the perf
subsystem (then called Performance Counters) to Linux, creating the struct
perf_cpu_context. The comment for the struct referred to it as a "struct
perf_counter_cpu_context".

Commit cdd6c482c9ff ("perf: Do the big rename: Performance Counters ->
Performance Events") changed the comment to refer to a "struct
perf_event_cpu_context", which was still the wrong name for the struct.

Change the comment to say "struct perf_cpu_context".

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 include/linux/perf_event.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 0dcfd265beed..14132570ea5d 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -862,7 +862,7 @@ struct perf_event_context {
 #define PERF_NR_CONTEXTS	4
 
 /**
- * struct perf_event_cpu_context - per cpu event context structure
+ * struct perf_cpu_context - per cpu event context structure
  */
 struct perf_cpu_context {
 	struct perf_event_context	ctx;
-- 
2.34.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v3 2/4] KVM: arm64: Keep a list of probed PMUs
  2021-12-13 15:23 ` Alexandru Elisei
@ 2021-12-13 15:23   ` Alexandru Elisei
  -1 siblings, 0 replies; 32+ messages in thread
From: Alexandru Elisei @ 2021-12-13 15:23 UTC (permalink / raw)
  To: maz, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm
  Cc: tglx, mingo

The ARM PMU driver calls kvm_host_pmu_init() after probing to tell KVM that
a hardware PMU is available for guest emulation. Heterogeneous systems can
have more than one PMU present, and the callback gets called multiple
times, once for each of them. Keep track of all the PMUs available to KVM,
as they're going to be needed later.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arch/arm64/kvm/pmu-emul.c | 25 +++++++++++++++++++++++--
 include/kvm/arm_pmu.h     |  5 +++++
 2 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index a5e4bbf5e68f..eb4be96f144d 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -7,6 +7,7 @@
 #include <linux/cpu.h>
 #include <linux/kvm.h>
 #include <linux/kvm_host.h>
+#include <linux/list.h>
 #include <linux/perf_event.h>
 #include <linux/perf/arm_pmu.h>
 #include <linux/uaccess.h>
@@ -14,6 +15,9 @@
 #include <kvm/arm_pmu.h>
 #include <kvm/arm_vgic.h>
 
+static LIST_HEAD(arm_pmus);
+static DEFINE_MUTEX(arm_pmus_lock);
+
 static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx);
 static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64 select_idx);
 static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, struct kvm_pmc *pmc);
@@ -742,9 +746,26 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
 
 void kvm_host_pmu_init(struct arm_pmu *pmu)
 {
-	if (pmu->pmuver != 0 && pmu->pmuver != ID_AA64DFR0_PMUVER_IMP_DEF &&
-	    !kvm_arm_support_pmu_v3() && !is_protected_kvm_enabled())
+	struct arm_pmu_entry *entry;
+
+	if (pmu->pmuver == 0 || pmu->pmuver == ID_AA64DFR0_PMUVER_IMP_DEF ||
+	    is_protected_kvm_enabled())
+		return;
+
+	mutex_lock(&arm_pmus_lock);
+
+	entry = kmalloc(sizeof(*entry), GFP_KERNEL);
+	if (!entry)
+		goto out_unlock;
+
+	if (list_empty(&arm_pmus))
 		static_branch_enable(&kvm_arm_pmu_available);
+
+	entry->arm_pmu = pmu;
+	list_add_tail(&entry->entry, &arm_pmus);
+
+out_unlock:
+	mutex_unlock(&arm_pmus_lock);
 }
 
 static int kvm_pmu_probe_pmuver(void)
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index 90f21898aad8..e249c5f172aa 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -36,6 +36,11 @@ struct kvm_pmu {
 	struct irq_work overflow_work;
 };
 
+struct arm_pmu_entry {
+	struct list_head entry;
+	struct arm_pmu *arm_pmu;
+};
+
 #define kvm_arm_pmu_irq_initialized(v)	((v)->arch.pmu.irq_num >= VGIC_NR_SGIS)
 u64 kvm_pmu_get_counter_value(struct kvm_vcpu *vcpu, u64 select_idx);
 void kvm_pmu_set_counter_value(struct kvm_vcpu *vcpu, u64 select_idx, u64 val);
-- 
2.34.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v3 2/4] KVM: arm64: Keep a list of probed PMUs
@ 2021-12-13 15:23   ` Alexandru Elisei
  0 siblings, 0 replies; 32+ messages in thread
From: Alexandru Elisei @ 2021-12-13 15:23 UTC (permalink / raw)
  To: maz, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm
  Cc: tglx, mingo, peter.maydell

The ARM PMU driver calls kvm_host_pmu_init() after probing to tell KVM that
a hardware PMU is available for guest emulation. Heterogeneous systems can
have more than one PMU present, and the callback gets called multiple
times, once for each of them. Keep track of all the PMUs available to KVM,
as they're going to be needed later.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arch/arm64/kvm/pmu-emul.c | 25 +++++++++++++++++++++++--
 include/kvm/arm_pmu.h     |  5 +++++
 2 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index a5e4bbf5e68f..eb4be96f144d 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -7,6 +7,7 @@
 #include <linux/cpu.h>
 #include <linux/kvm.h>
 #include <linux/kvm_host.h>
+#include <linux/list.h>
 #include <linux/perf_event.h>
 #include <linux/perf/arm_pmu.h>
 #include <linux/uaccess.h>
@@ -14,6 +15,9 @@
 #include <kvm/arm_pmu.h>
 #include <kvm/arm_vgic.h>
 
+static LIST_HEAD(arm_pmus);
+static DEFINE_MUTEX(arm_pmus_lock);
+
 static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx);
 static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64 select_idx);
 static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, struct kvm_pmc *pmc);
@@ -742,9 +746,26 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
 
 void kvm_host_pmu_init(struct arm_pmu *pmu)
 {
-	if (pmu->pmuver != 0 && pmu->pmuver != ID_AA64DFR0_PMUVER_IMP_DEF &&
-	    !kvm_arm_support_pmu_v3() && !is_protected_kvm_enabled())
+	struct arm_pmu_entry *entry;
+
+	if (pmu->pmuver == 0 || pmu->pmuver == ID_AA64DFR0_PMUVER_IMP_DEF ||
+	    is_protected_kvm_enabled())
+		return;
+
+	mutex_lock(&arm_pmus_lock);
+
+	entry = kmalloc(sizeof(*entry), GFP_KERNEL);
+	if (!entry)
+		goto out_unlock;
+
+	if (list_empty(&arm_pmus))
 		static_branch_enable(&kvm_arm_pmu_available);
+
+	entry->arm_pmu = pmu;
+	list_add_tail(&entry->entry, &arm_pmus);
+
+out_unlock:
+	mutex_unlock(&arm_pmus_lock);
 }
 
 static int kvm_pmu_probe_pmuver(void)
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index 90f21898aad8..e249c5f172aa 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -36,6 +36,11 @@ struct kvm_pmu {
 	struct irq_work overflow_work;
 };
 
+struct arm_pmu_entry {
+	struct list_head entry;
+	struct arm_pmu *arm_pmu;
+};
+
 #define kvm_arm_pmu_irq_initialized(v)	((v)->arch.pmu.irq_num >= VGIC_NR_SGIS)
 u64 kvm_pmu_get_counter_value(struct kvm_vcpu *vcpu, u64 select_idx);
 void kvm_pmu_set_counter_value(struct kvm_vcpu *vcpu, u64 select_idx, u64 val);
-- 
2.34.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v3 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
  2021-12-13 15:23 ` Alexandru Elisei
@ 2021-12-13 15:23   ` Alexandru Elisei
  -1 siblings, 0 replies; 32+ messages in thread
From: Alexandru Elisei @ 2021-12-13 15:23 UTC (permalink / raw)
  To: maz, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm
  Cc: tglx, mingo

When KVM creates an event and there are more than one PMUs present on the
system, perf_init_event() will go through the list of available PMUs and
will choose the first one that can create the event. The order of the PMUs
in the PMU list depends on the probe order, which can change under various
circumstances, for example if the order of the PMU nodes change in the DTB
or if asynchronous driver probing is enabled on the kernel command line
(with the driver_async_probe=armv8-pmu option).

Another consequence of this approach is that, on heteregeneous systems,
all virtual machines that KVM creates will use the same PMU. This might
cause unexpected behaviour for userspace: when a VCPU is executing on
the physical CPU that uses this PMU, PMU events in the guest work
correctly; but when the same VCPU executes on another CPU, PMU events in
the guest will suddenly stop counting.

Fortunately, perf core allows user to specify on which PMU to create an
event by using the perf_event_attr->type field, which is used by
perf_init_event() as an index in the radix tree of available PMUs.

Add the KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU
attribute to allow userspace to specify the arm_pmu that KVM will use when
creating events for that VCPU. KVM will make no attempt to run the VCPU on
the physical CPUs that share this PMU, leaving it up to userspace to
manage the VCPU threads' affinity accordingly.

Setting the PMU for a VCPU is an all of nothing affair to avoid exposing an
asymmetric system to the guest: either all VCPUs have the same PMU, either
none of the VCPUs have a PMU set. Attempting to do something in between
will result in an error being returned when doing KVM_ARM_VCPU_PMU_V3_INIT.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---

Checking that all VCPUs have the same PMU is done when the PMU is
initialized because setting the VCPU PMU is optional, and KVM cannot know
what the user intends until the KVM_ARM_VCPU_PMU_V3_INIT ioctl, which
prevents further changes to the VCPU PMU. vcpu->arch.pmu.created has been
changed to an atomic variable because changes to the VCPU PMU state now
need to be observable by all physical CPUs.

 Documentation/virt/kvm/devices/vcpu.rst | 30 ++++++++-
 arch/arm64/include/uapi/asm/kvm.h       |  1 +
 arch/arm64/kvm/pmu-emul.c               | 88 ++++++++++++++++++++-----
 include/kvm/arm_pmu.h                   |  4 +-
 tools/arch/arm64/include/uapi/asm/kvm.h |  1 +
 5 files changed, 104 insertions(+), 20 deletions(-)

diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
index 60a29972d3f1..b918669bf925 100644
--- a/Documentation/virt/kvm/devices/vcpu.rst
+++ b/Documentation/virt/kvm/devices/vcpu.rst
@@ -49,8 +49,8 @@ Returns:
 	 =======  ======================================================
 	 -EEXIST  Interrupt number already used
 	 -ENODEV  PMUv3 not supported or GIC not initialized
-	 -ENXIO   PMUv3 not supported, missing VCPU feature or interrupt
-		  number not set
+	 -ENXIO   PMUv3 not supported, missing VCPU feature, interrupt
+		  number not set or mismatched PMUs set
 	 -EBUSY   PMUv3 already initialized
 	 =======  ======================================================
 
@@ -104,6 +104,32 @@ hardware event. Filtering event 0x1E (CHAIN) has no effect either, as it
 isn't strictly speaking an event. Filtering the cycle counter is possible
 using event 0x11 (CPU_CYCLES).
 
+1.4 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_SET_PMU
+------------------------------------------
+
+:Parameters: in kvm_device_attr.addr the address to an int representing the PMU
+             identifier.
+
+:Returns:
+
+	 =======  ===============================================
+	 -EBUSY   PMUv3 already initialized
+	 -EFAULT  Error accessing the PMU identifier
+	 -ENXIO   PMU not found
+	 -ENODEV  PMUv3 not supported or GIC not initialized
+	 -ENOMEM  Could not allocate memory
+	 =======  ===============================================
+
+Request that the VCPU uses the specified hardware PMU when creating guest events
+for the purpose of PMU emulation. The PMU identifier can be read from the "type"
+file for the desired PMU instance under /sys/devices (or, equivalent,
+/sys/bus/even_source). This attribute is particularly useful on heterogeneous
+systems where there are at least two CPU PMUs on the system. All VCPUs must have
+the same PMU, otherwise KVM_ARM_VCPU_PMU_V3_INIT will fail.
+
+Note that KVM will not make any attempts to run the VCPU on the physical CPUs
+associated with the PMU specified by this attribute. This is entirely left to
+userspace.
 
 2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
 =================================
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index b3edde68bc3e..1d0a0a2a9711 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -362,6 +362,7 @@ struct kvm_arm_copy_mte_tags {
 #define   KVM_ARM_VCPU_PMU_V3_IRQ	0
 #define   KVM_ARM_VCPU_PMU_V3_INIT	1
 #define   KVM_ARM_VCPU_PMU_V3_FILTER	2
+#define   KVM_ARM_VCPU_PMU_V3_SET_PMU	3
 #define KVM_ARM_VCPU_TIMER_CTRL		1
 #define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER		0
 #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER		1
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index eb4be96f144d..8de38d7fa493 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -24,9 +24,16 @@ static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, struct kvm_pmc *pmc);
 
 #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
 
-static u32 kvm_pmu_event_mask(struct kvm *kvm)
+static u32 kvm_pmu_event_mask(struct kvm_vcpu *vcpu)
 {
-	switch (kvm->arch.pmuver) {
+	unsigned int pmuver;
+
+	if (vcpu->arch.pmu.arm_pmu)
+		pmuver = vcpu->arch.pmu.arm_pmu->pmuver;
+	else
+		pmuver = vcpu->kvm->arch.pmuver;
+
+	switch (pmuver) {
 	case ID_AA64DFR0_PMUVER_8_0:
 		return GENMASK(9, 0);
 	case ID_AA64DFR0_PMUVER_8_1:
@@ -34,7 +41,7 @@ static u32 kvm_pmu_event_mask(struct kvm *kvm)
 	case ID_AA64DFR0_PMUVER_8_5:
 		return GENMASK(15, 0);
 	default:		/* Shouldn't be here, just for sanity */
-		WARN_ONCE(1, "Unknown PMU version %d\n", kvm->arch.pmuver);
+		WARN_ONCE(1, "Unknown PMU version %d\n", pmuver);
 		return 0;
 	}
 }
@@ -119,7 +126,7 @@ static bool kvm_pmu_idx_has_chain_evtype(struct kvm_vcpu *vcpu, u64 select_idx)
 		return false;
 
 	reg = PMEVTYPER0_EL0 + select_idx;
-	eventsel = __vcpu_sys_reg(vcpu, reg) & kvm_pmu_event_mask(vcpu->kvm);
+	eventsel = __vcpu_sys_reg(vcpu, reg) & kvm_pmu_event_mask(vcpu);
 
 	return eventsel == ARMV8_PMUV3_PERFCTR_CHAIN;
 }
@@ -534,7 +541,7 @@ void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val)
 
 		/* PMSWINC only applies to ... SW_INC! */
 		type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i);
-		type &= kvm_pmu_event_mask(vcpu->kvm);
+		type &= kvm_pmu_event_mask(vcpu);
 		if (type != ARMV8_PMUV3_PERFCTR_SW_INCR)
 			continue;
 
@@ -602,6 +609,7 @@ static bool kvm_pmu_counter_is_enabled(struct kvm_vcpu *vcpu, u64 select_idx)
 static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
 {
 	struct kvm_pmu *pmu = &vcpu->arch.pmu;
+	struct arm_pmu *arm_pmu = pmu->arm_pmu;
 	struct kvm_pmc *pmc;
 	struct perf_event *event;
 	struct perf_event_attr attr;
@@ -622,7 +630,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
 	if (pmc->idx == ARMV8_PMU_CYCLE_IDX)
 		eventsel = ARMV8_PMUV3_PERFCTR_CPU_CYCLES;
 	else
-		eventsel = data & kvm_pmu_event_mask(vcpu->kvm);
+		eventsel = data & kvm_pmu_event_mask(vcpu);
 
 	/* Software increment event doesn't need to be backed by a perf event */
 	if (eventsel == ARMV8_PMUV3_PERFCTR_SW_INCR)
@@ -637,8 +645,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
 		return;
 
 	memset(&attr, 0, sizeof(struct perf_event_attr));
-	attr.type = PERF_TYPE_RAW;
-	attr.size = sizeof(attr);
+	attr.type = arm_pmu ? arm_pmu->pmu.type : PERF_TYPE_RAW;
 	attr.pinned = 1;
 	attr.disabled = !kvm_pmu_counter_is_enabled(vcpu, pmc->idx);
 	attr.exclude_user = data & ARMV8_PMU_EXCLUDE_EL0 ? 1 : 0;
@@ -733,7 +740,7 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
 
 	mask  =  ARMV8_PMU_EVTYPE_MASK;
 	mask &= ~ARMV8_PMU_EVTYPE_EVENT;
-	mask |= kvm_pmu_event_mask(vcpu->kvm);
+	mask |= kvm_pmu_event_mask(vcpu);
 
 	reg = (select_idx == ARMV8_PMU_CYCLE_IDX)
 	      ? PMCCFILTR_EL0 : PMEVTYPER0_EL0 + select_idx;
@@ -836,7 +843,7 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
 	if (!bmap)
 		return val;
 
-	nr_events = kvm_pmu_event_mask(vcpu->kvm) + 1;
+	nr_events = kvm_pmu_event_mask(vcpu) + 1;
 
 	for (i = 0; i < 32; i += 8) {
 		u64 byte;
@@ -857,7 +864,7 @@ int kvm_arm_pmu_v3_enable(struct kvm_vcpu *vcpu)
 	if (!kvm_vcpu_has_pmu(vcpu))
 		return 0;
 
-	if (!vcpu->arch.pmu.created)
+	if (!atomic_read(&vcpu->arch.pmu.created))
 		return -EINVAL;
 
 	/*
@@ -887,15 +894,20 @@ int kvm_arm_pmu_v3_enable(struct kvm_vcpu *vcpu)
 
 static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
 {
-	if (irqchip_in_kernel(vcpu->kvm)) {
-		int ret;
+	struct arm_pmu *arm_pmu = vcpu->arch.pmu.arm_pmu;
+	struct kvm *kvm = vcpu->kvm;
+	struct kvm_vcpu *v;
+	int ret = 0;
+	int i;
+
+	if (irqchip_in_kernel(kvm)) {
 
 		/*
 		 * If using the PMU with an in-kernel virtual GIC
 		 * implementation, we require the GIC to be already
 		 * initialized when initializing the PMU.
 		 */
-		if (!vgic_initialized(vcpu->kvm))
+		if (!vgic_initialized(kvm))
 			return -ENODEV;
 
 		if (!kvm_arm_pmu_irq_initialized(vcpu))
@@ -910,7 +922,16 @@ static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
 	init_irq_work(&vcpu->arch.pmu.overflow_work,
 		      kvm_pmu_perf_overflow_notify_vcpu);
 
-	vcpu->arch.pmu.created = true;
+	atomic_set(&vcpu->arch.pmu.created, 1);
+
+	kvm_for_each_vcpu(i, v, kvm) {
+		if (!atomic_read(&v->arch.pmu.created))
+			continue;
+
+		if (v->arch.pmu.arm_pmu != arm_pmu)
+			return -ENXIO;
+	}
+
 	return 0;
 }
 
@@ -940,12 +961,35 @@ static bool pmu_irq_is_valid(struct kvm *kvm, int irq)
 	return true;
 }
 
+static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
+{
+	struct kvm_pmu *kvm_pmu = &vcpu->arch.pmu;
+	struct arm_pmu_entry *entry;
+	struct arm_pmu *arm_pmu;
+	int ret = -ENXIO;
+
+	mutex_lock(&arm_pmus_lock);
+
+	list_for_each_entry(entry, &arm_pmus, entry) {
+		arm_pmu = entry->arm_pmu;
+		if (arm_pmu->pmu.type == pmu_id) {
+			kvm_pmu->arm_pmu = arm_pmu;
+			ret = 0;
+			goto out_unlock;
+		}
+	}
+
+out_unlock:
+	mutex_unlock(&arm_pmus_lock);
+	return ret;
+}
+
 int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
 {
 	if (!kvm_vcpu_has_pmu(vcpu))
 		return -ENODEV;
 
-	if (vcpu->arch.pmu.created)
+	if (atomic_read(&vcpu->arch.pmu.created))
 		return -EBUSY;
 
 	if (!vcpu->kvm->arch.pmuver)
@@ -984,7 +1028,7 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
 		struct kvm_pmu_event_filter filter;
 		int nr_events;
 
-		nr_events = kvm_pmu_event_mask(vcpu->kvm) + 1;
+		nr_events = kvm_pmu_event_mask(vcpu) + 1;
 
 		uaddr = (struct kvm_pmu_event_filter __user *)(long)attr->addr;
 
@@ -1026,6 +1070,15 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
 
 		return 0;
 	}
+	case KVM_ARM_VCPU_PMU_V3_SET_PMU: {
+		int __user *uaddr = (int __user *)(long)attr->addr;
+		int pmu_id;
+
+		if (get_user(pmu_id, uaddr))
+			return -EFAULT;
+
+		return kvm_arm_pmu_v3_set_pmu(vcpu, pmu_id);
+	}
 	case KVM_ARM_VCPU_PMU_V3_INIT:
 		return kvm_arm_pmu_v3_init(vcpu);
 	}
@@ -1063,6 +1116,7 @@ int kvm_arm_pmu_v3_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
 	case KVM_ARM_VCPU_PMU_V3_IRQ:
 	case KVM_ARM_VCPU_PMU_V3_INIT:
 	case KVM_ARM_VCPU_PMU_V3_FILTER:
+	case KVM_ARM_VCPU_PMU_V3_SET_PMU:
 		if (kvm_vcpu_has_pmu(vcpu))
 			return 0;
 	}
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index e249c5f172aa..892728f85b25 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -7,6 +7,7 @@
 #ifndef __ASM_ARM_KVM_PMU_H
 #define __ASM_ARM_KVM_PMU_H
 
+#include <linux/atomic.h>
 #include <linux/perf_event.h>
 #include <asm/perf_event.h>
 
@@ -31,9 +32,10 @@ struct kvm_pmu {
 	int irq_num;
 	struct kvm_pmc pmc[ARMV8_PMU_MAX_COUNTERS];
 	DECLARE_BITMAP(chained, ARMV8_PMU_MAX_COUNTER_PAIRS);
-	bool created;
+	atomic_t created;
 	bool irq_level;
 	struct irq_work overflow_work;
+	struct arm_pmu *arm_pmu;
 };
 
 struct arm_pmu_entry {
diff --git a/tools/arch/arm64/include/uapi/asm/kvm.h b/tools/arch/arm64/include/uapi/asm/kvm.h
index b3edde68bc3e..1d0a0a2a9711 100644
--- a/tools/arch/arm64/include/uapi/asm/kvm.h
+++ b/tools/arch/arm64/include/uapi/asm/kvm.h
@@ -362,6 +362,7 @@ struct kvm_arm_copy_mte_tags {
 #define   KVM_ARM_VCPU_PMU_V3_IRQ	0
 #define   KVM_ARM_VCPU_PMU_V3_INIT	1
 #define   KVM_ARM_VCPU_PMU_V3_FILTER	2
+#define   KVM_ARM_VCPU_PMU_V3_SET_PMU	3
 #define KVM_ARM_VCPU_TIMER_CTRL		1
 #define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER		0
 #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER		1
-- 
2.34.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v3 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
@ 2021-12-13 15:23   ` Alexandru Elisei
  0 siblings, 0 replies; 32+ messages in thread
From: Alexandru Elisei @ 2021-12-13 15:23 UTC (permalink / raw)
  To: maz, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm
  Cc: tglx, mingo, peter.maydell

When KVM creates an event and there are more than one PMUs present on the
system, perf_init_event() will go through the list of available PMUs and
will choose the first one that can create the event. The order of the PMUs
in the PMU list depends on the probe order, which can change under various
circumstances, for example if the order of the PMU nodes change in the DTB
or if asynchronous driver probing is enabled on the kernel command line
(with the driver_async_probe=armv8-pmu option).

Another consequence of this approach is that, on heteregeneous systems,
all virtual machines that KVM creates will use the same PMU. This might
cause unexpected behaviour for userspace: when a VCPU is executing on
the physical CPU that uses this PMU, PMU events in the guest work
correctly; but when the same VCPU executes on another CPU, PMU events in
the guest will suddenly stop counting.

Fortunately, perf core allows user to specify on which PMU to create an
event by using the perf_event_attr->type field, which is used by
perf_init_event() as an index in the radix tree of available PMUs.

Add the KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU
attribute to allow userspace to specify the arm_pmu that KVM will use when
creating events for that VCPU. KVM will make no attempt to run the VCPU on
the physical CPUs that share this PMU, leaving it up to userspace to
manage the VCPU threads' affinity accordingly.

Setting the PMU for a VCPU is an all of nothing affair to avoid exposing an
asymmetric system to the guest: either all VCPUs have the same PMU, either
none of the VCPUs have a PMU set. Attempting to do something in between
will result in an error being returned when doing KVM_ARM_VCPU_PMU_V3_INIT.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---

Checking that all VCPUs have the same PMU is done when the PMU is
initialized because setting the VCPU PMU is optional, and KVM cannot know
what the user intends until the KVM_ARM_VCPU_PMU_V3_INIT ioctl, which
prevents further changes to the VCPU PMU. vcpu->arch.pmu.created has been
changed to an atomic variable because changes to the VCPU PMU state now
need to be observable by all physical CPUs.

 Documentation/virt/kvm/devices/vcpu.rst | 30 ++++++++-
 arch/arm64/include/uapi/asm/kvm.h       |  1 +
 arch/arm64/kvm/pmu-emul.c               | 88 ++++++++++++++++++++-----
 include/kvm/arm_pmu.h                   |  4 +-
 tools/arch/arm64/include/uapi/asm/kvm.h |  1 +
 5 files changed, 104 insertions(+), 20 deletions(-)

diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
index 60a29972d3f1..b918669bf925 100644
--- a/Documentation/virt/kvm/devices/vcpu.rst
+++ b/Documentation/virt/kvm/devices/vcpu.rst
@@ -49,8 +49,8 @@ Returns:
 	 =======  ======================================================
 	 -EEXIST  Interrupt number already used
 	 -ENODEV  PMUv3 not supported or GIC not initialized
-	 -ENXIO   PMUv3 not supported, missing VCPU feature or interrupt
-		  number not set
+	 -ENXIO   PMUv3 not supported, missing VCPU feature, interrupt
+		  number not set or mismatched PMUs set
 	 -EBUSY   PMUv3 already initialized
 	 =======  ======================================================
 
@@ -104,6 +104,32 @@ hardware event. Filtering event 0x1E (CHAIN) has no effect either, as it
 isn't strictly speaking an event. Filtering the cycle counter is possible
 using event 0x11 (CPU_CYCLES).
 
+1.4 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_SET_PMU
+------------------------------------------
+
+:Parameters: in kvm_device_attr.addr the address to an int representing the PMU
+             identifier.
+
+:Returns:
+
+	 =======  ===============================================
+	 -EBUSY   PMUv3 already initialized
+	 -EFAULT  Error accessing the PMU identifier
+	 -ENXIO   PMU not found
+	 -ENODEV  PMUv3 not supported or GIC not initialized
+	 -ENOMEM  Could not allocate memory
+	 =======  ===============================================
+
+Request that the VCPU uses the specified hardware PMU when creating guest events
+for the purpose of PMU emulation. The PMU identifier can be read from the "type"
+file for the desired PMU instance under /sys/devices (or, equivalent,
+/sys/bus/even_source). This attribute is particularly useful on heterogeneous
+systems where there are at least two CPU PMUs on the system. All VCPUs must have
+the same PMU, otherwise KVM_ARM_VCPU_PMU_V3_INIT will fail.
+
+Note that KVM will not make any attempts to run the VCPU on the physical CPUs
+associated with the PMU specified by this attribute. This is entirely left to
+userspace.
 
 2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
 =================================
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index b3edde68bc3e..1d0a0a2a9711 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -362,6 +362,7 @@ struct kvm_arm_copy_mte_tags {
 #define   KVM_ARM_VCPU_PMU_V3_IRQ	0
 #define   KVM_ARM_VCPU_PMU_V3_INIT	1
 #define   KVM_ARM_VCPU_PMU_V3_FILTER	2
+#define   KVM_ARM_VCPU_PMU_V3_SET_PMU	3
 #define KVM_ARM_VCPU_TIMER_CTRL		1
 #define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER		0
 #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER		1
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index eb4be96f144d..8de38d7fa493 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -24,9 +24,16 @@ static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, struct kvm_pmc *pmc);
 
 #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
 
-static u32 kvm_pmu_event_mask(struct kvm *kvm)
+static u32 kvm_pmu_event_mask(struct kvm_vcpu *vcpu)
 {
-	switch (kvm->arch.pmuver) {
+	unsigned int pmuver;
+
+	if (vcpu->arch.pmu.arm_pmu)
+		pmuver = vcpu->arch.pmu.arm_pmu->pmuver;
+	else
+		pmuver = vcpu->kvm->arch.pmuver;
+
+	switch (pmuver) {
 	case ID_AA64DFR0_PMUVER_8_0:
 		return GENMASK(9, 0);
 	case ID_AA64DFR0_PMUVER_8_1:
@@ -34,7 +41,7 @@ static u32 kvm_pmu_event_mask(struct kvm *kvm)
 	case ID_AA64DFR0_PMUVER_8_5:
 		return GENMASK(15, 0);
 	default:		/* Shouldn't be here, just for sanity */
-		WARN_ONCE(1, "Unknown PMU version %d\n", kvm->arch.pmuver);
+		WARN_ONCE(1, "Unknown PMU version %d\n", pmuver);
 		return 0;
 	}
 }
@@ -119,7 +126,7 @@ static bool kvm_pmu_idx_has_chain_evtype(struct kvm_vcpu *vcpu, u64 select_idx)
 		return false;
 
 	reg = PMEVTYPER0_EL0 + select_idx;
-	eventsel = __vcpu_sys_reg(vcpu, reg) & kvm_pmu_event_mask(vcpu->kvm);
+	eventsel = __vcpu_sys_reg(vcpu, reg) & kvm_pmu_event_mask(vcpu);
 
 	return eventsel == ARMV8_PMUV3_PERFCTR_CHAIN;
 }
@@ -534,7 +541,7 @@ void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val)
 
 		/* PMSWINC only applies to ... SW_INC! */
 		type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i);
-		type &= kvm_pmu_event_mask(vcpu->kvm);
+		type &= kvm_pmu_event_mask(vcpu);
 		if (type != ARMV8_PMUV3_PERFCTR_SW_INCR)
 			continue;
 
@@ -602,6 +609,7 @@ static bool kvm_pmu_counter_is_enabled(struct kvm_vcpu *vcpu, u64 select_idx)
 static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
 {
 	struct kvm_pmu *pmu = &vcpu->arch.pmu;
+	struct arm_pmu *arm_pmu = pmu->arm_pmu;
 	struct kvm_pmc *pmc;
 	struct perf_event *event;
 	struct perf_event_attr attr;
@@ -622,7 +630,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
 	if (pmc->idx == ARMV8_PMU_CYCLE_IDX)
 		eventsel = ARMV8_PMUV3_PERFCTR_CPU_CYCLES;
 	else
-		eventsel = data & kvm_pmu_event_mask(vcpu->kvm);
+		eventsel = data & kvm_pmu_event_mask(vcpu);
 
 	/* Software increment event doesn't need to be backed by a perf event */
 	if (eventsel == ARMV8_PMUV3_PERFCTR_SW_INCR)
@@ -637,8 +645,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
 		return;
 
 	memset(&attr, 0, sizeof(struct perf_event_attr));
-	attr.type = PERF_TYPE_RAW;
-	attr.size = sizeof(attr);
+	attr.type = arm_pmu ? arm_pmu->pmu.type : PERF_TYPE_RAW;
 	attr.pinned = 1;
 	attr.disabled = !kvm_pmu_counter_is_enabled(vcpu, pmc->idx);
 	attr.exclude_user = data & ARMV8_PMU_EXCLUDE_EL0 ? 1 : 0;
@@ -733,7 +740,7 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
 
 	mask  =  ARMV8_PMU_EVTYPE_MASK;
 	mask &= ~ARMV8_PMU_EVTYPE_EVENT;
-	mask |= kvm_pmu_event_mask(vcpu->kvm);
+	mask |= kvm_pmu_event_mask(vcpu);
 
 	reg = (select_idx == ARMV8_PMU_CYCLE_IDX)
 	      ? PMCCFILTR_EL0 : PMEVTYPER0_EL0 + select_idx;
@@ -836,7 +843,7 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
 	if (!bmap)
 		return val;
 
-	nr_events = kvm_pmu_event_mask(vcpu->kvm) + 1;
+	nr_events = kvm_pmu_event_mask(vcpu) + 1;
 
 	for (i = 0; i < 32; i += 8) {
 		u64 byte;
@@ -857,7 +864,7 @@ int kvm_arm_pmu_v3_enable(struct kvm_vcpu *vcpu)
 	if (!kvm_vcpu_has_pmu(vcpu))
 		return 0;
 
-	if (!vcpu->arch.pmu.created)
+	if (!atomic_read(&vcpu->arch.pmu.created))
 		return -EINVAL;
 
 	/*
@@ -887,15 +894,20 @@ int kvm_arm_pmu_v3_enable(struct kvm_vcpu *vcpu)
 
 static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
 {
-	if (irqchip_in_kernel(vcpu->kvm)) {
-		int ret;
+	struct arm_pmu *arm_pmu = vcpu->arch.pmu.arm_pmu;
+	struct kvm *kvm = vcpu->kvm;
+	struct kvm_vcpu *v;
+	int ret = 0;
+	int i;
+
+	if (irqchip_in_kernel(kvm)) {
 
 		/*
 		 * If using the PMU with an in-kernel virtual GIC
 		 * implementation, we require the GIC to be already
 		 * initialized when initializing the PMU.
 		 */
-		if (!vgic_initialized(vcpu->kvm))
+		if (!vgic_initialized(kvm))
 			return -ENODEV;
 
 		if (!kvm_arm_pmu_irq_initialized(vcpu))
@@ -910,7 +922,16 @@ static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
 	init_irq_work(&vcpu->arch.pmu.overflow_work,
 		      kvm_pmu_perf_overflow_notify_vcpu);
 
-	vcpu->arch.pmu.created = true;
+	atomic_set(&vcpu->arch.pmu.created, 1);
+
+	kvm_for_each_vcpu(i, v, kvm) {
+		if (!atomic_read(&v->arch.pmu.created))
+			continue;
+
+		if (v->arch.pmu.arm_pmu != arm_pmu)
+			return -ENXIO;
+	}
+
 	return 0;
 }
 
@@ -940,12 +961,35 @@ static bool pmu_irq_is_valid(struct kvm *kvm, int irq)
 	return true;
 }
 
+static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
+{
+	struct kvm_pmu *kvm_pmu = &vcpu->arch.pmu;
+	struct arm_pmu_entry *entry;
+	struct arm_pmu *arm_pmu;
+	int ret = -ENXIO;
+
+	mutex_lock(&arm_pmus_lock);
+
+	list_for_each_entry(entry, &arm_pmus, entry) {
+		arm_pmu = entry->arm_pmu;
+		if (arm_pmu->pmu.type == pmu_id) {
+			kvm_pmu->arm_pmu = arm_pmu;
+			ret = 0;
+			goto out_unlock;
+		}
+	}
+
+out_unlock:
+	mutex_unlock(&arm_pmus_lock);
+	return ret;
+}
+
 int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
 {
 	if (!kvm_vcpu_has_pmu(vcpu))
 		return -ENODEV;
 
-	if (vcpu->arch.pmu.created)
+	if (atomic_read(&vcpu->arch.pmu.created))
 		return -EBUSY;
 
 	if (!vcpu->kvm->arch.pmuver)
@@ -984,7 +1028,7 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
 		struct kvm_pmu_event_filter filter;
 		int nr_events;
 
-		nr_events = kvm_pmu_event_mask(vcpu->kvm) + 1;
+		nr_events = kvm_pmu_event_mask(vcpu) + 1;
 
 		uaddr = (struct kvm_pmu_event_filter __user *)(long)attr->addr;
 
@@ -1026,6 +1070,15 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
 
 		return 0;
 	}
+	case KVM_ARM_VCPU_PMU_V3_SET_PMU: {
+		int __user *uaddr = (int __user *)(long)attr->addr;
+		int pmu_id;
+
+		if (get_user(pmu_id, uaddr))
+			return -EFAULT;
+
+		return kvm_arm_pmu_v3_set_pmu(vcpu, pmu_id);
+	}
 	case KVM_ARM_VCPU_PMU_V3_INIT:
 		return kvm_arm_pmu_v3_init(vcpu);
 	}
@@ -1063,6 +1116,7 @@ int kvm_arm_pmu_v3_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
 	case KVM_ARM_VCPU_PMU_V3_IRQ:
 	case KVM_ARM_VCPU_PMU_V3_INIT:
 	case KVM_ARM_VCPU_PMU_V3_FILTER:
+	case KVM_ARM_VCPU_PMU_V3_SET_PMU:
 		if (kvm_vcpu_has_pmu(vcpu))
 			return 0;
 	}
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index e249c5f172aa..892728f85b25 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -7,6 +7,7 @@
 #ifndef __ASM_ARM_KVM_PMU_H
 #define __ASM_ARM_KVM_PMU_H
 
+#include <linux/atomic.h>
 #include <linux/perf_event.h>
 #include <asm/perf_event.h>
 
@@ -31,9 +32,10 @@ struct kvm_pmu {
 	int irq_num;
 	struct kvm_pmc pmc[ARMV8_PMU_MAX_COUNTERS];
 	DECLARE_BITMAP(chained, ARMV8_PMU_MAX_COUNTER_PAIRS);
-	bool created;
+	atomic_t created;
 	bool irq_level;
 	struct irq_work overflow_work;
+	struct arm_pmu *arm_pmu;
 };
 
 struct arm_pmu_entry {
diff --git a/tools/arch/arm64/include/uapi/asm/kvm.h b/tools/arch/arm64/include/uapi/asm/kvm.h
index b3edde68bc3e..1d0a0a2a9711 100644
--- a/tools/arch/arm64/include/uapi/asm/kvm.h
+++ b/tools/arch/arm64/include/uapi/asm/kvm.h
@@ -362,6 +362,7 @@ struct kvm_arm_copy_mte_tags {
 #define   KVM_ARM_VCPU_PMU_V3_IRQ	0
 #define   KVM_ARM_VCPU_PMU_V3_INIT	1
 #define   KVM_ARM_VCPU_PMU_V3_FILTER	2
+#define   KVM_ARM_VCPU_PMU_V3_SET_PMU	3
 #define KVM_ARM_VCPU_TIMER_CTRL		1
 #define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER		0
 #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER		1
-- 
2.34.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v3 4/4] KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU
  2021-12-13 15:23 ` Alexandru Elisei
@ 2021-12-13 15:23   ` Alexandru Elisei
  -1 siblings, 0 replies; 32+ messages in thread
From: Alexandru Elisei @ 2021-12-13 15:23 UTC (permalink / raw)
  To: maz, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm
  Cc: tglx, mingo

Userspace can assign a PMU to a VCPU with the KVM_ARM_VCPU_PMU_V3_SET_PMU
device ioctl. If the VCPU is scheduled on a physical CPU which has a
different PMU, the perf events needed to emulate a guest PMU won't be
scheduled in and the guest performance counters will stop counting. Treat
it as an userspace error and refuse to run the VCPU in this situation.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 Documentation/virt/kvm/devices/vcpu.rst |  6 ++++-
 arch/arm64/include/asm/kvm_host.h       | 12 ++++++++++
 arch/arm64/include/uapi/asm/kvm.h       |  3 +++
 arch/arm64/kvm/arm.c                    | 29 +++++++++++++++++++++++--
 arch/arm64/kvm/pmu-emul.c               |  1 +
 5 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
index b918669bf925..dd8348879a8e 100644
--- a/Documentation/virt/kvm/devices/vcpu.rst
+++ b/Documentation/virt/kvm/devices/vcpu.rst
@@ -129,7 +129,11 @@ the same PMU, otherwise KVM_ARM_VCPU_PMU_V3_INIT will fail.
 
 Note that KVM will not make any attempts to run the VCPU on the physical CPUs
 associated with the PMU specified by this attribute. This is entirely left to
-userspace.
+userspace. However, attempting to run the VCPU on a physical CPU not supported
+by the PMU will fail and KVM_RUN will return with
+exit_reason = KVM_EXIT_FAIL_ENTRY and populate the fail_entry struct by setting
+hardare_entry_failure_reason field to KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED and
+the cpu field to the processor id.
 
 2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
 =================================
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 2a5f7f38006f..0c453f2e48b6 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -385,6 +385,8 @@ struct kvm_vcpu_arch {
 		u64 last_steal;
 		gpa_t base;
 	} steal;
+
+	cpumask_var_t supported_cpus;
 };
 
 /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
@@ -420,6 +422,7 @@ struct kvm_vcpu_arch {
 #define KVM_ARM64_EXCEPT_MASK		(7 << 9) /* Target EL/MODE */
 #define KVM_ARM64_DEBUG_STATE_SAVE_SPE	(1 << 12) /* Save SPE context if active  */
 #define KVM_ARM64_DEBUG_STATE_SAVE_TRBE	(1 << 13) /* Save TRBE context if active  */
+#define KVM_ARM64_ON_UNSUPPORTED_CPU	(1 << 14) /* Physical CPU not in supported_cpus */
 
 #define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \
 				 KVM_GUESTDBG_USE_SW_BP | \
@@ -460,6 +463,15 @@ struct kvm_vcpu_arch {
 #define vcpu_has_ptrauth(vcpu)		false
 #endif
 
+#define vcpu_on_unsupported_cpu(vcpu)					\
+	((vcpu)->arch.flags & KVM_ARM64_ON_UNSUPPORTED_CPU)
+
+#define vcpu_set_on_unsupported_cpu(vcpu)				\
+	((vcpu)->arch.flags |= KVM_ARM64_ON_UNSUPPORTED_CPU)
+
+#define vcpu_clear_on_unsupported_cpu(vcpu)				\
+	((vcpu)->arch.flags &= ~KVM_ARM64_ON_UNSUPPORTED_CPU)
+
 #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.regs)
 
 /*
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 1d0a0a2a9711..d49f714f48e6 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -414,6 +414,9 @@ struct kvm_arm_copy_mte_tags {
 #define KVM_PSCI_RET_INVAL		PSCI_RET_INVALID_PARAMS
 #define KVM_PSCI_RET_DENIED		PSCI_RET_DENIED
 
+/* run->fail_entry.hardware_entry_failure_reason codes. */
+#define KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED	(1ULL << 0)
+
 #endif
 
 #endif /* __ARM_KVM_H__ */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index e4727dc771bf..373e6a3d7221 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -327,6 +327,10 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 
 	vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
 
+	if (!zalloc_cpumask_var(&vcpu->arch.supported_cpus, GFP_KERNEL))
+		return -ENOMEM;
+	cpumask_copy(vcpu->arch.supported_cpus, cpu_possible_mask);
+
 	/* Set up the timer */
 	kvm_timer_vcpu_init(vcpu);
 
@@ -340,9 +344,16 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 
 	err = kvm_vgic_vcpu_init(vcpu);
 	if (err)
-		return err;
+		goto out_err;
+
+	err = create_hyp_mappings(vcpu, vcpu + 1, PAGE_HYP);
+	if (err)
+		goto out_err;
+	return 0;
 
-	return create_hyp_mappings(vcpu, vcpu + 1, PAGE_HYP);
+out_err:
+	free_cpumask_var(vcpu->arch.supported_cpus);
+	return err;
 }
 
 void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
@@ -354,6 +365,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
 	if (vcpu->arch.has_run_once && unlikely(!irqchip_in_kernel(vcpu->kvm)))
 		static_branch_dec(&userspace_irqchip_in_use);
 
+	free_cpumask_var(vcpu->arch.supported_cpus);
 	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
 	kvm_timer_vcpu_terminate(vcpu);
 	kvm_pmu_vcpu_destroy(vcpu);
@@ -432,6 +444,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	if (vcpu_has_ptrauth(vcpu))
 		vcpu_ptrauth_disable(vcpu);
 	kvm_arch_vcpu_load_debug_state_flags(vcpu);
+
+	if (!cpumask_test_cpu(smp_processor_id(), vcpu->arch.supported_cpus))
+		vcpu_set_on_unsupported_cpu(vcpu);
 }
 
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
@@ -444,6 +459,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 	kvm_vgic_put(vcpu);
 	kvm_vcpu_pmu_restore_host(vcpu);
 
+	vcpu_clear_on_unsupported_cpu(vcpu);
 	vcpu->cpu = -1;
 }
 
@@ -759,6 +775,15 @@ static bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu, int *ret)
 		}
 	}
 
+	if (unlikely(vcpu_on_unsupported_cpu(vcpu))) {
+		run->exit_reason = KVM_EXIT_FAIL_ENTRY;
+		run->fail_entry.hardware_entry_failure_reason
+			= KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED;
+		run->fail_entry.cpu = smp_processor_id();
+		*ret = 0;
+		return true;
+	}
+
 	return kvm_request_pending(vcpu) ||
 			need_new_vmid_gen(&vcpu->arch.hw_mmu->vmid) ||
 			xfer_to_guest_mode_work_pending();
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 8de38d7fa493..d0581e3258f0 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -974,6 +974,7 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
 		arm_pmu = entry->arm_pmu;
 		if (arm_pmu->pmu.type == pmu_id) {
 			kvm_pmu->arm_pmu = arm_pmu;
+			cpumask_copy(vcpu->arch.supported_cpus, &arm_pmu->supported_cpus);
 			ret = 0;
 			goto out_unlock;
 		}
-- 
2.34.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v3 4/4] KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU
@ 2021-12-13 15:23   ` Alexandru Elisei
  0 siblings, 0 replies; 32+ messages in thread
From: Alexandru Elisei @ 2021-12-13 15:23 UTC (permalink / raw)
  To: maz, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm
  Cc: tglx, mingo, peter.maydell

Userspace can assign a PMU to a VCPU with the KVM_ARM_VCPU_PMU_V3_SET_PMU
device ioctl. If the VCPU is scheduled on a physical CPU which has a
different PMU, the perf events needed to emulate a guest PMU won't be
scheduled in and the guest performance counters will stop counting. Treat
it as an userspace error and refuse to run the VCPU in this situation.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 Documentation/virt/kvm/devices/vcpu.rst |  6 ++++-
 arch/arm64/include/asm/kvm_host.h       | 12 ++++++++++
 arch/arm64/include/uapi/asm/kvm.h       |  3 +++
 arch/arm64/kvm/arm.c                    | 29 +++++++++++++++++++++++--
 arch/arm64/kvm/pmu-emul.c               |  1 +
 5 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
index b918669bf925..dd8348879a8e 100644
--- a/Documentation/virt/kvm/devices/vcpu.rst
+++ b/Documentation/virt/kvm/devices/vcpu.rst
@@ -129,7 +129,11 @@ the same PMU, otherwise KVM_ARM_VCPU_PMU_V3_INIT will fail.
 
 Note that KVM will not make any attempts to run the VCPU on the physical CPUs
 associated with the PMU specified by this attribute. This is entirely left to
-userspace.
+userspace. However, attempting to run the VCPU on a physical CPU not supported
+by the PMU will fail and KVM_RUN will return with
+exit_reason = KVM_EXIT_FAIL_ENTRY and populate the fail_entry struct by setting
+hardare_entry_failure_reason field to KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED and
+the cpu field to the processor id.
 
 2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
 =================================
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 2a5f7f38006f..0c453f2e48b6 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -385,6 +385,8 @@ struct kvm_vcpu_arch {
 		u64 last_steal;
 		gpa_t base;
 	} steal;
+
+	cpumask_var_t supported_cpus;
 };
 
 /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
@@ -420,6 +422,7 @@ struct kvm_vcpu_arch {
 #define KVM_ARM64_EXCEPT_MASK		(7 << 9) /* Target EL/MODE */
 #define KVM_ARM64_DEBUG_STATE_SAVE_SPE	(1 << 12) /* Save SPE context if active  */
 #define KVM_ARM64_DEBUG_STATE_SAVE_TRBE	(1 << 13) /* Save TRBE context if active  */
+#define KVM_ARM64_ON_UNSUPPORTED_CPU	(1 << 14) /* Physical CPU not in supported_cpus */
 
 #define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \
 				 KVM_GUESTDBG_USE_SW_BP | \
@@ -460,6 +463,15 @@ struct kvm_vcpu_arch {
 #define vcpu_has_ptrauth(vcpu)		false
 #endif
 
+#define vcpu_on_unsupported_cpu(vcpu)					\
+	((vcpu)->arch.flags & KVM_ARM64_ON_UNSUPPORTED_CPU)
+
+#define vcpu_set_on_unsupported_cpu(vcpu)				\
+	((vcpu)->arch.flags |= KVM_ARM64_ON_UNSUPPORTED_CPU)
+
+#define vcpu_clear_on_unsupported_cpu(vcpu)				\
+	((vcpu)->arch.flags &= ~KVM_ARM64_ON_UNSUPPORTED_CPU)
+
 #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.regs)
 
 /*
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 1d0a0a2a9711..d49f714f48e6 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -414,6 +414,9 @@ struct kvm_arm_copy_mte_tags {
 #define KVM_PSCI_RET_INVAL		PSCI_RET_INVALID_PARAMS
 #define KVM_PSCI_RET_DENIED		PSCI_RET_DENIED
 
+/* run->fail_entry.hardware_entry_failure_reason codes. */
+#define KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED	(1ULL << 0)
+
 #endif
 
 #endif /* __ARM_KVM_H__ */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index e4727dc771bf..373e6a3d7221 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -327,6 +327,10 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 
 	vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
 
+	if (!zalloc_cpumask_var(&vcpu->arch.supported_cpus, GFP_KERNEL))
+		return -ENOMEM;
+	cpumask_copy(vcpu->arch.supported_cpus, cpu_possible_mask);
+
 	/* Set up the timer */
 	kvm_timer_vcpu_init(vcpu);
 
@@ -340,9 +344,16 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 
 	err = kvm_vgic_vcpu_init(vcpu);
 	if (err)
-		return err;
+		goto out_err;
+
+	err = create_hyp_mappings(vcpu, vcpu + 1, PAGE_HYP);
+	if (err)
+		goto out_err;
+	return 0;
 
-	return create_hyp_mappings(vcpu, vcpu + 1, PAGE_HYP);
+out_err:
+	free_cpumask_var(vcpu->arch.supported_cpus);
+	return err;
 }
 
 void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
@@ -354,6 +365,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
 	if (vcpu->arch.has_run_once && unlikely(!irqchip_in_kernel(vcpu->kvm)))
 		static_branch_dec(&userspace_irqchip_in_use);
 
+	free_cpumask_var(vcpu->arch.supported_cpus);
 	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
 	kvm_timer_vcpu_terminate(vcpu);
 	kvm_pmu_vcpu_destroy(vcpu);
@@ -432,6 +444,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	if (vcpu_has_ptrauth(vcpu))
 		vcpu_ptrauth_disable(vcpu);
 	kvm_arch_vcpu_load_debug_state_flags(vcpu);
+
+	if (!cpumask_test_cpu(smp_processor_id(), vcpu->arch.supported_cpus))
+		vcpu_set_on_unsupported_cpu(vcpu);
 }
 
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
@@ -444,6 +459,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 	kvm_vgic_put(vcpu);
 	kvm_vcpu_pmu_restore_host(vcpu);
 
+	vcpu_clear_on_unsupported_cpu(vcpu);
 	vcpu->cpu = -1;
 }
 
@@ -759,6 +775,15 @@ static bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu, int *ret)
 		}
 	}
 
+	if (unlikely(vcpu_on_unsupported_cpu(vcpu))) {
+		run->exit_reason = KVM_EXIT_FAIL_ENTRY;
+		run->fail_entry.hardware_entry_failure_reason
+			= KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED;
+		run->fail_entry.cpu = smp_processor_id();
+		*ret = 0;
+		return true;
+	}
+
 	return kvm_request_pending(vcpu) ||
 			need_new_vmid_gen(&vcpu->arch.hw_mmu->vmid) ||
 			xfer_to_guest_mode_work_pending();
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 8de38d7fa493..d0581e3258f0 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -974,6 +974,7 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
 		arm_pmu = entry->arm_pmu;
 		if (arm_pmu->pmu.type == pmu_id) {
 			kvm_pmu->arm_pmu = arm_pmu;
+			cpumask_copy(vcpu->arch.supported_cpus, &arm_pmu->supported_cpus);
 			ret = 0;
 			goto out_unlock;
 		}
-- 
2.34.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 2/4] KVM: arm64: Keep a list of probed PMUs
  2021-12-13 15:23   ` Alexandru Elisei
@ 2021-12-14  7:23     ` Reiji Watanabe
  -1 siblings, 0 replies; 32+ messages in thread
From: Reiji Watanabe @ 2021-12-14  7:23 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: maz, mingo, tglx, will, kvmarm, linux-arm-kernel

Hi Alex,

On Mon, Dec 13, 2021 at 7:23 AM Alexandru Elisei
<alexandru.elisei@arm.com> wrote:
>
> The ARM PMU driver calls kvm_host_pmu_init() after probing to tell KVM that
> a hardware PMU is available for guest emulation. Heterogeneous systems can
> have more than one PMU present, and the callback gets called multiple
> times, once for each of them. Keep track of all the PMUs available to KVM,
> as they're going to be needed later.
>
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
>  arch/arm64/kvm/pmu-emul.c | 25 +++++++++++++++++++++++--
>  include/kvm/arm_pmu.h     |  5 +++++
>  2 files changed, 28 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index a5e4bbf5e68f..eb4be96f144d 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -7,6 +7,7 @@
>  #include <linux/cpu.h>
>  #include <linux/kvm.h>
>  #include <linux/kvm_host.h>
> +#include <linux/list.h>
>  #include <linux/perf_event.h>
>  #include <linux/perf/arm_pmu.h>
>  #include <linux/uaccess.h>
> @@ -14,6 +15,9 @@
>  #include <kvm/arm_pmu.h>
>  #include <kvm/arm_vgic.h>
>
> +static LIST_HEAD(arm_pmus);
> +static DEFINE_MUTEX(arm_pmus_lock);
> +
>  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx);
>  static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64 select_idx);
>  static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, struct kvm_pmc *pmc);
> @@ -742,9 +746,26 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
>
>  void kvm_host_pmu_init(struct arm_pmu *pmu)
>  {
> -       if (pmu->pmuver != 0 && pmu->pmuver != ID_AA64DFR0_PMUVER_IMP_DEF &&
> -           !kvm_arm_support_pmu_v3() && !is_protected_kvm_enabled())
> +       struct arm_pmu_entry *entry;
> +
> +       if (pmu->pmuver == 0 || pmu->pmuver == ID_AA64DFR0_PMUVER_IMP_DEF ||
> +           is_protected_kvm_enabled())
> +               return;
> +
> +       mutex_lock(&arm_pmus_lock);
> +
> +       entry = kmalloc(sizeof(*entry), GFP_KERNEL);
> +       if (!entry)
> +               goto out_unlock;

It might be better to get the lock after the kmalloc above is done ?
(the kmalloc might sleep, which will make the code hold the lock longer)
I don't think the current code will cause any problem though.

Reviewed-by: Reiji Watanabe <reijiw@google.com>

Thanks,
Reiji


> +
> +       if (list_empty(&arm_pmus))
>                 static_branch_enable(&kvm_arm_pmu_available);
> +
> +       entry->arm_pmu = pmu;
> +       list_add_tail(&entry->entry, &arm_pmus);
> +
> +out_unlock:
> +       mutex_unlock(&arm_pmus_lock);
>  }
>
>  static int kvm_pmu_probe_pmuver(void)
> diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
> index 90f21898aad8..e249c5f172aa 100644
> --- a/include/kvm/arm_pmu.h
> +++ b/include/kvm/arm_pmu.h
> @@ -36,6 +36,11 @@ struct kvm_pmu {
>         struct irq_work overflow_work;
>  };
>
> +struct arm_pmu_entry {
> +       struct list_head entry;
> +       struct arm_pmu *arm_pmu;
> +};
> +
>  #define kvm_arm_pmu_irq_initialized(v) ((v)->arch.pmu.irq_num >= VGIC_NR_SGIS)
>  u64 kvm_pmu_get_counter_value(struct kvm_vcpu *vcpu, u64 select_idx);
>  void kvm_pmu_set_counter_value(struct kvm_vcpu *vcpu, u64 select_idx, u64 val);
> --
> 2.34.1
>
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 2/4] KVM: arm64: Keep a list of probed PMUs
@ 2021-12-14  7:23     ` Reiji Watanabe
  0 siblings, 0 replies; 32+ messages in thread
From: Reiji Watanabe @ 2021-12-14  7:23 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: maz, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm, tglx, mingo

Hi Alex,

On Mon, Dec 13, 2021 at 7:23 AM Alexandru Elisei
<alexandru.elisei@arm.com> wrote:
>
> The ARM PMU driver calls kvm_host_pmu_init() after probing to tell KVM that
> a hardware PMU is available for guest emulation. Heterogeneous systems can
> have more than one PMU present, and the callback gets called multiple
> times, once for each of them. Keep track of all the PMUs available to KVM,
> as they're going to be needed later.
>
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
>  arch/arm64/kvm/pmu-emul.c | 25 +++++++++++++++++++++++--
>  include/kvm/arm_pmu.h     |  5 +++++
>  2 files changed, 28 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index a5e4bbf5e68f..eb4be96f144d 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -7,6 +7,7 @@
>  #include <linux/cpu.h>
>  #include <linux/kvm.h>
>  #include <linux/kvm_host.h>
> +#include <linux/list.h>
>  #include <linux/perf_event.h>
>  #include <linux/perf/arm_pmu.h>
>  #include <linux/uaccess.h>
> @@ -14,6 +15,9 @@
>  #include <kvm/arm_pmu.h>
>  #include <kvm/arm_vgic.h>
>
> +static LIST_HEAD(arm_pmus);
> +static DEFINE_MUTEX(arm_pmus_lock);
> +
>  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx);
>  static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64 select_idx);
>  static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, struct kvm_pmc *pmc);
> @@ -742,9 +746,26 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
>
>  void kvm_host_pmu_init(struct arm_pmu *pmu)
>  {
> -       if (pmu->pmuver != 0 && pmu->pmuver != ID_AA64DFR0_PMUVER_IMP_DEF &&
> -           !kvm_arm_support_pmu_v3() && !is_protected_kvm_enabled())
> +       struct arm_pmu_entry *entry;
> +
> +       if (pmu->pmuver == 0 || pmu->pmuver == ID_AA64DFR0_PMUVER_IMP_DEF ||
> +           is_protected_kvm_enabled())
> +               return;
> +
> +       mutex_lock(&arm_pmus_lock);
> +
> +       entry = kmalloc(sizeof(*entry), GFP_KERNEL);
> +       if (!entry)
> +               goto out_unlock;

It might be better to get the lock after the kmalloc above is done ?
(the kmalloc might sleep, which will make the code hold the lock longer)
I don't think the current code will cause any problem though.

Reviewed-by: Reiji Watanabe <reijiw@google.com>

Thanks,
Reiji


> +
> +       if (list_empty(&arm_pmus))
>                 static_branch_enable(&kvm_arm_pmu_available);
> +
> +       entry->arm_pmu = pmu;
> +       list_add_tail(&entry->entry, &arm_pmus);
> +
> +out_unlock:
> +       mutex_unlock(&arm_pmus_lock);
>  }
>
>  static int kvm_pmu_probe_pmuver(void)
> diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
> index 90f21898aad8..e249c5f172aa 100644
> --- a/include/kvm/arm_pmu.h
> +++ b/include/kvm/arm_pmu.h
> @@ -36,6 +36,11 @@ struct kvm_pmu {
>         struct irq_work overflow_work;
>  };
>
> +struct arm_pmu_entry {
> +       struct list_head entry;
> +       struct arm_pmu *arm_pmu;
> +};
> +
>  #define kvm_arm_pmu_irq_initialized(v) ((v)->arch.pmu.irq_num >= VGIC_NR_SGIS)
>  u64 kvm_pmu_get_counter_value(struct kvm_vcpu *vcpu, u64 select_idx);
>  void kvm_pmu_set_counter_value(struct kvm_vcpu *vcpu, u64 select_idx, u64 val);
> --
> 2.34.1
>
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
  2021-12-13 15:23   ` Alexandru Elisei
@ 2021-12-14 12:28     ` Marc Zyngier
  -1 siblings, 0 replies; 32+ messages in thread
From: Marc Zyngier @ 2021-12-14 12:28 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: mingo, tglx, will, kvmarm, linux-arm-kernel

On Mon, 13 Dec 2021 15:23:08 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> When KVM creates an event and there are more than one PMUs present on the
> system, perf_init_event() will go through the list of available PMUs and
> will choose the first one that can create the event. The order of the PMUs
> in the PMU list depends on the probe order, which can change under various
> circumstances, for example if the order of the PMU nodes change in the DTB
> or if asynchronous driver probing is enabled on the kernel command line
> (with the driver_async_probe=armv8-pmu option).
> 
> Another consequence of this approach is that, on heteregeneous systems,
> all virtual machines that KVM creates will use the same PMU. This might
> cause unexpected behaviour for userspace: when a VCPU is executing on
> the physical CPU that uses this PMU, PMU events in the guest work
> correctly; but when the same VCPU executes on another CPU, PMU events in
> the guest will suddenly stop counting.
> 
> Fortunately, perf core allows user to specify on which PMU to create an
> event by using the perf_event_attr->type field, which is used by
> perf_init_event() as an index in the radix tree of available PMUs.
> 
> Add the KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU
> attribute to allow userspace to specify the arm_pmu that KVM will use when
> creating events for that VCPU. KVM will make no attempt to run the VCPU on
> the physical CPUs that share this PMU, leaving it up to userspace to
> manage the VCPU threads' affinity accordingly.
> 
> Setting the PMU for a VCPU is an all of nothing affair to avoid exposing an
> asymmetric system to the guest: either all VCPUs have the same PMU, either
> none of the VCPUs have a PMU set. Attempting to do something in between
> will result in an error being returned when doing KVM_ARM_VCPU_PMU_V3_INIT.
> 
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
> 
> Checking that all VCPUs have the same PMU is done when the PMU is
> initialized because setting the VCPU PMU is optional, and KVM cannot know
> what the user intends until the KVM_ARM_VCPU_PMU_V3_INIT ioctl, which
> prevents further changes to the VCPU PMU. vcpu->arch.pmu.created has been
> changed to an atomic variable because changes to the VCPU PMU state now
> need to be observable by all physical CPUs.
> 
>  Documentation/virt/kvm/devices/vcpu.rst | 30 ++++++++-
>  arch/arm64/include/uapi/asm/kvm.h       |  1 +
>  arch/arm64/kvm/pmu-emul.c               | 88 ++++++++++++++++++++-----
>  include/kvm/arm_pmu.h                   |  4 +-
>  tools/arch/arm64/include/uapi/asm/kvm.h |  1 +
>  5 files changed, 104 insertions(+), 20 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
> index 60a29972d3f1..b918669bf925 100644
> --- a/Documentation/virt/kvm/devices/vcpu.rst
> +++ b/Documentation/virt/kvm/devices/vcpu.rst
> @@ -49,8 +49,8 @@ Returns:
>  	 =======  ======================================================
>  	 -EEXIST  Interrupt number already used
>  	 -ENODEV  PMUv3 not supported or GIC not initialized
> -	 -ENXIO   PMUv3 not supported, missing VCPU feature or interrupt
> -		  number not set
> +	 -ENXIO   PMUv3 not supported, missing VCPU feature, interrupt
> +		  number not set or mismatched PMUs set
>  	 -EBUSY   PMUv3 already initialized
>  	 =======  ======================================================
>  
> @@ -104,6 +104,32 @@ hardware event. Filtering event 0x1E (CHAIN) has no effect either, as it
>  isn't strictly speaking an event. Filtering the cycle counter is possible
>  using event 0x11 (CPU_CYCLES).
>  
> +1.4 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_SET_PMU
> +------------------------------------------
> +
> +:Parameters: in kvm_device_attr.addr the address to an int representing the PMU
> +             identifier.
> +
> +:Returns:
> +
> +	 =======  ===============================================
> +	 -EBUSY   PMUv3 already initialized
> +	 -EFAULT  Error accessing the PMU identifier
> +	 -ENXIO   PMU not found
> +	 -ENODEV  PMUv3 not supported or GIC not initialized
> +	 -ENOMEM  Could not allocate memory
> +	 =======  ===============================================
> +
> +Request that the VCPU uses the specified hardware PMU when creating guest events
> +for the purpose of PMU emulation. The PMU identifier can be read from the "type"
> +file for the desired PMU instance under /sys/devices (or, equivalent,
> +/sys/bus/even_source). This attribute is particularly useful on heterogeneous
> +systems where there are at least two CPU PMUs on the system. All VCPUs must have
> +the same PMU, otherwise KVM_ARM_VCPU_PMU_V3_INIT will fail.
> +
> +Note that KVM will not make any attempts to run the VCPU on the physical CPUs
> +associated with the PMU specified by this attribute. This is entirely left to
> +userspace.
>  
>  2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
>  =================================
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index b3edde68bc3e..1d0a0a2a9711 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -362,6 +362,7 @@ struct kvm_arm_copy_mte_tags {
>  #define   KVM_ARM_VCPU_PMU_V3_IRQ	0
>  #define   KVM_ARM_VCPU_PMU_V3_INIT	1
>  #define   KVM_ARM_VCPU_PMU_V3_FILTER	2
> +#define   KVM_ARM_VCPU_PMU_V3_SET_PMU	3
>  #define KVM_ARM_VCPU_TIMER_CTRL		1
>  #define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER		0
>  #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER		1
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index eb4be96f144d..8de38d7fa493 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -24,9 +24,16 @@ static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, struct kvm_pmc *pmc);
>  
>  #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
>  
> -static u32 kvm_pmu_event_mask(struct kvm *kvm)
> +static u32 kvm_pmu_event_mask(struct kvm_vcpu *vcpu)
>  {
> -	switch (kvm->arch.pmuver) {
> +	unsigned int pmuver;
> +
> +	if (vcpu->arch.pmu.arm_pmu)
> +		pmuver = vcpu->arch.pmu.arm_pmu->pmuver;
> +	else
> +		pmuver = vcpu->kvm->arch.pmuver;

This puzzles me throughout the whole patch. Why is the arm_pmu pointer
a per-CPU thing? I would absolutely expect it to be stored in the kvm
structure, making the whole thing much simpler.

> +
> +	switch (pmuver) {
>  	case ID_AA64DFR0_PMUVER_8_0:
>  		return GENMASK(9, 0);
>  	case ID_AA64DFR0_PMUVER_8_1:
> @@ -34,7 +41,7 @@ static u32 kvm_pmu_event_mask(struct kvm *kvm)
>  	case ID_AA64DFR0_PMUVER_8_5:
>  		return GENMASK(15, 0);
>  	default:		/* Shouldn't be here, just for sanity */
> -		WARN_ONCE(1, "Unknown PMU version %d\n", kvm->arch.pmuver);
> +		WARN_ONCE(1, "Unknown PMU version %d\n", pmuver);
>  		return 0;
>  	}
>  }
> @@ -119,7 +126,7 @@ static bool kvm_pmu_idx_has_chain_evtype(struct kvm_vcpu *vcpu, u64 select_idx)
>  		return false;
>  
>  	reg = PMEVTYPER0_EL0 + select_idx;
> -	eventsel = __vcpu_sys_reg(vcpu, reg) & kvm_pmu_event_mask(vcpu->kvm);
> +	eventsel = __vcpu_sys_reg(vcpu, reg) & kvm_pmu_event_mask(vcpu);
>  
>  	return eventsel == ARMV8_PMUV3_PERFCTR_CHAIN;
>  }
> @@ -534,7 +541,7 @@ void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val)
>  
>  		/* PMSWINC only applies to ... SW_INC! */
>  		type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i);
> -		type &= kvm_pmu_event_mask(vcpu->kvm);
> +		type &= kvm_pmu_event_mask(vcpu);
>  		if (type != ARMV8_PMUV3_PERFCTR_SW_INCR)
>  			continue;
>  
> @@ -602,6 +609,7 @@ static bool kvm_pmu_counter_is_enabled(struct kvm_vcpu *vcpu, u64 select_idx)
>  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
>  {
>  	struct kvm_pmu *pmu = &vcpu->arch.pmu;
> +	struct arm_pmu *arm_pmu = pmu->arm_pmu;
>  	struct kvm_pmc *pmc;
>  	struct perf_event *event;
>  	struct perf_event_attr attr;
> @@ -622,7 +630,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
>  	if (pmc->idx == ARMV8_PMU_CYCLE_IDX)
>  		eventsel = ARMV8_PMUV3_PERFCTR_CPU_CYCLES;
>  	else
> -		eventsel = data & kvm_pmu_event_mask(vcpu->kvm);
> +		eventsel = data & kvm_pmu_event_mask(vcpu);
>  
>  	/* Software increment event doesn't need to be backed by a perf event */
>  	if (eventsel == ARMV8_PMUV3_PERFCTR_SW_INCR)
> @@ -637,8 +645,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
>  		return;
>  
>  	memset(&attr, 0, sizeof(struct perf_event_attr));
> -	attr.type = PERF_TYPE_RAW;
> -	attr.size = sizeof(attr);

Why is this line removed?

> +	attr.type = arm_pmu ? arm_pmu->pmu.type : PERF_TYPE_RAW;
>  	attr.pinned = 1;
>  	attr.disabled = !kvm_pmu_counter_is_enabled(vcpu, pmc->idx);
>  	attr.exclude_user = data & ARMV8_PMU_EXCLUDE_EL0 ? 1 : 0;
> @@ -733,7 +740,7 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
>  
>  	mask  =  ARMV8_PMU_EVTYPE_MASK;
>  	mask &= ~ARMV8_PMU_EVTYPE_EVENT;
> -	mask |= kvm_pmu_event_mask(vcpu->kvm);
> +	mask |= kvm_pmu_event_mask(vcpu);
>  
>  	reg = (select_idx == ARMV8_PMU_CYCLE_IDX)
>  	      ? PMCCFILTR_EL0 : PMEVTYPER0_EL0 + select_idx;
> @@ -836,7 +843,7 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
>  	if (!bmap)
>  		return val;
>  
> -	nr_events = kvm_pmu_event_mask(vcpu->kvm) + 1;
> +	nr_events = kvm_pmu_event_mask(vcpu) + 1;
>  
>  	for (i = 0; i < 32; i += 8) {
>  		u64 byte;
> @@ -857,7 +864,7 @@ int kvm_arm_pmu_v3_enable(struct kvm_vcpu *vcpu)
>  	if (!kvm_vcpu_has_pmu(vcpu))
>  		return 0;
>  
> -	if (!vcpu->arch.pmu.created)
> +	if (!atomic_read(&vcpu->arch.pmu.created))
>  		return -EINVAL;
>  
>  	/*
> @@ -887,15 +894,20 @@ int kvm_arm_pmu_v3_enable(struct kvm_vcpu *vcpu)
>  
>  static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
>  {
> -	if (irqchip_in_kernel(vcpu->kvm)) {
> -		int ret;
> +	struct arm_pmu *arm_pmu = vcpu->arch.pmu.arm_pmu;
> +	struct kvm *kvm = vcpu->kvm;
> +	struct kvm_vcpu *v;
> +	int ret = 0;
> +	int i;
> +
> +	if (irqchip_in_kernel(kvm)) {
>  
>  		/*
>  		 * If using the PMU with an in-kernel virtual GIC
>  		 * implementation, we require the GIC to be already
>  		 * initialized when initializing the PMU.
>  		 */
> -		if (!vgic_initialized(vcpu->kvm))
> +		if (!vgic_initialized(kvm))
>  			return -ENODEV;
>  
>  		if (!kvm_arm_pmu_irq_initialized(vcpu))
> @@ -910,7 +922,16 @@ static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
>  	init_irq_work(&vcpu->arch.pmu.overflow_work,
>  		      kvm_pmu_perf_overflow_notify_vcpu);
>  
> -	vcpu->arch.pmu.created = true;
> +	atomic_set(&vcpu->arch.pmu.created, 1);
> +
> +	kvm_for_each_vcpu(i, v, kvm) {
> +		if (!atomic_read(&v->arch.pmu.created))
> +			continue;
> +
> +		if (v->arch.pmu.arm_pmu != arm_pmu)
> +			return -ENXIO;
> +	}

If you did store the arm_pmu at the VM level, you wouldn't need this.
You could detect the discrepancy in the set_pmu ioctl.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
@ 2021-12-14 12:28     ` Marc Zyngier
  0 siblings, 0 replies; 32+ messages in thread
From: Marc Zyngier @ 2021-12-14 12:28 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm, tglx, mingo, peter.maydell

On Mon, 13 Dec 2021 15:23:08 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> When KVM creates an event and there are more than one PMUs present on the
> system, perf_init_event() will go through the list of available PMUs and
> will choose the first one that can create the event. The order of the PMUs
> in the PMU list depends on the probe order, which can change under various
> circumstances, for example if the order of the PMU nodes change in the DTB
> or if asynchronous driver probing is enabled on the kernel command line
> (with the driver_async_probe=armv8-pmu option).
> 
> Another consequence of this approach is that, on heteregeneous systems,
> all virtual machines that KVM creates will use the same PMU. This might
> cause unexpected behaviour for userspace: when a VCPU is executing on
> the physical CPU that uses this PMU, PMU events in the guest work
> correctly; but when the same VCPU executes on another CPU, PMU events in
> the guest will suddenly stop counting.
> 
> Fortunately, perf core allows user to specify on which PMU to create an
> event by using the perf_event_attr->type field, which is used by
> perf_init_event() as an index in the radix tree of available PMUs.
> 
> Add the KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU
> attribute to allow userspace to specify the arm_pmu that KVM will use when
> creating events for that VCPU. KVM will make no attempt to run the VCPU on
> the physical CPUs that share this PMU, leaving it up to userspace to
> manage the VCPU threads' affinity accordingly.
> 
> Setting the PMU for a VCPU is an all of nothing affair to avoid exposing an
> asymmetric system to the guest: either all VCPUs have the same PMU, either
> none of the VCPUs have a PMU set. Attempting to do something in between
> will result in an error being returned when doing KVM_ARM_VCPU_PMU_V3_INIT.
> 
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
> 
> Checking that all VCPUs have the same PMU is done when the PMU is
> initialized because setting the VCPU PMU is optional, and KVM cannot know
> what the user intends until the KVM_ARM_VCPU_PMU_V3_INIT ioctl, which
> prevents further changes to the VCPU PMU. vcpu->arch.pmu.created has been
> changed to an atomic variable because changes to the VCPU PMU state now
> need to be observable by all physical CPUs.
> 
>  Documentation/virt/kvm/devices/vcpu.rst | 30 ++++++++-
>  arch/arm64/include/uapi/asm/kvm.h       |  1 +
>  arch/arm64/kvm/pmu-emul.c               | 88 ++++++++++++++++++++-----
>  include/kvm/arm_pmu.h                   |  4 +-
>  tools/arch/arm64/include/uapi/asm/kvm.h |  1 +
>  5 files changed, 104 insertions(+), 20 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
> index 60a29972d3f1..b918669bf925 100644
> --- a/Documentation/virt/kvm/devices/vcpu.rst
> +++ b/Documentation/virt/kvm/devices/vcpu.rst
> @@ -49,8 +49,8 @@ Returns:
>  	 =======  ======================================================
>  	 -EEXIST  Interrupt number already used
>  	 -ENODEV  PMUv3 not supported or GIC not initialized
> -	 -ENXIO   PMUv3 not supported, missing VCPU feature or interrupt
> -		  number not set
> +	 -ENXIO   PMUv3 not supported, missing VCPU feature, interrupt
> +		  number not set or mismatched PMUs set
>  	 -EBUSY   PMUv3 already initialized
>  	 =======  ======================================================
>  
> @@ -104,6 +104,32 @@ hardware event. Filtering event 0x1E (CHAIN) has no effect either, as it
>  isn't strictly speaking an event. Filtering the cycle counter is possible
>  using event 0x11 (CPU_CYCLES).
>  
> +1.4 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_SET_PMU
> +------------------------------------------
> +
> +:Parameters: in kvm_device_attr.addr the address to an int representing the PMU
> +             identifier.
> +
> +:Returns:
> +
> +	 =======  ===============================================
> +	 -EBUSY   PMUv3 already initialized
> +	 -EFAULT  Error accessing the PMU identifier
> +	 -ENXIO   PMU not found
> +	 -ENODEV  PMUv3 not supported or GIC not initialized
> +	 -ENOMEM  Could not allocate memory
> +	 =======  ===============================================
> +
> +Request that the VCPU uses the specified hardware PMU when creating guest events
> +for the purpose of PMU emulation. The PMU identifier can be read from the "type"
> +file for the desired PMU instance under /sys/devices (or, equivalent,
> +/sys/bus/even_source). This attribute is particularly useful on heterogeneous
> +systems where there are at least two CPU PMUs on the system. All VCPUs must have
> +the same PMU, otherwise KVM_ARM_VCPU_PMU_V3_INIT will fail.
> +
> +Note that KVM will not make any attempts to run the VCPU on the physical CPUs
> +associated with the PMU specified by this attribute. This is entirely left to
> +userspace.
>  
>  2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
>  =================================
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index b3edde68bc3e..1d0a0a2a9711 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -362,6 +362,7 @@ struct kvm_arm_copy_mte_tags {
>  #define   KVM_ARM_VCPU_PMU_V3_IRQ	0
>  #define   KVM_ARM_VCPU_PMU_V3_INIT	1
>  #define   KVM_ARM_VCPU_PMU_V3_FILTER	2
> +#define   KVM_ARM_VCPU_PMU_V3_SET_PMU	3
>  #define KVM_ARM_VCPU_TIMER_CTRL		1
>  #define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER		0
>  #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER		1
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index eb4be96f144d..8de38d7fa493 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -24,9 +24,16 @@ static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, struct kvm_pmc *pmc);
>  
>  #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
>  
> -static u32 kvm_pmu_event_mask(struct kvm *kvm)
> +static u32 kvm_pmu_event_mask(struct kvm_vcpu *vcpu)
>  {
> -	switch (kvm->arch.pmuver) {
> +	unsigned int pmuver;
> +
> +	if (vcpu->arch.pmu.arm_pmu)
> +		pmuver = vcpu->arch.pmu.arm_pmu->pmuver;
> +	else
> +		pmuver = vcpu->kvm->arch.pmuver;

This puzzles me throughout the whole patch. Why is the arm_pmu pointer
a per-CPU thing? I would absolutely expect it to be stored in the kvm
structure, making the whole thing much simpler.

> +
> +	switch (pmuver) {
>  	case ID_AA64DFR0_PMUVER_8_0:
>  		return GENMASK(9, 0);
>  	case ID_AA64DFR0_PMUVER_8_1:
> @@ -34,7 +41,7 @@ static u32 kvm_pmu_event_mask(struct kvm *kvm)
>  	case ID_AA64DFR0_PMUVER_8_5:
>  		return GENMASK(15, 0);
>  	default:		/* Shouldn't be here, just for sanity */
> -		WARN_ONCE(1, "Unknown PMU version %d\n", kvm->arch.pmuver);
> +		WARN_ONCE(1, "Unknown PMU version %d\n", pmuver);
>  		return 0;
>  	}
>  }
> @@ -119,7 +126,7 @@ static bool kvm_pmu_idx_has_chain_evtype(struct kvm_vcpu *vcpu, u64 select_idx)
>  		return false;
>  
>  	reg = PMEVTYPER0_EL0 + select_idx;
> -	eventsel = __vcpu_sys_reg(vcpu, reg) & kvm_pmu_event_mask(vcpu->kvm);
> +	eventsel = __vcpu_sys_reg(vcpu, reg) & kvm_pmu_event_mask(vcpu);
>  
>  	return eventsel == ARMV8_PMUV3_PERFCTR_CHAIN;
>  }
> @@ -534,7 +541,7 @@ void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val)
>  
>  		/* PMSWINC only applies to ... SW_INC! */
>  		type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i);
> -		type &= kvm_pmu_event_mask(vcpu->kvm);
> +		type &= kvm_pmu_event_mask(vcpu);
>  		if (type != ARMV8_PMUV3_PERFCTR_SW_INCR)
>  			continue;
>  
> @@ -602,6 +609,7 @@ static bool kvm_pmu_counter_is_enabled(struct kvm_vcpu *vcpu, u64 select_idx)
>  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
>  {
>  	struct kvm_pmu *pmu = &vcpu->arch.pmu;
> +	struct arm_pmu *arm_pmu = pmu->arm_pmu;
>  	struct kvm_pmc *pmc;
>  	struct perf_event *event;
>  	struct perf_event_attr attr;
> @@ -622,7 +630,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
>  	if (pmc->idx == ARMV8_PMU_CYCLE_IDX)
>  		eventsel = ARMV8_PMUV3_PERFCTR_CPU_CYCLES;
>  	else
> -		eventsel = data & kvm_pmu_event_mask(vcpu->kvm);
> +		eventsel = data & kvm_pmu_event_mask(vcpu);
>  
>  	/* Software increment event doesn't need to be backed by a perf event */
>  	if (eventsel == ARMV8_PMUV3_PERFCTR_SW_INCR)
> @@ -637,8 +645,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
>  		return;
>  
>  	memset(&attr, 0, sizeof(struct perf_event_attr));
> -	attr.type = PERF_TYPE_RAW;
> -	attr.size = sizeof(attr);

Why is this line removed?

> +	attr.type = arm_pmu ? arm_pmu->pmu.type : PERF_TYPE_RAW;
>  	attr.pinned = 1;
>  	attr.disabled = !kvm_pmu_counter_is_enabled(vcpu, pmc->idx);
>  	attr.exclude_user = data & ARMV8_PMU_EXCLUDE_EL0 ? 1 : 0;
> @@ -733,7 +740,7 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
>  
>  	mask  =  ARMV8_PMU_EVTYPE_MASK;
>  	mask &= ~ARMV8_PMU_EVTYPE_EVENT;
> -	mask |= kvm_pmu_event_mask(vcpu->kvm);
> +	mask |= kvm_pmu_event_mask(vcpu);
>  
>  	reg = (select_idx == ARMV8_PMU_CYCLE_IDX)
>  	      ? PMCCFILTR_EL0 : PMEVTYPER0_EL0 + select_idx;
> @@ -836,7 +843,7 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
>  	if (!bmap)
>  		return val;
>  
> -	nr_events = kvm_pmu_event_mask(vcpu->kvm) + 1;
> +	nr_events = kvm_pmu_event_mask(vcpu) + 1;
>  
>  	for (i = 0; i < 32; i += 8) {
>  		u64 byte;
> @@ -857,7 +864,7 @@ int kvm_arm_pmu_v3_enable(struct kvm_vcpu *vcpu)
>  	if (!kvm_vcpu_has_pmu(vcpu))
>  		return 0;
>  
> -	if (!vcpu->arch.pmu.created)
> +	if (!atomic_read(&vcpu->arch.pmu.created))
>  		return -EINVAL;
>  
>  	/*
> @@ -887,15 +894,20 @@ int kvm_arm_pmu_v3_enable(struct kvm_vcpu *vcpu)
>  
>  static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
>  {
> -	if (irqchip_in_kernel(vcpu->kvm)) {
> -		int ret;
> +	struct arm_pmu *arm_pmu = vcpu->arch.pmu.arm_pmu;
> +	struct kvm *kvm = vcpu->kvm;
> +	struct kvm_vcpu *v;
> +	int ret = 0;
> +	int i;
> +
> +	if (irqchip_in_kernel(kvm)) {
>  
>  		/*
>  		 * If using the PMU with an in-kernel virtual GIC
>  		 * implementation, we require the GIC to be already
>  		 * initialized when initializing the PMU.
>  		 */
> -		if (!vgic_initialized(vcpu->kvm))
> +		if (!vgic_initialized(kvm))
>  			return -ENODEV;
>  
>  		if (!kvm_arm_pmu_irq_initialized(vcpu))
> @@ -910,7 +922,16 @@ static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
>  	init_irq_work(&vcpu->arch.pmu.overflow_work,
>  		      kvm_pmu_perf_overflow_notify_vcpu);
>  
> -	vcpu->arch.pmu.created = true;
> +	atomic_set(&vcpu->arch.pmu.created, 1);
> +
> +	kvm_for_each_vcpu(i, v, kvm) {
> +		if (!atomic_read(&v->arch.pmu.created))
> +			continue;
> +
> +		if (v->arch.pmu.arm_pmu != arm_pmu)
> +			return -ENXIO;
> +	}

If you did store the arm_pmu at the VM level, you wouldn't need this.
You could detect the discrepancy in the set_pmu ioctl.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 2/4] KVM: arm64: Keep a list of probed PMUs
  2021-12-13 15:23   ` Alexandru Elisei
@ 2021-12-14 12:30     ` Marc Zyngier
  -1 siblings, 0 replies; 32+ messages in thread
From: Marc Zyngier @ 2021-12-14 12:30 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: mingo, tglx, will, kvmarm, linux-arm-kernel

On Mon, 13 Dec 2021 15:23:07 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> The ARM PMU driver calls kvm_host_pmu_init() after probing to tell KVM that
> a hardware PMU is available for guest emulation. Heterogeneous systems can
> have more than one PMU present, and the callback gets called multiple
> times, once for each of them. Keep track of all the PMUs available to KVM,
> as they're going to be needed later.
> 
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
>  arch/arm64/kvm/pmu-emul.c | 25 +++++++++++++++++++++++--
>  include/kvm/arm_pmu.h     |  5 +++++
>  2 files changed, 28 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index a5e4bbf5e68f..eb4be96f144d 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -7,6 +7,7 @@
>  #include <linux/cpu.h>
>  #include <linux/kvm.h>
>  #include <linux/kvm_host.h>
> +#include <linux/list.h>
>  #include <linux/perf_event.h>
>  #include <linux/perf/arm_pmu.h>
>  #include <linux/uaccess.h>
> @@ -14,6 +15,9 @@
>  #include <kvm/arm_pmu.h>
>  #include <kvm/arm_vgic.h>
>  
> +static LIST_HEAD(arm_pmus);
> +static DEFINE_MUTEX(arm_pmus_lock);
> +
>  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx);
>  static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64 select_idx);
>  static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, struct kvm_pmc *pmc);
> @@ -742,9 +746,26 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
>  
>  void kvm_host_pmu_init(struct arm_pmu *pmu)
>  {
> -	if (pmu->pmuver != 0 && pmu->pmuver != ID_AA64DFR0_PMUVER_IMP_DEF &&
> -	    !kvm_arm_support_pmu_v3() && !is_protected_kvm_enabled())
> +	struct arm_pmu_entry *entry;
> +
> +	if (pmu->pmuver == 0 || pmu->pmuver == ID_AA64DFR0_PMUVER_IMP_DEF ||
> +	    is_protected_kvm_enabled())
> +		return;
> +
> +	mutex_lock(&arm_pmus_lock);
> +
> +	entry = kmalloc(sizeof(*entry), GFP_KERNEL);
> +	if (!entry)
> +		goto out_unlock;
> +
> +	if (list_empty(&arm_pmus))
>  		static_branch_enable(&kvm_arm_pmu_available);

I find it slightly dodgy that you switch the static key before
actually populating the entry. I'd suggest moving it after the
list_add_tail(), and check on list_is_singular() instead.

> +
> +	entry->arm_pmu = pmu;
> +	list_add_tail(&entry->entry, &arm_pmus);
> +
> +out_unlock:
> +	mutex_unlock(&arm_pmus_lock);
>  }
>  
>  static int kvm_pmu_probe_pmuver(void)
> diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
> index 90f21898aad8..e249c5f172aa 100644
> --- a/include/kvm/arm_pmu.h
> +++ b/include/kvm/arm_pmu.h
> @@ -36,6 +36,11 @@ struct kvm_pmu {
>  	struct irq_work overflow_work;
>  };
>  
> +struct arm_pmu_entry {
> +	struct list_head entry;
> +	struct arm_pmu *arm_pmu;
> +};
> +
>  #define kvm_arm_pmu_irq_initialized(v)	((v)->arch.pmu.irq_num >= VGIC_NR_SGIS)
>  u64 kvm_pmu_get_counter_value(struct kvm_vcpu *vcpu, u64 select_idx);
>  void kvm_pmu_set_counter_value(struct kvm_vcpu *vcpu, u64 select_idx, u64 val);

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 2/4] KVM: arm64: Keep a list of probed PMUs
@ 2021-12-14 12:30     ` Marc Zyngier
  0 siblings, 0 replies; 32+ messages in thread
From: Marc Zyngier @ 2021-12-14 12:30 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm, tglx, mingo, peter.maydell

On Mon, 13 Dec 2021 15:23:07 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> The ARM PMU driver calls kvm_host_pmu_init() after probing to tell KVM that
> a hardware PMU is available for guest emulation. Heterogeneous systems can
> have more than one PMU present, and the callback gets called multiple
> times, once for each of them. Keep track of all the PMUs available to KVM,
> as they're going to be needed later.
> 
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
>  arch/arm64/kvm/pmu-emul.c | 25 +++++++++++++++++++++++--
>  include/kvm/arm_pmu.h     |  5 +++++
>  2 files changed, 28 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index a5e4bbf5e68f..eb4be96f144d 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -7,6 +7,7 @@
>  #include <linux/cpu.h>
>  #include <linux/kvm.h>
>  #include <linux/kvm_host.h>
> +#include <linux/list.h>
>  #include <linux/perf_event.h>
>  #include <linux/perf/arm_pmu.h>
>  #include <linux/uaccess.h>
> @@ -14,6 +15,9 @@
>  #include <kvm/arm_pmu.h>
>  #include <kvm/arm_vgic.h>
>  
> +static LIST_HEAD(arm_pmus);
> +static DEFINE_MUTEX(arm_pmus_lock);
> +
>  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx);
>  static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64 select_idx);
>  static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, struct kvm_pmc *pmc);
> @@ -742,9 +746,26 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
>  
>  void kvm_host_pmu_init(struct arm_pmu *pmu)
>  {
> -	if (pmu->pmuver != 0 && pmu->pmuver != ID_AA64DFR0_PMUVER_IMP_DEF &&
> -	    !kvm_arm_support_pmu_v3() && !is_protected_kvm_enabled())
> +	struct arm_pmu_entry *entry;
> +
> +	if (pmu->pmuver == 0 || pmu->pmuver == ID_AA64DFR0_PMUVER_IMP_DEF ||
> +	    is_protected_kvm_enabled())
> +		return;
> +
> +	mutex_lock(&arm_pmus_lock);
> +
> +	entry = kmalloc(sizeof(*entry), GFP_KERNEL);
> +	if (!entry)
> +		goto out_unlock;
> +
> +	if (list_empty(&arm_pmus))
>  		static_branch_enable(&kvm_arm_pmu_available);

I find it slightly dodgy that you switch the static key before
actually populating the entry. I'd suggest moving it after the
list_add_tail(), and check on list_is_singular() instead.

> +
> +	entry->arm_pmu = pmu;
> +	list_add_tail(&entry->entry, &arm_pmus);
> +
> +out_unlock:
> +	mutex_unlock(&arm_pmus_lock);
>  }
>  
>  static int kvm_pmu_probe_pmuver(void)
> diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
> index 90f21898aad8..e249c5f172aa 100644
> --- a/include/kvm/arm_pmu.h
> +++ b/include/kvm/arm_pmu.h
> @@ -36,6 +36,11 @@ struct kvm_pmu {
>  	struct irq_work overflow_work;
>  };
>  
> +struct arm_pmu_entry {
> +	struct list_head entry;
> +	struct arm_pmu *arm_pmu;
> +};
> +
>  #define kvm_arm_pmu_irq_initialized(v)	((v)->arch.pmu.irq_num >= VGIC_NR_SGIS)
>  u64 kvm_pmu_get_counter_value(struct kvm_vcpu *vcpu, u64 select_idx);
>  void kvm_pmu_set_counter_value(struct kvm_vcpu *vcpu, u64 select_idx, u64 val);

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 0/4] KVM: arm64: Improve PMU support on heterogeneous systems
  2021-12-13 15:23 ` Alexandru Elisei
@ 2021-12-30 20:01   ` Marc Zyngier
  -1 siblings, 0 replies; 32+ messages in thread
From: Marc Zyngier @ 2021-12-30 20:01 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: mingo, tglx, will, kvmarm, linux-arm-kernel

Alex,

On Mon, 13 Dec 2021 15:23:05 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> (CC'ing Peter Maydell in case this might be of interest to qemu)
> 
> The series can be found on a branch at [1], and the kvmtool support at [2].
> The kvmtool patches are also on the mailing list [3] and haven't changed
> since v1.
> 
> Detailed explanation of the issue and symptoms that the patches attempt to
> correct can be found in the cover letter for v1 [4].
> 
> A summary of the problem is that on heterogeneous systems KVM will always
> use the same PMU for creating the VCPU events for *all* VCPUs regardless of
> the physical CPU on which the VCPU is running, leading to events suddenly
> stopping and resuming in the guest as the VCPU thread gets migrated across
> different CPUs.
> 
> This series proposes to fix this behaviour by allowing the user to specify
> which physical PMU is used when creating the VCPU events needed for guest
> PMU emulation. When the PMU is set, KVM will refuse to the VCPU on a
> physical which is not part of the supported CPUs for the specified PMU. The
> restriction is that all VCPUs must use the same PMU to avoid emulating an
> asymmetric platform.
> 
> The default behaviour stays the same - without userspace setting the PMU,
> events will stop counting if the VCPU is scheduled on the wrong CPU.
> 
> Tested with a hacked version of kvmtool that does the PMU initialization
> from the VCPU thread as opposed to from the main thread. Tested on
> rockpro64 by testing what happens when all VCPUs having the same PMU, one
> random VCPU having a different PMU than the other VCPUs and one random VCPU
> not having the PMU set (each test was run 1,000 times on the little cores
> and 1,000 times on the big cores).
> 
> Also tested on an Altra by testing all VCPUs having the same PMU, all VCPUs
> not having a PMU set, and one random VCPU not having the PMU set; the VM
> had 64 threads in each of the tests and each test was run 10,000 times.

Came back to this series, and found more problems. On top of the
remarks I had earlier (the per-CPU data structures that really should
per VM, the disappearing attribute size), what happens when event
filters are already registered and that you set a specific PMU?

I took the matter in my own hands (the joy of being in quarantine) and
wrote whatever fixes I thought were necessary[1].

Please have a look.

	M.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/pmu-bl

-- 
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 0/4] KVM: arm64: Improve PMU support on heterogeneous systems
@ 2021-12-30 20:01   ` Marc Zyngier
  0 siblings, 0 replies; 32+ messages in thread
From: Marc Zyngier @ 2021-12-30 20:01 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm, tglx, mingo, peter.maydell

Alex,

On Mon, 13 Dec 2021 15:23:05 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> (CC'ing Peter Maydell in case this might be of interest to qemu)
> 
> The series can be found on a branch at [1], and the kvmtool support at [2].
> The kvmtool patches are also on the mailing list [3] and haven't changed
> since v1.
> 
> Detailed explanation of the issue and symptoms that the patches attempt to
> correct can be found in the cover letter for v1 [4].
> 
> A summary of the problem is that on heterogeneous systems KVM will always
> use the same PMU for creating the VCPU events for *all* VCPUs regardless of
> the physical CPU on which the VCPU is running, leading to events suddenly
> stopping and resuming in the guest as the VCPU thread gets migrated across
> different CPUs.
> 
> This series proposes to fix this behaviour by allowing the user to specify
> which physical PMU is used when creating the VCPU events needed for guest
> PMU emulation. When the PMU is set, KVM will refuse to the VCPU on a
> physical which is not part of the supported CPUs for the specified PMU. The
> restriction is that all VCPUs must use the same PMU to avoid emulating an
> asymmetric platform.
> 
> The default behaviour stays the same - without userspace setting the PMU,
> events will stop counting if the VCPU is scheduled on the wrong CPU.
> 
> Tested with a hacked version of kvmtool that does the PMU initialization
> from the VCPU thread as opposed to from the main thread. Tested on
> rockpro64 by testing what happens when all VCPUs having the same PMU, one
> random VCPU having a different PMU than the other VCPUs and one random VCPU
> not having the PMU set (each test was run 1,000 times on the little cores
> and 1,000 times on the big cores).
> 
> Also tested on an Altra by testing all VCPUs having the same PMU, all VCPUs
> not having a PMU set, and one random VCPU not having the PMU set; the VM
> had 64 threads in each of the tests and each test was run 10,000 times.

Came back to this series, and found more problems. On top of the
remarks I had earlier (the per-CPU data structures that really should
per VM, the disappearing attribute size), what happens when event
filters are already registered and that you set a specific PMU?

I took the matter in my own hands (the joy of being in quarantine) and
wrote whatever fixes I thought were necessary[1].

Please have a look.

	M.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/pmu-bl

-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 2/4] KVM: arm64: Keep a list of probed PMUs
  2021-12-14 12:30     ` Marc Zyngier
@ 2022-01-06 11:46       ` Alexandru Elisei
  -1 siblings, 0 replies; 32+ messages in thread
From: Alexandru Elisei @ 2022-01-06 11:46 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: mingo, tglx, will, kvmarm, linux-arm-kernel

Hi Marc,

Sorry for the long silence, I didn't manage to get to your comments before
going on holiday.

On Tue, Dec 14, 2021 at 12:30:30PM +0000, Marc Zyngier wrote:
> On Mon, 13 Dec 2021 15:23:07 +0000,
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > 
> > The ARM PMU driver calls kvm_host_pmu_init() after probing to tell KVM that
> > a hardware PMU is available for guest emulation. Heterogeneous systems can
> > have more than one PMU present, and the callback gets called multiple
> > times, once for each of them. Keep track of all the PMUs available to KVM,
> > as they're going to be needed later.
> > 
> > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > ---
> >  arch/arm64/kvm/pmu-emul.c | 25 +++++++++++++++++++++++--
> >  include/kvm/arm_pmu.h     |  5 +++++
> >  2 files changed, 28 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> > index a5e4bbf5e68f..eb4be96f144d 100644
> > --- a/arch/arm64/kvm/pmu-emul.c
> > +++ b/arch/arm64/kvm/pmu-emul.c
> > @@ -7,6 +7,7 @@
> >  #include <linux/cpu.h>
> >  #include <linux/kvm.h>
> >  #include <linux/kvm_host.h>
> > +#include <linux/list.h>
> >  #include <linux/perf_event.h>
> >  #include <linux/perf/arm_pmu.h>
> >  #include <linux/uaccess.h>
> > @@ -14,6 +15,9 @@
> >  #include <kvm/arm_pmu.h>
> >  #include <kvm/arm_vgic.h>
> >  
> > +static LIST_HEAD(arm_pmus);
> > +static DEFINE_MUTEX(arm_pmus_lock);
> > +
> >  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx);
> >  static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64 select_idx);
> >  static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, struct kvm_pmc *pmc);
> > @@ -742,9 +746,26 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
> >  
> >  void kvm_host_pmu_init(struct arm_pmu *pmu)
> >  {
> > -	if (pmu->pmuver != 0 && pmu->pmuver != ID_AA64DFR0_PMUVER_IMP_DEF &&
> > -	    !kvm_arm_support_pmu_v3() && !is_protected_kvm_enabled())
> > +	struct arm_pmu_entry *entry;
> > +
> > +	if (pmu->pmuver == 0 || pmu->pmuver == ID_AA64DFR0_PMUVER_IMP_DEF ||
> > +	    is_protected_kvm_enabled())
> > +		return;
> > +
> > +	mutex_lock(&arm_pmus_lock);
> > +
> > +	entry = kmalloc(sizeof(*entry), GFP_KERNEL);
> > +	if (!entry)
> > +		goto out_unlock;
> > +
> > +	if (list_empty(&arm_pmus))
> >  		static_branch_enable(&kvm_arm_pmu_available);
> 
> I find it slightly dodgy that you switch the static key before
> actually populating the entry. I'd suggest moving it after the
> list_add_tail(), and check on list_is_singular() instead.

That's better, will do.

Thanks,
Alex

> 
> > +
> > +	entry->arm_pmu = pmu;
> > +	list_add_tail(&entry->entry, &arm_pmus);
> > +
> > +out_unlock:
> > +	mutex_unlock(&arm_pmus_lock);
> >  }
> >  
> >  static int kvm_pmu_probe_pmuver(void)
> > diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
> > index 90f21898aad8..e249c5f172aa 100644
> > --- a/include/kvm/arm_pmu.h
> > +++ b/include/kvm/arm_pmu.h
> > @@ -36,6 +36,11 @@ struct kvm_pmu {
> >  	struct irq_work overflow_work;
> >  };
> >  
> > +struct arm_pmu_entry {
> > +	struct list_head entry;
> > +	struct arm_pmu *arm_pmu;
> > +};
> > +
> >  #define kvm_arm_pmu_irq_initialized(v)	((v)->arch.pmu.irq_num >= VGIC_NR_SGIS)
> >  u64 kvm_pmu_get_counter_value(struct kvm_vcpu *vcpu, u64 select_idx);
> >  void kvm_pmu_set_counter_value(struct kvm_vcpu *vcpu, u64 select_idx, u64 val);
> 
> Thanks,
> 
> 	M.
> 
> -- 
> Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 2/4] KVM: arm64: Keep a list of probed PMUs
@ 2022-01-06 11:46       ` Alexandru Elisei
  0 siblings, 0 replies; 32+ messages in thread
From: Alexandru Elisei @ 2022-01-06 11:46 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm, tglx, mingo, peter.maydell

Hi Marc,

Sorry for the long silence, I didn't manage to get to your comments before
going on holiday.

On Tue, Dec 14, 2021 at 12:30:30PM +0000, Marc Zyngier wrote:
> On Mon, 13 Dec 2021 15:23:07 +0000,
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > 
> > The ARM PMU driver calls kvm_host_pmu_init() after probing to tell KVM that
> > a hardware PMU is available for guest emulation. Heterogeneous systems can
> > have more than one PMU present, and the callback gets called multiple
> > times, once for each of them. Keep track of all the PMUs available to KVM,
> > as they're going to be needed later.
> > 
> > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > ---
> >  arch/arm64/kvm/pmu-emul.c | 25 +++++++++++++++++++++++--
> >  include/kvm/arm_pmu.h     |  5 +++++
> >  2 files changed, 28 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> > index a5e4bbf5e68f..eb4be96f144d 100644
> > --- a/arch/arm64/kvm/pmu-emul.c
> > +++ b/arch/arm64/kvm/pmu-emul.c
> > @@ -7,6 +7,7 @@
> >  #include <linux/cpu.h>
> >  #include <linux/kvm.h>
> >  #include <linux/kvm_host.h>
> > +#include <linux/list.h>
> >  #include <linux/perf_event.h>
> >  #include <linux/perf/arm_pmu.h>
> >  #include <linux/uaccess.h>
> > @@ -14,6 +15,9 @@
> >  #include <kvm/arm_pmu.h>
> >  #include <kvm/arm_vgic.h>
> >  
> > +static LIST_HEAD(arm_pmus);
> > +static DEFINE_MUTEX(arm_pmus_lock);
> > +
> >  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx);
> >  static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64 select_idx);
> >  static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, struct kvm_pmc *pmc);
> > @@ -742,9 +746,26 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
> >  
> >  void kvm_host_pmu_init(struct arm_pmu *pmu)
> >  {
> > -	if (pmu->pmuver != 0 && pmu->pmuver != ID_AA64DFR0_PMUVER_IMP_DEF &&
> > -	    !kvm_arm_support_pmu_v3() && !is_protected_kvm_enabled())
> > +	struct arm_pmu_entry *entry;
> > +
> > +	if (pmu->pmuver == 0 || pmu->pmuver == ID_AA64DFR0_PMUVER_IMP_DEF ||
> > +	    is_protected_kvm_enabled())
> > +		return;
> > +
> > +	mutex_lock(&arm_pmus_lock);
> > +
> > +	entry = kmalloc(sizeof(*entry), GFP_KERNEL);
> > +	if (!entry)
> > +		goto out_unlock;
> > +
> > +	if (list_empty(&arm_pmus))
> >  		static_branch_enable(&kvm_arm_pmu_available);
> 
> I find it slightly dodgy that you switch the static key before
> actually populating the entry. I'd suggest moving it after the
> list_add_tail(), and check on list_is_singular() instead.

That's better, will do.

Thanks,
Alex

> 
> > +
> > +	entry->arm_pmu = pmu;
> > +	list_add_tail(&entry->entry, &arm_pmus);
> > +
> > +out_unlock:
> > +	mutex_unlock(&arm_pmus_lock);
> >  }
> >  
> >  static int kvm_pmu_probe_pmuver(void)
> > diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
> > index 90f21898aad8..e249c5f172aa 100644
> > --- a/include/kvm/arm_pmu.h
> > +++ b/include/kvm/arm_pmu.h
> > @@ -36,6 +36,11 @@ struct kvm_pmu {
> >  	struct irq_work overflow_work;
> >  };
> >  
> > +struct arm_pmu_entry {
> > +	struct list_head entry;
> > +	struct arm_pmu *arm_pmu;
> > +};
> > +
> >  #define kvm_arm_pmu_irq_initialized(v)	((v)->arch.pmu.irq_num >= VGIC_NR_SGIS)
> >  u64 kvm_pmu_get_counter_value(struct kvm_vcpu *vcpu, u64 select_idx);
> >  void kvm_pmu_set_counter_value(struct kvm_vcpu *vcpu, u64 select_idx, u64 val);
> 
> Thanks,
> 
> 	M.
> 
> -- 
> Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
  2021-12-14 12:28     ` Marc Zyngier
@ 2022-01-06 11:54       ` Alexandru Elisei
  -1 siblings, 0 replies; 32+ messages in thread
From: Alexandru Elisei @ 2022-01-06 11:54 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: mingo, tglx, will, kvmarm, linux-arm-kernel

Hi Marc,

On Tue, Dec 14, 2021 at 12:28:15PM +0000, Marc Zyngier wrote:
> On Mon, 13 Dec 2021 15:23:08 +0000,
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > 
> > When KVM creates an event and there are more than one PMUs present on the
> > system, perf_init_event() will go through the list of available PMUs and
> > will choose the first one that can create the event. The order of the PMUs
> > in the PMU list depends on the probe order, which can change under various
> > circumstances, for example if the order of the PMU nodes change in the DTB
> > or if asynchronous driver probing is enabled on the kernel command line
> > (with the driver_async_probe=armv8-pmu option).
> > 
> > Another consequence of this approach is that, on heteregeneous systems,
> > all virtual machines that KVM creates will use the same PMU. This might
> > cause unexpected behaviour for userspace: when a VCPU is executing on
> > the physical CPU that uses this PMU, PMU events in the guest work
> > correctly; but when the same VCPU executes on another CPU, PMU events in
> > the guest will suddenly stop counting.
> > 
> > Fortunately, perf core allows user to specify on which PMU to create an
> > event by using the perf_event_attr->type field, which is used by
> > perf_init_event() as an index in the radix tree of available PMUs.
> > 
> > Add the KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU
> > attribute to allow userspace to specify the arm_pmu that KVM will use when
> > creating events for that VCPU. KVM will make no attempt to run the VCPU on
> > the physical CPUs that share this PMU, leaving it up to userspace to
> > manage the VCPU threads' affinity accordingly.
> > 
> > Setting the PMU for a VCPU is an all of nothing affair to avoid exposing an
> > asymmetric system to the guest: either all VCPUs have the same PMU, either
> > none of the VCPUs have a PMU set. Attempting to do something in between
> > will result in an error being returned when doing KVM_ARM_VCPU_PMU_V3_INIT.
> > 
> > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > ---
> > 
> > Checking that all VCPUs have the same PMU is done when the PMU is
> > initialized because setting the VCPU PMU is optional, and KVM cannot know
> > what the user intends until the KVM_ARM_VCPU_PMU_V3_INIT ioctl, which
> > prevents further changes to the VCPU PMU. vcpu->arch.pmu.created has been
> > changed to an atomic variable because changes to the VCPU PMU state now
> > need to be observable by all physical CPUs.
> > 
> >  Documentation/virt/kvm/devices/vcpu.rst | 30 ++++++++-
> >  arch/arm64/include/uapi/asm/kvm.h       |  1 +
> >  arch/arm64/kvm/pmu-emul.c               | 88 ++++++++++++++++++++-----
> >  include/kvm/arm_pmu.h                   |  4 +-
> >  tools/arch/arm64/include/uapi/asm/kvm.h |  1 +
> >  5 files changed, 104 insertions(+), 20 deletions(-)
> > 
> > [..]
> > -static u32 kvm_pmu_event_mask(struct kvm *kvm)
> > +static u32 kvm_pmu_event_mask(struct kvm_vcpu *vcpu)
> >  {
> > -	switch (kvm->arch.pmuver) {
> > +	unsigned int pmuver;
> > +
> > +	if (vcpu->arch.pmu.arm_pmu)
> > +		pmuver = vcpu->arch.pmu.arm_pmu->pmuver;
> > +	else
> > +		pmuver = vcpu->kvm->arch.pmuver;
> 
> This puzzles me throughout the whole patch. Why is the arm_pmu pointer
> a per-CPU thing? I would absolutely expect it to be stored in the kvm
> structure, making the whole thing much simpler.

Reply below.

> 
> > [..]
> > @@ -637,8 +645,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
> >  		return;
> >  
> >  	memset(&attr, 0, sizeof(struct perf_event_attr));
> > -	attr.type = PERF_TYPE_RAW;
> > -	attr.size = sizeof(attr);
> 
> Why is this line removed?

Typo on my part, thank you for spotting it.

> 
> > [..]
> > @@ -910,7 +922,16 @@ static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
> >  	init_irq_work(&vcpu->arch.pmu.overflow_work,
> >  		      kvm_pmu_perf_overflow_notify_vcpu);
> >  
> > -	vcpu->arch.pmu.created = true;
> > +	atomic_set(&vcpu->arch.pmu.created, 1);
> > +
> > +	kvm_for_each_vcpu(i, v, kvm) {
> > +		if (!atomic_read(&v->arch.pmu.created))
> > +			continue;
> > +
> > +		if (v->arch.pmu.arm_pmu != arm_pmu)
> > +			return -ENXIO;
> > +	}
> 
> If you did store the arm_pmu at the VM level, you wouldn't need this.
> You could detect the discrepancy in the set_pmu ioctl.

I chose to set at the VCPU level to be consistent with how KVM treats the
PMU interrupt ID when the interrupt is a PPI, where the interrupt ID must
be the same for all VCPUs and it is stored at the VCPU. However, looking at
the code again, it occurs to me that it is stored at the VCPU when it's a
PPI because it's simpler to do it that way, as the code remains the same
when the interrupt ID is a SPI, which must be *different* between VCPUs. So
in the end, having the PMU stored at the VM level does match how KVM uses
it, which looks to be better than my approach.

This is the change you proposed in your branch [1]:

+static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
+{
+       struct kvm *kvm = vcpu->kvm;
+       struct arm_pmu_entry *entry;
+       struct arm_pmu *arm_pmu;
+       int ret = -ENXIO;
+
+       mutex_lock(&kvm->lock);
+       mutex_lock(&arm_pmus_lock);
+
+       list_for_each_entry(entry, &arm_pmus, entry) {
+               arm_pmu = entry->arm_pmu;
+               if (arm_pmu->pmu.type == pmu_id) {
+                       /* Can't change PMU if filters are already in place */
+                       if (kvm->arch.arm_pmu != arm_pmu &&
+                           kvm->arch.pmu_filter) {
+                               ret = -EBUSY;
+                               break;
+                       }
+
+                       kvm->arch.arm_pmu = arm_pmu;
+                       ret = 0;
+                       break;
+               }
+       }
+
+       mutex_unlock(&arm_pmus_lock);
+       mutex_unlock(&kvm->lock);
+       return ret;
+}

As I understand the code, userspace only needs to call
KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) *once* (on one VCPU
fd) to set the PMU for all the VCPUs; subsequent calls (on the same VCPU or
on another VCPU) with a different PMU id will change the PMU for all VCPUs.

Two remarks:

1. The documentation for the VCPU ioctls states this (from
Documentation/virt/kvm/devices/vcpu.rst):

"
======================
Generic vcpu interface
======================

The virtual cpu "device" also accepts the ioctls KVM_SET_DEVICE_ATTR,
KVM_GET_DEVICE_ATTR, and KVM_HAS_DEVICE_ATTR. The interface uses the same struct
kvm_device_attr as other devices, but **targets VCPU-wide settings and
controls**" (emphasis added).

But I guess having VCPU ioctls affect *only* the VCPU hasn't really been
true ever since PMU event filtering has been added. I'll send a patch to
change that part of the documentation for arm64.

I was thinking maybe a VM capability would be better suited for changing a
VM-wide setting, what do you think? I don't have a strong preference either
way.

2. What's to stop userspace to change the PMU after at least one VCPU has
run? That can be easily observed by the guest when reading PMCEIDx_EL0.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/pmu-bl

Thanks,
Alex
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
@ 2022-01-06 11:54       ` Alexandru Elisei
  0 siblings, 0 replies; 32+ messages in thread
From: Alexandru Elisei @ 2022-01-06 11:54 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm, tglx, mingo, peter.maydell

Hi Marc,

On Tue, Dec 14, 2021 at 12:28:15PM +0000, Marc Zyngier wrote:
> On Mon, 13 Dec 2021 15:23:08 +0000,
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > 
> > When KVM creates an event and there are more than one PMUs present on the
> > system, perf_init_event() will go through the list of available PMUs and
> > will choose the first one that can create the event. The order of the PMUs
> > in the PMU list depends on the probe order, which can change under various
> > circumstances, for example if the order of the PMU nodes change in the DTB
> > or if asynchronous driver probing is enabled on the kernel command line
> > (with the driver_async_probe=armv8-pmu option).
> > 
> > Another consequence of this approach is that, on heteregeneous systems,
> > all virtual machines that KVM creates will use the same PMU. This might
> > cause unexpected behaviour for userspace: when a VCPU is executing on
> > the physical CPU that uses this PMU, PMU events in the guest work
> > correctly; but when the same VCPU executes on another CPU, PMU events in
> > the guest will suddenly stop counting.
> > 
> > Fortunately, perf core allows user to specify on which PMU to create an
> > event by using the perf_event_attr->type field, which is used by
> > perf_init_event() as an index in the radix tree of available PMUs.
> > 
> > Add the KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU
> > attribute to allow userspace to specify the arm_pmu that KVM will use when
> > creating events for that VCPU. KVM will make no attempt to run the VCPU on
> > the physical CPUs that share this PMU, leaving it up to userspace to
> > manage the VCPU threads' affinity accordingly.
> > 
> > Setting the PMU for a VCPU is an all of nothing affair to avoid exposing an
> > asymmetric system to the guest: either all VCPUs have the same PMU, either
> > none of the VCPUs have a PMU set. Attempting to do something in between
> > will result in an error being returned when doing KVM_ARM_VCPU_PMU_V3_INIT.
> > 
> > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > ---
> > 
> > Checking that all VCPUs have the same PMU is done when the PMU is
> > initialized because setting the VCPU PMU is optional, and KVM cannot know
> > what the user intends until the KVM_ARM_VCPU_PMU_V3_INIT ioctl, which
> > prevents further changes to the VCPU PMU. vcpu->arch.pmu.created has been
> > changed to an atomic variable because changes to the VCPU PMU state now
> > need to be observable by all physical CPUs.
> > 
> >  Documentation/virt/kvm/devices/vcpu.rst | 30 ++++++++-
> >  arch/arm64/include/uapi/asm/kvm.h       |  1 +
> >  arch/arm64/kvm/pmu-emul.c               | 88 ++++++++++++++++++++-----
> >  include/kvm/arm_pmu.h                   |  4 +-
> >  tools/arch/arm64/include/uapi/asm/kvm.h |  1 +
> >  5 files changed, 104 insertions(+), 20 deletions(-)
> > 
> > [..]
> > -static u32 kvm_pmu_event_mask(struct kvm *kvm)
> > +static u32 kvm_pmu_event_mask(struct kvm_vcpu *vcpu)
> >  {
> > -	switch (kvm->arch.pmuver) {
> > +	unsigned int pmuver;
> > +
> > +	if (vcpu->arch.pmu.arm_pmu)
> > +		pmuver = vcpu->arch.pmu.arm_pmu->pmuver;
> > +	else
> > +		pmuver = vcpu->kvm->arch.pmuver;
> 
> This puzzles me throughout the whole patch. Why is the arm_pmu pointer
> a per-CPU thing? I would absolutely expect it to be stored in the kvm
> structure, making the whole thing much simpler.

Reply below.

> 
> > [..]
> > @@ -637,8 +645,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
> >  		return;
> >  
> >  	memset(&attr, 0, sizeof(struct perf_event_attr));
> > -	attr.type = PERF_TYPE_RAW;
> > -	attr.size = sizeof(attr);
> 
> Why is this line removed?

Typo on my part, thank you for spotting it.

> 
> > [..]
> > @@ -910,7 +922,16 @@ static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
> >  	init_irq_work(&vcpu->arch.pmu.overflow_work,
> >  		      kvm_pmu_perf_overflow_notify_vcpu);
> >  
> > -	vcpu->arch.pmu.created = true;
> > +	atomic_set(&vcpu->arch.pmu.created, 1);
> > +
> > +	kvm_for_each_vcpu(i, v, kvm) {
> > +		if (!atomic_read(&v->arch.pmu.created))
> > +			continue;
> > +
> > +		if (v->arch.pmu.arm_pmu != arm_pmu)
> > +			return -ENXIO;
> > +	}
> 
> If you did store the arm_pmu at the VM level, you wouldn't need this.
> You could detect the discrepancy in the set_pmu ioctl.

I chose to set at the VCPU level to be consistent with how KVM treats the
PMU interrupt ID when the interrupt is a PPI, where the interrupt ID must
be the same for all VCPUs and it is stored at the VCPU. However, looking at
the code again, it occurs to me that it is stored at the VCPU when it's a
PPI because it's simpler to do it that way, as the code remains the same
when the interrupt ID is a SPI, which must be *different* between VCPUs. So
in the end, having the PMU stored at the VM level does match how KVM uses
it, which looks to be better than my approach.

This is the change you proposed in your branch [1]:

+static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
+{
+       struct kvm *kvm = vcpu->kvm;
+       struct arm_pmu_entry *entry;
+       struct arm_pmu *arm_pmu;
+       int ret = -ENXIO;
+
+       mutex_lock(&kvm->lock);
+       mutex_lock(&arm_pmus_lock);
+
+       list_for_each_entry(entry, &arm_pmus, entry) {
+               arm_pmu = entry->arm_pmu;
+               if (arm_pmu->pmu.type == pmu_id) {
+                       /* Can't change PMU if filters are already in place */
+                       if (kvm->arch.arm_pmu != arm_pmu &&
+                           kvm->arch.pmu_filter) {
+                               ret = -EBUSY;
+                               break;
+                       }
+
+                       kvm->arch.arm_pmu = arm_pmu;
+                       ret = 0;
+                       break;
+               }
+       }
+
+       mutex_unlock(&arm_pmus_lock);
+       mutex_unlock(&kvm->lock);
+       return ret;
+}

As I understand the code, userspace only needs to call
KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) *once* (on one VCPU
fd) to set the PMU for all the VCPUs; subsequent calls (on the same VCPU or
on another VCPU) with a different PMU id will change the PMU for all VCPUs.

Two remarks:

1. The documentation for the VCPU ioctls states this (from
Documentation/virt/kvm/devices/vcpu.rst):

"
======================
Generic vcpu interface
======================

The virtual cpu "device" also accepts the ioctls KVM_SET_DEVICE_ATTR,
KVM_GET_DEVICE_ATTR, and KVM_HAS_DEVICE_ATTR. The interface uses the same struct
kvm_device_attr as other devices, but **targets VCPU-wide settings and
controls**" (emphasis added).

But I guess having VCPU ioctls affect *only* the VCPU hasn't really been
true ever since PMU event filtering has been added. I'll send a patch to
change that part of the documentation for arm64.

I was thinking maybe a VM capability would be better suited for changing a
VM-wide setting, what do you think? I don't have a strong preference either
way.

2. What's to stop userspace to change the PMU after at least one VCPU has
run? That can be easily observed by the guest when reading PMCEIDx_EL0.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/pmu-bl

Thanks,
Alex

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 0/4] KVM: arm64: Improve PMU support on heterogeneous systems
  2021-12-30 20:01   ` Marc Zyngier
@ 2022-01-06 12:07     ` Alexandru Elisei
  -1 siblings, 0 replies; 32+ messages in thread
From: Alexandru Elisei @ 2022-01-06 12:07 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: mingo, tglx, will, kvmarm, linux-arm-kernel

Hi Marc,

On Thu, Dec 30, 2021 at 08:01:10PM +0000, Marc Zyngier wrote:
> Alex,
> 
> On Mon, 13 Dec 2021 15:23:05 +0000,
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > 
> > (CC'ing Peter Maydell in case this might be of interest to qemu)
> > 
> > The series can be found on a branch at [1], and the kvmtool support at [2].
> > The kvmtool patches are also on the mailing list [3] and haven't changed
> > since v1.
> > 
> > Detailed explanation of the issue and symptoms that the patches attempt to
> > correct can be found in the cover letter for v1 [4].
> > 
> > A summary of the problem is that on heterogeneous systems KVM will always
> > use the same PMU for creating the VCPU events for *all* VCPUs regardless of
> > the physical CPU on which the VCPU is running, leading to events suddenly
> > stopping and resuming in the guest as the VCPU thread gets migrated across
> > different CPUs.
> > 
> > This series proposes to fix this behaviour by allowing the user to specify
> > which physical PMU is used when creating the VCPU events needed for guest
> > PMU emulation. When the PMU is set, KVM will refuse to the VCPU on a
> > physical which is not part of the supported CPUs for the specified PMU. The
> > restriction is that all VCPUs must use the same PMU to avoid emulating an
> > asymmetric platform.
> > 
> > The default behaviour stays the same - without userspace setting the PMU,
> > events will stop counting if the VCPU is scheduled on the wrong CPU.
> > 
> > Tested with a hacked version of kvmtool that does the PMU initialization
> > from the VCPU thread as opposed to from the main thread. Tested on
> > rockpro64 by testing what happens when all VCPUs having the same PMU, one
> > random VCPU having a different PMU than the other VCPUs and one random VCPU
> > not having the PMU set (each test was run 1,000 times on the little cores
> > and 1,000 times on the big cores).
> > 
> > Also tested on an Altra by testing all VCPUs having the same PMU, all VCPUs
> > not having a PMU set, and one random VCPU not having the PMU set; the VM
> > had 64 threads in each of the tests and each test was run 10,000 times.
> 
> Came back to this series, and found more problems. On top of the
> remarks I had earlier (the per-CPU data structures that really should
> per VM, the disappearing attribute size), what happens when event
> filters are already registered and that you set a specific PMU?

This is a good point. When I looked at how the PMU event filter works, I
saw that KVM doesn't attempt to check that the events are actually
implemented on the PMU, but somehow skipped over the fact that the PMU
affects the total number of events available.

Thanks,
Alex

> 
> I took the matter in my own hands (the joy of being in quarantine) and
> wrote whatever fixes I thought were necessary[1].
> 
> Please have a look.
> 
> 	M.
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/pmu-bl
> 
> -- 
> Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 0/4] KVM: arm64: Improve PMU support on heterogeneous systems
@ 2022-01-06 12:07     ` Alexandru Elisei
  0 siblings, 0 replies; 32+ messages in thread
From: Alexandru Elisei @ 2022-01-06 12:07 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm, tglx, mingo, peter.maydell

Hi Marc,

On Thu, Dec 30, 2021 at 08:01:10PM +0000, Marc Zyngier wrote:
> Alex,
> 
> On Mon, 13 Dec 2021 15:23:05 +0000,
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > 
> > (CC'ing Peter Maydell in case this might be of interest to qemu)
> > 
> > The series can be found on a branch at [1], and the kvmtool support at [2].
> > The kvmtool patches are also on the mailing list [3] and haven't changed
> > since v1.
> > 
> > Detailed explanation of the issue and symptoms that the patches attempt to
> > correct can be found in the cover letter for v1 [4].
> > 
> > A summary of the problem is that on heterogeneous systems KVM will always
> > use the same PMU for creating the VCPU events for *all* VCPUs regardless of
> > the physical CPU on which the VCPU is running, leading to events suddenly
> > stopping and resuming in the guest as the VCPU thread gets migrated across
> > different CPUs.
> > 
> > This series proposes to fix this behaviour by allowing the user to specify
> > which physical PMU is used when creating the VCPU events needed for guest
> > PMU emulation. When the PMU is set, KVM will refuse to the VCPU on a
> > physical which is not part of the supported CPUs for the specified PMU. The
> > restriction is that all VCPUs must use the same PMU to avoid emulating an
> > asymmetric platform.
> > 
> > The default behaviour stays the same - without userspace setting the PMU,
> > events will stop counting if the VCPU is scheduled on the wrong CPU.
> > 
> > Tested with a hacked version of kvmtool that does the PMU initialization
> > from the VCPU thread as opposed to from the main thread. Tested on
> > rockpro64 by testing what happens when all VCPUs having the same PMU, one
> > random VCPU having a different PMU than the other VCPUs and one random VCPU
> > not having the PMU set (each test was run 1,000 times on the little cores
> > and 1,000 times on the big cores).
> > 
> > Also tested on an Altra by testing all VCPUs having the same PMU, all VCPUs
> > not having a PMU set, and one random VCPU not having the PMU set; the VM
> > had 64 threads in each of the tests and each test was run 10,000 times.
> 
> Came back to this series, and found more problems. On top of the
> remarks I had earlier (the per-CPU data structures that really should
> per VM, the disappearing attribute size), what happens when event
> filters are already registered and that you set a specific PMU?

This is a good point. When I looked at how the PMU event filter works, I
saw that KVM doesn't attempt to check that the events are actually
implemented on the PMU, but somehow skipped over the fact that the PMU
affects the total number of events available.

Thanks,
Alex

> 
> I took the matter in my own hands (the joy of being in quarantine) and
> wrote whatever fixes I thought were necessary[1].
> 
> Please have a look.
> 
> 	M.
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/pmu-bl
> 
> -- 
> Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
  2022-01-06 11:54       ` Alexandru Elisei
@ 2022-01-06 18:16         ` Marc Zyngier
  -1 siblings, 0 replies; 32+ messages in thread
From: Marc Zyngier @ 2022-01-06 18:16 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: mingo, tglx, will, kvmarm, linux-arm-kernel

On Thu, 06 Jan 2022 11:54:11 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> Hi Marc,
> 
> On Tue, Dec 14, 2021 at 12:28:15PM +0000, Marc Zyngier wrote:
> > On Mon, 13 Dec 2021 15:23:08 +0000,
> > Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > > 
> > > When KVM creates an event and there are more than one PMUs present on the
> > > system, perf_init_event() will go through the list of available PMUs and
> > > will choose the first one that can create the event. The order of the PMUs
> > > in the PMU list depends on the probe order, which can change under various
> > > circumstances, for example if the order of the PMU nodes change in the DTB
> > > or if asynchronous driver probing is enabled on the kernel command line
> > > (with the driver_async_probe=armv8-pmu option).
> > > 
> > > Another consequence of this approach is that, on heteregeneous systems,
> > > all virtual machines that KVM creates will use the same PMU. This might
> > > cause unexpected behaviour for userspace: when a VCPU is executing on
> > > the physical CPU that uses this PMU, PMU events in the guest work
> > > correctly; but when the same VCPU executes on another CPU, PMU events in
> > > the guest will suddenly stop counting.
> > > 
> > > Fortunately, perf core allows user to specify on which PMU to create an
> > > event by using the perf_event_attr->type field, which is used by
> > > perf_init_event() as an index in the radix tree of available PMUs.
> > > 
> > > Add the KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU
> > > attribute to allow userspace to specify the arm_pmu that KVM will use when
> > > creating events for that VCPU. KVM will make no attempt to run the VCPU on
> > > the physical CPUs that share this PMU, leaving it up to userspace to
> > > manage the VCPU threads' affinity accordingly.
> > > 
> > > Setting the PMU for a VCPU is an all of nothing affair to avoid exposing an
> > > asymmetric system to the guest: either all VCPUs have the same PMU, either
> > > none of the VCPUs have a PMU set. Attempting to do something in between
> > > will result in an error being returned when doing KVM_ARM_VCPU_PMU_V3_INIT.
> > > 
> > > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > > ---
> > > 
> > > Checking that all VCPUs have the same PMU is done when the PMU is
> > > initialized because setting the VCPU PMU is optional, and KVM cannot know
> > > what the user intends until the KVM_ARM_VCPU_PMU_V3_INIT ioctl, which
> > > prevents further changes to the VCPU PMU. vcpu->arch.pmu.created has been
> > > changed to an atomic variable because changes to the VCPU PMU state now
> > > need to be observable by all physical CPUs.
> > > 
> > >  Documentation/virt/kvm/devices/vcpu.rst | 30 ++++++++-
> > >  arch/arm64/include/uapi/asm/kvm.h       |  1 +
> > >  arch/arm64/kvm/pmu-emul.c               | 88 ++++++++++++++++++++-----
> > >  include/kvm/arm_pmu.h                   |  4 +-
> > >  tools/arch/arm64/include/uapi/asm/kvm.h |  1 +
> > >  5 files changed, 104 insertions(+), 20 deletions(-)
> > > 
> > > [..]
> > > -static u32 kvm_pmu_event_mask(struct kvm *kvm)
> > > +static u32 kvm_pmu_event_mask(struct kvm_vcpu *vcpu)
> > >  {
> > > -	switch (kvm->arch.pmuver) {
> > > +	unsigned int pmuver;
> > > +
> > > +	if (vcpu->arch.pmu.arm_pmu)
> > > +		pmuver = vcpu->arch.pmu.arm_pmu->pmuver;
> > > +	else
> > > +		pmuver = vcpu->kvm->arch.pmuver;
> > 
> > This puzzles me throughout the whole patch. Why is the arm_pmu pointer
> > a per-CPU thing? I would absolutely expect it to be stored in the kvm
> > structure, making the whole thing much simpler.
> 
> Reply below.
> 
> > 
> > > [..]
> > > @@ -637,8 +645,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
> > >  		return;
> > >  
> > >  	memset(&attr, 0, sizeof(struct perf_event_attr));
> > > -	attr.type = PERF_TYPE_RAW;
> > > -	attr.size = sizeof(attr);
> > 
> > Why is this line removed?
> 
> Typo on my part, thank you for spotting it.
> 
> > 
> > > [..]
> > > @@ -910,7 +922,16 @@ static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
> > >  	init_irq_work(&vcpu->arch.pmu.overflow_work,
> > >  		      kvm_pmu_perf_overflow_notify_vcpu);
> > >  
> > > -	vcpu->arch.pmu.created = true;
> > > +	atomic_set(&vcpu->arch.pmu.created, 1);
> > > +
> > > +	kvm_for_each_vcpu(i, v, kvm) {
> > > +		if (!atomic_read(&v->arch.pmu.created))
> > > +			continue;
> > > +
> > > +		if (v->arch.pmu.arm_pmu != arm_pmu)
> > > +			return -ENXIO;
> > > +	}
> > 
> > If you did store the arm_pmu at the VM level, you wouldn't need this.
> > You could detect the discrepancy in the set_pmu ioctl.
> 
> I chose to set at the VCPU level to be consistent with how KVM treats the
> PMU interrupt ID when the interrupt is a PPI, where the interrupt ID must
> be the same for all VCPUs and it is stored at the VCPU. However, looking at
> the code again, it occurs to me that it is stored at the VCPU when it's a
> PPI because it's simpler to do it that way, as the code remains the same
> when the interrupt ID is a SPI, which must be *different* between VCPUs. So
> in the end, having the PMU stored at the VM level does match how KVM uses
> it, which looks to be better than my approach.
> 
> This is the change you proposed in your branch [1]:
> 
> +static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
> +{
> +       struct kvm *kvm = vcpu->kvm;
> +       struct arm_pmu_entry *entry;
> +       struct arm_pmu *arm_pmu;
> +       int ret = -ENXIO;
> +
> +       mutex_lock(&kvm->lock);
> +       mutex_lock(&arm_pmus_lock);
> +
> +       list_for_each_entry(entry, &arm_pmus, entry) {
> +               arm_pmu = entry->arm_pmu;
> +               if (arm_pmu->pmu.type == pmu_id) {
> +                       /* Can't change PMU if filters are already in place */
> +                       if (kvm->arch.arm_pmu != arm_pmu &&
> +                           kvm->arch.pmu_filter) {
> +                               ret = -EBUSY;
> +                               break;
> +                       }
> +
> +                       kvm->arch.arm_pmu = arm_pmu;
> +                       ret = 0;
> +                       break;
> +               }
> +       }
> +
> +       mutex_unlock(&arm_pmus_lock);
> +       mutex_unlock(&kvm->lock);
> +       return ret;
> +}
> 
> As I understand the code, userspace only needs to call
> KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) *once* (on one VCPU
> fd) to set the PMU for all the VCPUs; subsequent calls (on the same VCPU or
> on another VCPU) with a different PMU id will change the PMU for all VCPUs.
> 
> Two remarks:
> 
> 1. The documentation for the VCPU ioctls states this (from
> Documentation/virt/kvm/devices/vcpu.rst):
> 
> "
> ======================
> Generic vcpu interface
> ======================
> 
> The virtual cpu "device" also accepts the ioctls KVM_SET_DEVICE_ATTR,
> KVM_GET_DEVICE_ATTR, and KVM_HAS_DEVICE_ATTR. The interface uses the same struct
> kvm_device_attr as other devices, but **targets VCPU-wide settings and
> controls**" (emphasis added).
> 
> But I guess having VCPU ioctls affect *only* the VCPU hasn't really been
> true ever since PMU event filtering has been added. I'll send a patch to
> change that part of the documentation for arm64.
> 
> I was thinking maybe a VM capability would be better suited for changing a
> VM-wide setting, what do you think? I don't have a strong preference either
> way.

I'm not sure it is worth the hassle of changing the API, as we'll have
to keep the current one forever.

> 
> 2. What's to stop userspace to change the PMU after at least one VCPU has
> run? That can be easily observed by the guest when reading PMCEIDx_EL0.

That's a good point. We need something here. It is a bit odd as to do
that, you need to fully enable a PMU on one CPU, but not on the other,
then run the first while changing stuff on the other. Something along
those lines (untested):

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 4bf28905d438..4f53520e84fd 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -139,6 +139,7 @@ struct kvm_arch {
 
 	/* Memory Tagging Extension enabled for the guest */
 	bool mte_enabled;
+	bool ran_once;
 };
 
 struct kvm_vcpu_fault_info {
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 83297fa97243..3045d7f609df 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -606,6 +606,10 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
 
 	vcpu->arch.has_run_once = true;
 
+	mutex_lock(&kvm->lock);
+	kvm->arch.ran_once = true;
+	mutex_unlock(&kvm->lock);
+
 	kvm_arm_vcpu_init_debug(vcpu);
 
 	if (likely(irqchip_in_kernel(kvm))) {
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index dfc0430d6418..95100c541244 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -959,8 +959,9 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
 		arm_pmu = entry->arm_pmu;
 		if (arm_pmu->pmu.type == pmu_id) {
 			/* Can't change PMU if filters are already in place */
-			if (kvm->arch.arm_pmu != arm_pmu &&
-			    kvm->arch.pmu_filter) {
+			if ((kvm->arch.arm_pmu != arm_pmu &&
+			     kvm->arch.pmu_filter) ||
+			    kvm->arch.ran_once) {
 				ret = -EBUSY;
 				break;
 			}
@@ -1040,6 +1041,11 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
 
 		mutex_lock(&vcpu->kvm->lock);
 
+		if (vcpu->kvm->arch.ran_once) {
+			mutex_unlock(&vcpu->kvm->lock);
+			return -EBUSY;
+		}
+
 		if (!vcpu->kvm->arch.pmu_filter) {
 			vcpu->kvm->arch.pmu_filter = bitmap_alloc(nr_events, GFP_KERNEL_ACCOUNT);
 			if (!vcpu->kvm->arch.pmu_filter) {

which should prevent both PMU or filters to be changed once a single
vcpu as run.

Thoughts?

	M.

-- 
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
@ 2022-01-06 18:16         ` Marc Zyngier
  0 siblings, 0 replies; 32+ messages in thread
From: Marc Zyngier @ 2022-01-06 18:16 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm, tglx, mingo, peter.maydell

On Thu, 06 Jan 2022 11:54:11 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> Hi Marc,
> 
> On Tue, Dec 14, 2021 at 12:28:15PM +0000, Marc Zyngier wrote:
> > On Mon, 13 Dec 2021 15:23:08 +0000,
> > Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > > 
> > > When KVM creates an event and there are more than one PMUs present on the
> > > system, perf_init_event() will go through the list of available PMUs and
> > > will choose the first one that can create the event. The order of the PMUs
> > > in the PMU list depends on the probe order, which can change under various
> > > circumstances, for example if the order of the PMU nodes change in the DTB
> > > or if asynchronous driver probing is enabled on the kernel command line
> > > (with the driver_async_probe=armv8-pmu option).
> > > 
> > > Another consequence of this approach is that, on heteregeneous systems,
> > > all virtual machines that KVM creates will use the same PMU. This might
> > > cause unexpected behaviour for userspace: when a VCPU is executing on
> > > the physical CPU that uses this PMU, PMU events in the guest work
> > > correctly; but when the same VCPU executes on another CPU, PMU events in
> > > the guest will suddenly stop counting.
> > > 
> > > Fortunately, perf core allows user to specify on which PMU to create an
> > > event by using the perf_event_attr->type field, which is used by
> > > perf_init_event() as an index in the radix tree of available PMUs.
> > > 
> > > Add the KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU
> > > attribute to allow userspace to specify the arm_pmu that KVM will use when
> > > creating events for that VCPU. KVM will make no attempt to run the VCPU on
> > > the physical CPUs that share this PMU, leaving it up to userspace to
> > > manage the VCPU threads' affinity accordingly.
> > > 
> > > Setting the PMU for a VCPU is an all of nothing affair to avoid exposing an
> > > asymmetric system to the guest: either all VCPUs have the same PMU, either
> > > none of the VCPUs have a PMU set. Attempting to do something in between
> > > will result in an error being returned when doing KVM_ARM_VCPU_PMU_V3_INIT.
> > > 
> > > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > > ---
> > > 
> > > Checking that all VCPUs have the same PMU is done when the PMU is
> > > initialized because setting the VCPU PMU is optional, and KVM cannot know
> > > what the user intends until the KVM_ARM_VCPU_PMU_V3_INIT ioctl, which
> > > prevents further changes to the VCPU PMU. vcpu->arch.pmu.created has been
> > > changed to an atomic variable because changes to the VCPU PMU state now
> > > need to be observable by all physical CPUs.
> > > 
> > >  Documentation/virt/kvm/devices/vcpu.rst | 30 ++++++++-
> > >  arch/arm64/include/uapi/asm/kvm.h       |  1 +
> > >  arch/arm64/kvm/pmu-emul.c               | 88 ++++++++++++++++++++-----
> > >  include/kvm/arm_pmu.h                   |  4 +-
> > >  tools/arch/arm64/include/uapi/asm/kvm.h |  1 +
> > >  5 files changed, 104 insertions(+), 20 deletions(-)
> > > 
> > > [..]
> > > -static u32 kvm_pmu_event_mask(struct kvm *kvm)
> > > +static u32 kvm_pmu_event_mask(struct kvm_vcpu *vcpu)
> > >  {
> > > -	switch (kvm->arch.pmuver) {
> > > +	unsigned int pmuver;
> > > +
> > > +	if (vcpu->arch.pmu.arm_pmu)
> > > +		pmuver = vcpu->arch.pmu.arm_pmu->pmuver;
> > > +	else
> > > +		pmuver = vcpu->kvm->arch.pmuver;
> > 
> > This puzzles me throughout the whole patch. Why is the arm_pmu pointer
> > a per-CPU thing? I would absolutely expect it to be stored in the kvm
> > structure, making the whole thing much simpler.
> 
> Reply below.
> 
> > 
> > > [..]
> > > @@ -637,8 +645,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
> > >  		return;
> > >  
> > >  	memset(&attr, 0, sizeof(struct perf_event_attr));
> > > -	attr.type = PERF_TYPE_RAW;
> > > -	attr.size = sizeof(attr);
> > 
> > Why is this line removed?
> 
> Typo on my part, thank you for spotting it.
> 
> > 
> > > [..]
> > > @@ -910,7 +922,16 @@ static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
> > >  	init_irq_work(&vcpu->arch.pmu.overflow_work,
> > >  		      kvm_pmu_perf_overflow_notify_vcpu);
> > >  
> > > -	vcpu->arch.pmu.created = true;
> > > +	atomic_set(&vcpu->arch.pmu.created, 1);
> > > +
> > > +	kvm_for_each_vcpu(i, v, kvm) {
> > > +		if (!atomic_read(&v->arch.pmu.created))
> > > +			continue;
> > > +
> > > +		if (v->arch.pmu.arm_pmu != arm_pmu)
> > > +			return -ENXIO;
> > > +	}
> > 
> > If you did store the arm_pmu at the VM level, you wouldn't need this.
> > You could detect the discrepancy in the set_pmu ioctl.
> 
> I chose to set at the VCPU level to be consistent with how KVM treats the
> PMU interrupt ID when the interrupt is a PPI, where the interrupt ID must
> be the same for all VCPUs and it is stored at the VCPU. However, looking at
> the code again, it occurs to me that it is stored at the VCPU when it's a
> PPI because it's simpler to do it that way, as the code remains the same
> when the interrupt ID is a SPI, which must be *different* between VCPUs. So
> in the end, having the PMU stored at the VM level does match how KVM uses
> it, which looks to be better than my approach.
> 
> This is the change you proposed in your branch [1]:
> 
> +static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
> +{
> +       struct kvm *kvm = vcpu->kvm;
> +       struct arm_pmu_entry *entry;
> +       struct arm_pmu *arm_pmu;
> +       int ret = -ENXIO;
> +
> +       mutex_lock(&kvm->lock);
> +       mutex_lock(&arm_pmus_lock);
> +
> +       list_for_each_entry(entry, &arm_pmus, entry) {
> +               arm_pmu = entry->arm_pmu;
> +               if (arm_pmu->pmu.type == pmu_id) {
> +                       /* Can't change PMU if filters are already in place */
> +                       if (kvm->arch.arm_pmu != arm_pmu &&
> +                           kvm->arch.pmu_filter) {
> +                               ret = -EBUSY;
> +                               break;
> +                       }
> +
> +                       kvm->arch.arm_pmu = arm_pmu;
> +                       ret = 0;
> +                       break;
> +               }
> +       }
> +
> +       mutex_unlock(&arm_pmus_lock);
> +       mutex_unlock(&kvm->lock);
> +       return ret;
> +}
> 
> As I understand the code, userspace only needs to call
> KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) *once* (on one VCPU
> fd) to set the PMU for all the VCPUs; subsequent calls (on the same VCPU or
> on another VCPU) with a different PMU id will change the PMU for all VCPUs.
> 
> Two remarks:
> 
> 1. The documentation for the VCPU ioctls states this (from
> Documentation/virt/kvm/devices/vcpu.rst):
> 
> "
> ======================
> Generic vcpu interface
> ======================
> 
> The virtual cpu "device" also accepts the ioctls KVM_SET_DEVICE_ATTR,
> KVM_GET_DEVICE_ATTR, and KVM_HAS_DEVICE_ATTR. The interface uses the same struct
> kvm_device_attr as other devices, but **targets VCPU-wide settings and
> controls**" (emphasis added).
> 
> But I guess having VCPU ioctls affect *only* the VCPU hasn't really been
> true ever since PMU event filtering has been added. I'll send a patch to
> change that part of the documentation for arm64.
> 
> I was thinking maybe a VM capability would be better suited for changing a
> VM-wide setting, what do you think? I don't have a strong preference either
> way.

I'm not sure it is worth the hassle of changing the API, as we'll have
to keep the current one forever.

> 
> 2. What's to stop userspace to change the PMU after at least one VCPU has
> run? That can be easily observed by the guest when reading PMCEIDx_EL0.

That's a good point. We need something here. It is a bit odd as to do
that, you need to fully enable a PMU on one CPU, but not on the other,
then run the first while changing stuff on the other. Something along
those lines (untested):

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 4bf28905d438..4f53520e84fd 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -139,6 +139,7 @@ struct kvm_arch {
 
 	/* Memory Tagging Extension enabled for the guest */
 	bool mte_enabled;
+	bool ran_once;
 };
 
 struct kvm_vcpu_fault_info {
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 83297fa97243..3045d7f609df 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -606,6 +606,10 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
 
 	vcpu->arch.has_run_once = true;
 
+	mutex_lock(&kvm->lock);
+	kvm->arch.ran_once = true;
+	mutex_unlock(&kvm->lock);
+
 	kvm_arm_vcpu_init_debug(vcpu);
 
 	if (likely(irqchip_in_kernel(kvm))) {
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index dfc0430d6418..95100c541244 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -959,8 +959,9 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
 		arm_pmu = entry->arm_pmu;
 		if (arm_pmu->pmu.type == pmu_id) {
 			/* Can't change PMU if filters are already in place */
-			if (kvm->arch.arm_pmu != arm_pmu &&
-			    kvm->arch.pmu_filter) {
+			if ((kvm->arch.arm_pmu != arm_pmu &&
+			     kvm->arch.pmu_filter) ||
+			    kvm->arch.ran_once) {
 				ret = -EBUSY;
 				break;
 			}
@@ -1040,6 +1041,11 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
 
 		mutex_lock(&vcpu->kvm->lock);
 
+		if (vcpu->kvm->arch.ran_once) {
+			mutex_unlock(&vcpu->kvm->lock);
+			return -EBUSY;
+		}
+
 		if (!vcpu->kvm->arch.pmu_filter) {
 			vcpu->kvm->arch.pmu_filter = bitmap_alloc(nr_events, GFP_KERNEL_ACCOUNT);
 			if (!vcpu->kvm->arch.pmu_filter) {

which should prevent both PMU or filters to be changed once a single
vcpu as run.

Thoughts?

	M.

-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 0/4] KVM: arm64: Improve PMU support on heterogeneous systems
  2022-01-06 12:07     ` Alexandru Elisei
@ 2022-01-06 18:21       ` Marc Zyngier
  -1 siblings, 0 replies; 32+ messages in thread
From: Marc Zyngier @ 2022-01-06 18:21 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: mingo, tglx, will, kvmarm, linux-arm-kernel

On Thu, 06 Jan 2022 12:07:38 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> Hi Marc,
> 
> On Thu, Dec 30, 2021 at 08:01:10PM +0000, Marc Zyngier wrote:
> > Alex,
> > 
> > On Mon, 13 Dec 2021 15:23:05 +0000,
> > Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > > 
> > > (CC'ing Peter Maydell in case this might be of interest to qemu)
> > > 
> > > The series can be found on a branch at [1], and the kvmtool support at [2].
> > > The kvmtool patches are also on the mailing list [3] and haven't changed
> > > since v1.
> > > 
> > > Detailed explanation of the issue and symptoms that the patches attempt to
> > > correct can be found in the cover letter for v1 [4].
> > > 
> > > A summary of the problem is that on heterogeneous systems KVM will always
> > > use the same PMU for creating the VCPU events for *all* VCPUs regardless of
> > > the physical CPU on which the VCPU is running, leading to events suddenly
> > > stopping and resuming in the guest as the VCPU thread gets migrated across
> > > different CPUs.
> > > 
> > > This series proposes to fix this behaviour by allowing the user to specify
> > > which physical PMU is used when creating the VCPU events needed for guest
> > > PMU emulation. When the PMU is set, KVM will refuse to the VCPU on a
> > > physical which is not part of the supported CPUs for the specified PMU. The
> > > restriction is that all VCPUs must use the same PMU to avoid emulating an
> > > asymmetric platform.
> > > 
> > > The default behaviour stays the same - without userspace setting the PMU,
> > > events will stop counting if the VCPU is scheduled on the wrong CPU.
> > > 
> > > Tested with a hacked version of kvmtool that does the PMU initialization
> > > from the VCPU thread as opposed to from the main thread. Tested on
> > > rockpro64 by testing what happens when all VCPUs having the same PMU, one
> > > random VCPU having a different PMU than the other VCPUs and one random VCPU
> > > not having the PMU set (each test was run 1,000 times on the little cores
> > > and 1,000 times on the big cores).
> > > 
> > > Also tested on an Altra by testing all VCPUs having the same PMU, all VCPUs
> > > not having a PMU set, and one random VCPU not having the PMU set; the VM
> > > had 64 threads in each of the tests and each test was run 10,000 times.
> > 
> > Came back to this series, and found more problems. On top of the
> > remarks I had earlier (the per-CPU data structures that really should
> > per VM, the disappearing attribute size), what happens when event
> > filters are already registered and that you set a specific PMU?
> 
> This is a good point. When I looked at how the PMU event filter works, I
> saw that KVM doesn't attempt to check that the events are actually
> implemented on the PMU, but somehow skipped over the fact that the PMU
> affects the total number of events available.

That, but also the meaning of the events. Switching PMU after
programmed event filters is really odd, as you don't know what you are
filtering anymore (unless you stick to purely architected events).

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 0/4] KVM: arm64: Improve PMU support on heterogeneous systems
@ 2022-01-06 18:21       ` Marc Zyngier
  0 siblings, 0 replies; 32+ messages in thread
From: Marc Zyngier @ 2022-01-06 18:21 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm, tglx, mingo, peter.maydell

On Thu, 06 Jan 2022 12:07:38 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> Hi Marc,
> 
> On Thu, Dec 30, 2021 at 08:01:10PM +0000, Marc Zyngier wrote:
> > Alex,
> > 
> > On Mon, 13 Dec 2021 15:23:05 +0000,
> > Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > > 
> > > (CC'ing Peter Maydell in case this might be of interest to qemu)
> > > 
> > > The series can be found on a branch at [1], and the kvmtool support at [2].
> > > The kvmtool patches are also on the mailing list [3] and haven't changed
> > > since v1.
> > > 
> > > Detailed explanation of the issue and symptoms that the patches attempt to
> > > correct can be found in the cover letter for v1 [4].
> > > 
> > > A summary of the problem is that on heterogeneous systems KVM will always
> > > use the same PMU for creating the VCPU events for *all* VCPUs regardless of
> > > the physical CPU on which the VCPU is running, leading to events suddenly
> > > stopping and resuming in the guest as the VCPU thread gets migrated across
> > > different CPUs.
> > > 
> > > This series proposes to fix this behaviour by allowing the user to specify
> > > which physical PMU is used when creating the VCPU events needed for guest
> > > PMU emulation. When the PMU is set, KVM will refuse to the VCPU on a
> > > physical which is not part of the supported CPUs for the specified PMU. The
> > > restriction is that all VCPUs must use the same PMU to avoid emulating an
> > > asymmetric platform.
> > > 
> > > The default behaviour stays the same - without userspace setting the PMU,
> > > events will stop counting if the VCPU is scheduled on the wrong CPU.
> > > 
> > > Tested with a hacked version of kvmtool that does the PMU initialization
> > > from the VCPU thread as opposed to from the main thread. Tested on
> > > rockpro64 by testing what happens when all VCPUs having the same PMU, one
> > > random VCPU having a different PMU than the other VCPUs and one random VCPU
> > > not having the PMU set (each test was run 1,000 times on the little cores
> > > and 1,000 times on the big cores).
> > > 
> > > Also tested on an Altra by testing all VCPUs having the same PMU, all VCPUs
> > > not having a PMU set, and one random VCPU not having the PMU set; the VM
> > > had 64 threads in each of the tests and each test was run 10,000 times.
> > 
> > Came back to this series, and found more problems. On top of the
> > remarks I had earlier (the per-CPU data structures that really should
> > per VM, the disappearing attribute size), what happens when event
> > filters are already registered and that you set a specific PMU?
> 
> This is a good point. When I looked at how the PMU event filter works, I
> saw that KVM doesn't attempt to check that the events are actually
> implemented on the PMU, but somehow skipped over the fact that the PMU
> affects the total number of events available.

That, but also the meaning of the events. Switching PMU after
programmed event filters is really odd, as you don't know what you are
filtering anymore (unless you stick to purely architected events).

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
  2022-01-06 18:16         ` Marc Zyngier
@ 2022-01-07 11:08           ` Alexandru Elisei
  -1 siblings, 0 replies; 32+ messages in thread
From: Alexandru Elisei @ 2022-01-07 11:08 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: mingo, tglx, will, kvmarm, linux-arm-kernel

Hi Marc,

On Thu, Jan 06, 2022 at 06:16:04PM +0000, Marc Zyngier wrote:
> On Thu, 06 Jan 2022 11:54:11 +0000,
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > 
> > Hi Marc,
> > 
> > On Tue, Dec 14, 2021 at 12:28:15PM +0000, Marc Zyngier wrote:
> > > On Mon, 13 Dec 2021 15:23:08 +0000,
> > > Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > > >
> > > > [..]
> > > >
> > > > @@ -910,7 +922,16 @@ static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
> > > >  	init_irq_work(&vcpu->arch.pmu.overflow_work,
> > > >  		      kvm_pmu_perf_overflow_notify_vcpu);
> > > >  
> > > > -	vcpu->arch.pmu.created = true;
> > > > +	atomic_set(&vcpu->arch.pmu.created, 1);
> > > > +
> > > > +	kvm_for_each_vcpu(i, v, kvm) {
> > > > +		if (!atomic_read(&v->arch.pmu.created))
> > > > +			continue;
> > > > +
> > > > +		if (v->arch.pmu.arm_pmu != arm_pmu)
> > > > +			return -ENXIO;
> > > > +	}
> > > 
> > > If you did store the arm_pmu at the VM level, you wouldn't need this.
> > > You could detect the discrepancy in the set_pmu ioctl.
> > 
> > I chose to set at the VCPU level to be consistent with how KVM treats the
> > PMU interrupt ID when the interrupt is a PPI, where the interrupt ID must
> > be the same for all VCPUs and it is stored at the VCPU. However, looking at
> > the code again, it occurs to me that it is stored at the VCPU when it's a
> > PPI because it's simpler to do it that way, as the code remains the same
> > when the interrupt ID is a SPI, which must be *different* between VCPUs. So
> > in the end, having the PMU stored at the VM level does match how KVM uses
> > it, which looks to be better than my approach.
> > 
> > This is the change you proposed in your branch [1]:
> > 
> > +static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
> > +{
> > +       struct kvm *kvm = vcpu->kvm;
> > +       struct arm_pmu_entry *entry;
> > +       struct arm_pmu *arm_pmu;
> > +       int ret = -ENXIO;
> > +
> > +       mutex_lock(&kvm->lock);
> > +       mutex_lock(&arm_pmus_lock);
> > +
> > +       list_for_each_entry(entry, &arm_pmus, entry) {
> > +               arm_pmu = entry->arm_pmu;
> > +               if (arm_pmu->pmu.type == pmu_id) {
> > +                       /* Can't change PMU if filters are already in place */
> > +                       if (kvm->arch.arm_pmu != arm_pmu &&
> > +                           kvm->arch.pmu_filter) {
> > +                               ret = -EBUSY;
> > +                               break;
> > +                       }
> > +
> > +                       kvm->arch.arm_pmu = arm_pmu;
> > +                       ret = 0;
> > +                       break;
> > +               }
> > +       }
> > +
> > +       mutex_unlock(&arm_pmus_lock);
> > +       mutex_unlock(&kvm->lock);
> > +       return ret;
> > +}
> > 
> > As I understand the code, userspace only needs to call
> > KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) *once* (on one VCPU
> > fd) to set the PMU for all the VCPUs; subsequent calls (on the same VCPU or
> > on another VCPU) with a different PMU id will change the PMU for all VCPUs.
> > 
> > Two remarks:
> > 
> > 1. The documentation for the VCPU ioctls states this (from
> > Documentation/virt/kvm/devices/vcpu.rst):
> > 
> > "
> > ======================
> > Generic vcpu interface
> > ======================
> > 
> > The virtual cpu "device" also accepts the ioctls KVM_SET_DEVICE_ATTR,
> > KVM_GET_DEVICE_ATTR, and KVM_HAS_DEVICE_ATTR. The interface uses the same struct
> > kvm_device_attr as other devices, but **targets VCPU-wide settings and
> > controls**" (emphasis added).
> > 
> > But I guess having VCPU ioctls affect *only* the VCPU hasn't really been
> > true ever since PMU event filtering has been added. I'll send a patch to
> > change that part of the documentation for arm64.
> > 
> > I was thinking maybe a VM capability would be better suited for changing a
> > VM-wide setting, what do you think? I don't have a strong preference either
> > way.
> 
> I'm not sure it is worth the hassle of changing the API, as we'll have
> to keep the current one forever.

I was suggesting to use a capability for setting the PMU, it's too late to
change how the events filter is set.

> 
> > 
> > 2. What's to stop userspace to change the PMU after at least one VCPU has
> > run? That can be easily observed by the guest when reading PMCEIDx_EL0.
> 
> That's a good point. We need something here. It is a bit odd as to do
> that, you need to fully enable a PMU on one CPU, but not on the other,
> then run the first while changing stuff on the other. Something along
> those lines (untested):
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 4bf28905d438..4f53520e84fd 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -139,6 +139,7 @@ struct kvm_arch {
>  
>  	/* Memory Tagging Extension enabled for the guest */
>  	bool mte_enabled;
> +	bool ran_once;
>  };
>  
>  struct kvm_vcpu_fault_info {
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 83297fa97243..3045d7f609df 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -606,6 +606,10 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
>  
>  	vcpu->arch.has_run_once = true;
>  
> +	mutex_lock(&kvm->lock);
> +	kvm->arch.ran_once = true;
> +	mutex_unlock(&kvm->lock);
> +
>  	kvm_arm_vcpu_init_debug(vcpu);
>  
>  	if (likely(irqchip_in_kernel(kvm))) {
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index dfc0430d6418..95100c541244 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -959,8 +959,9 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
>  		arm_pmu = entry->arm_pmu;
>  		if (arm_pmu->pmu.type == pmu_id) {
>  			/* Can't change PMU if filters are already in place */
> -			if (kvm->arch.arm_pmu != arm_pmu &&
> -			    kvm->arch.pmu_filter) {
> +			if ((kvm->arch.arm_pmu != arm_pmu &&
> +			     kvm->arch.pmu_filter) ||
> +			    kvm->arch.ran_once) {
>  				ret = -EBUSY;
>  				break;
>  			}
> @@ -1040,6 +1041,11 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
>  
>  		mutex_lock(&vcpu->kvm->lock);
>  
> +		if (vcpu->kvm->arch.ran_once) {
> +			mutex_unlock(&vcpu->kvm->lock);
> +			return -EBUSY;
> +		}
> +
>  		if (!vcpu->kvm->arch.pmu_filter) {
>  			vcpu->kvm->arch.pmu_filter = bitmap_alloc(nr_events, GFP_KERNEL_ACCOUNT);
>  			if (!vcpu->kvm->arch.pmu_filter) {
> 
> which should prevent both PMU or filters to be changed once a single
> vcpu as run.
> 
> Thoughts?

Haven't tested it either, but it looks good to me. If you agree, I can pick
the diff, turn it into a patch and send it for the next iteration of this
series as a fix for the PMU events filter, while keeping your authorship.

Thanks,
Alex
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
@ 2022-01-07 11:08           ` Alexandru Elisei
  0 siblings, 0 replies; 32+ messages in thread
From: Alexandru Elisei @ 2022-01-07 11:08 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm, tglx, mingo, peter.maydell

Hi Marc,

On Thu, Jan 06, 2022 at 06:16:04PM +0000, Marc Zyngier wrote:
> On Thu, 06 Jan 2022 11:54:11 +0000,
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > 
> > Hi Marc,
> > 
> > On Tue, Dec 14, 2021 at 12:28:15PM +0000, Marc Zyngier wrote:
> > > On Mon, 13 Dec 2021 15:23:08 +0000,
> > > Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > > >
> > > > [..]
> > > >
> > > > @@ -910,7 +922,16 @@ static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
> > > >  	init_irq_work(&vcpu->arch.pmu.overflow_work,
> > > >  		      kvm_pmu_perf_overflow_notify_vcpu);
> > > >  
> > > > -	vcpu->arch.pmu.created = true;
> > > > +	atomic_set(&vcpu->arch.pmu.created, 1);
> > > > +
> > > > +	kvm_for_each_vcpu(i, v, kvm) {
> > > > +		if (!atomic_read(&v->arch.pmu.created))
> > > > +			continue;
> > > > +
> > > > +		if (v->arch.pmu.arm_pmu != arm_pmu)
> > > > +			return -ENXIO;
> > > > +	}
> > > 
> > > If you did store the arm_pmu at the VM level, you wouldn't need this.
> > > You could detect the discrepancy in the set_pmu ioctl.
> > 
> > I chose to set at the VCPU level to be consistent with how KVM treats the
> > PMU interrupt ID when the interrupt is a PPI, where the interrupt ID must
> > be the same for all VCPUs and it is stored at the VCPU. However, looking at
> > the code again, it occurs to me that it is stored at the VCPU when it's a
> > PPI because it's simpler to do it that way, as the code remains the same
> > when the interrupt ID is a SPI, which must be *different* between VCPUs. So
> > in the end, having the PMU stored at the VM level does match how KVM uses
> > it, which looks to be better than my approach.
> > 
> > This is the change you proposed in your branch [1]:
> > 
> > +static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
> > +{
> > +       struct kvm *kvm = vcpu->kvm;
> > +       struct arm_pmu_entry *entry;
> > +       struct arm_pmu *arm_pmu;
> > +       int ret = -ENXIO;
> > +
> > +       mutex_lock(&kvm->lock);
> > +       mutex_lock(&arm_pmus_lock);
> > +
> > +       list_for_each_entry(entry, &arm_pmus, entry) {
> > +               arm_pmu = entry->arm_pmu;
> > +               if (arm_pmu->pmu.type == pmu_id) {
> > +                       /* Can't change PMU if filters are already in place */
> > +                       if (kvm->arch.arm_pmu != arm_pmu &&
> > +                           kvm->arch.pmu_filter) {
> > +                               ret = -EBUSY;
> > +                               break;
> > +                       }
> > +
> > +                       kvm->arch.arm_pmu = arm_pmu;
> > +                       ret = 0;
> > +                       break;
> > +               }
> > +       }
> > +
> > +       mutex_unlock(&arm_pmus_lock);
> > +       mutex_unlock(&kvm->lock);
> > +       return ret;
> > +}
> > 
> > As I understand the code, userspace only needs to call
> > KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) *once* (on one VCPU
> > fd) to set the PMU for all the VCPUs; subsequent calls (on the same VCPU or
> > on another VCPU) with a different PMU id will change the PMU for all VCPUs.
> > 
> > Two remarks:
> > 
> > 1. The documentation for the VCPU ioctls states this (from
> > Documentation/virt/kvm/devices/vcpu.rst):
> > 
> > "
> > ======================
> > Generic vcpu interface
> > ======================
> > 
> > The virtual cpu "device" also accepts the ioctls KVM_SET_DEVICE_ATTR,
> > KVM_GET_DEVICE_ATTR, and KVM_HAS_DEVICE_ATTR. The interface uses the same struct
> > kvm_device_attr as other devices, but **targets VCPU-wide settings and
> > controls**" (emphasis added).
> > 
> > But I guess having VCPU ioctls affect *only* the VCPU hasn't really been
> > true ever since PMU event filtering has been added. I'll send a patch to
> > change that part of the documentation for arm64.
> > 
> > I was thinking maybe a VM capability would be better suited for changing a
> > VM-wide setting, what do you think? I don't have a strong preference either
> > way.
> 
> I'm not sure it is worth the hassle of changing the API, as we'll have
> to keep the current one forever.

I was suggesting to use a capability for setting the PMU, it's too late to
change how the events filter is set.

> 
> > 
> > 2. What's to stop userspace to change the PMU after at least one VCPU has
> > run? That can be easily observed by the guest when reading PMCEIDx_EL0.
> 
> That's a good point. We need something here. It is a bit odd as to do
> that, you need to fully enable a PMU on one CPU, but not on the other,
> then run the first while changing stuff on the other. Something along
> those lines (untested):
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 4bf28905d438..4f53520e84fd 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -139,6 +139,7 @@ struct kvm_arch {
>  
>  	/* Memory Tagging Extension enabled for the guest */
>  	bool mte_enabled;
> +	bool ran_once;
>  };
>  
>  struct kvm_vcpu_fault_info {
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 83297fa97243..3045d7f609df 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -606,6 +606,10 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
>  
>  	vcpu->arch.has_run_once = true;
>  
> +	mutex_lock(&kvm->lock);
> +	kvm->arch.ran_once = true;
> +	mutex_unlock(&kvm->lock);
> +
>  	kvm_arm_vcpu_init_debug(vcpu);
>  
>  	if (likely(irqchip_in_kernel(kvm))) {
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index dfc0430d6418..95100c541244 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -959,8 +959,9 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
>  		arm_pmu = entry->arm_pmu;
>  		if (arm_pmu->pmu.type == pmu_id) {
>  			/* Can't change PMU if filters are already in place */
> -			if (kvm->arch.arm_pmu != arm_pmu &&
> -			    kvm->arch.pmu_filter) {
> +			if ((kvm->arch.arm_pmu != arm_pmu &&
> +			     kvm->arch.pmu_filter) ||
> +			    kvm->arch.ran_once) {
>  				ret = -EBUSY;
>  				break;
>  			}
> @@ -1040,6 +1041,11 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
>  
>  		mutex_lock(&vcpu->kvm->lock);
>  
> +		if (vcpu->kvm->arch.ran_once) {
> +			mutex_unlock(&vcpu->kvm->lock);
> +			return -EBUSY;
> +		}
> +
>  		if (!vcpu->kvm->arch.pmu_filter) {
>  			vcpu->kvm->arch.pmu_filter = bitmap_alloc(nr_events, GFP_KERNEL_ACCOUNT);
>  			if (!vcpu->kvm->arch.pmu_filter) {
> 
> which should prevent both PMU or filters to be changed once a single
> vcpu as run.
> 
> Thoughts?

Haven't tested it either, but it looks good to me. If you agree, I can pick
the diff, turn it into a patch and send it for the next iteration of this
series as a fix for the PMU events filter, while keeping your authorship.

Thanks,
Alex

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
  2022-01-07 11:08           ` Alexandru Elisei
@ 2022-01-07 14:35             ` Marc Zyngier
  -1 siblings, 0 replies; 32+ messages in thread
From: Marc Zyngier @ 2022-01-07 14:35 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: mingo, tglx, will, kvmarm, linux-arm-kernel

On Fri, 07 Jan 2022 11:08:05 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> Hi Marc,
> 
> On Thu, Jan 06, 2022 at 06:16:04PM +0000, Marc Zyngier wrote:
> > On Thu, 06 Jan 2022 11:54:11 +0000,
> > Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > > 
> > > 2. What's to stop userspace to change the PMU after at least one VCPU has
> > > run? That can be easily observed by the guest when reading PMCEIDx_EL0.
> > 
> > That's a good point. We need something here. It is a bit odd as to do
> > that, you need to fully enable a PMU on one CPU, but not on the other,
> > then run the first while changing stuff on the other. Something along
> > those lines (untested):
> > 
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 4bf28905d438..4f53520e84fd 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -139,6 +139,7 @@ struct kvm_arch {
> >  
> >  	/* Memory Tagging Extension enabled for the guest */
> >  	bool mte_enabled;
> > +	bool ran_once;
> >  };
> >  
> >  struct kvm_vcpu_fault_info {
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index 83297fa97243..3045d7f609df 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -606,6 +606,10 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
> >  
> >  	vcpu->arch.has_run_once = true;
> >  
> > +	mutex_lock(&kvm->lock);
> > +	kvm->arch.ran_once = true;
> > +	mutex_unlock(&kvm->lock);
> > +
> >  	kvm_arm_vcpu_init_debug(vcpu);
> >  
> >  	if (likely(irqchip_in_kernel(kvm))) {
> > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> > index dfc0430d6418..95100c541244 100644
> > --- a/arch/arm64/kvm/pmu-emul.c
> > +++ b/arch/arm64/kvm/pmu-emul.c
> > @@ -959,8 +959,9 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
> >  		arm_pmu = entry->arm_pmu;
> >  		if (arm_pmu->pmu.type == pmu_id) {
> >  			/* Can't change PMU if filters are already in place */
> > -			if (kvm->arch.arm_pmu != arm_pmu &&
> > -			    kvm->arch.pmu_filter) {
> > +			if ((kvm->arch.arm_pmu != arm_pmu &&
> > +			     kvm->arch.pmu_filter) ||
> > +			    kvm->arch.ran_once) {
> >  				ret = -EBUSY;
> >  				break;
> >  			}
> > @@ -1040,6 +1041,11 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> >  
> >  		mutex_lock(&vcpu->kvm->lock);
> >  
> > +		if (vcpu->kvm->arch.ran_once) {
> > +			mutex_unlock(&vcpu->kvm->lock);
> > +			return -EBUSY;
> > +		}
> > +
> >  		if (!vcpu->kvm->arch.pmu_filter) {
> >  			vcpu->kvm->arch.pmu_filter = bitmap_alloc(nr_events, GFP_KERNEL_ACCOUNT);
> >  			if (!vcpu->kvm->arch.pmu_filter) {
> > 
> > which should prevent both PMU or filters to be changed once a single
> > vcpu as run.
> > 
> > Thoughts?
> 
> Haven't tested it either, but it looks good to me. If you agree, I can pick
> the diff, turn it into a patch and send it for the next iteration of this
> series as a fix for the PMU events filter, while keeping your authorship.

Of course, please help yourself! :-)

	M.

-- 
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
@ 2022-01-07 14:35             ` Marc Zyngier
  0 siblings, 0 replies; 32+ messages in thread
From: Marc Zyngier @ 2022-01-07 14:35 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm, tglx, mingo, peter.maydell

On Fri, 07 Jan 2022 11:08:05 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> Hi Marc,
> 
> On Thu, Jan 06, 2022 at 06:16:04PM +0000, Marc Zyngier wrote:
> > On Thu, 06 Jan 2022 11:54:11 +0000,
> > Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > > 
> > > 2. What's to stop userspace to change the PMU after at least one VCPU has
> > > run? That can be easily observed by the guest when reading PMCEIDx_EL0.
> > 
> > That's a good point. We need something here. It is a bit odd as to do
> > that, you need to fully enable a PMU on one CPU, but not on the other,
> > then run the first while changing stuff on the other. Something along
> > those lines (untested):
> > 
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 4bf28905d438..4f53520e84fd 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -139,6 +139,7 @@ struct kvm_arch {
> >  
> >  	/* Memory Tagging Extension enabled for the guest */
> >  	bool mte_enabled;
> > +	bool ran_once;
> >  };
> >  
> >  struct kvm_vcpu_fault_info {
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index 83297fa97243..3045d7f609df 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -606,6 +606,10 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
> >  
> >  	vcpu->arch.has_run_once = true;
> >  
> > +	mutex_lock(&kvm->lock);
> > +	kvm->arch.ran_once = true;
> > +	mutex_unlock(&kvm->lock);
> > +
> >  	kvm_arm_vcpu_init_debug(vcpu);
> >  
> >  	if (likely(irqchip_in_kernel(kvm))) {
> > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> > index dfc0430d6418..95100c541244 100644
> > --- a/arch/arm64/kvm/pmu-emul.c
> > +++ b/arch/arm64/kvm/pmu-emul.c
> > @@ -959,8 +959,9 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
> >  		arm_pmu = entry->arm_pmu;
> >  		if (arm_pmu->pmu.type == pmu_id) {
> >  			/* Can't change PMU if filters are already in place */
> > -			if (kvm->arch.arm_pmu != arm_pmu &&
> > -			    kvm->arch.pmu_filter) {
> > +			if ((kvm->arch.arm_pmu != arm_pmu &&
> > +			     kvm->arch.pmu_filter) ||
> > +			    kvm->arch.ran_once) {
> >  				ret = -EBUSY;
> >  				break;
> >  			}
> > @@ -1040,6 +1041,11 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> >  
> >  		mutex_lock(&vcpu->kvm->lock);
> >  
> > +		if (vcpu->kvm->arch.ran_once) {
> > +			mutex_unlock(&vcpu->kvm->lock);
> > +			return -EBUSY;
> > +		}
> > +
> >  		if (!vcpu->kvm->arch.pmu_filter) {
> >  			vcpu->kvm->arch.pmu_filter = bitmap_alloc(nr_events, GFP_KERNEL_ACCOUNT);
> >  			if (!vcpu->kvm->arch.pmu_filter) {
> > 
> > which should prevent both PMU or filters to be changed once a single
> > vcpu as run.
> > 
> > Thoughts?
> 
> Haven't tested it either, but it looks good to me. If you agree, I can pick
> the diff, turn it into a patch and send it for the next iteration of this
> series as a fix for the PMU events filter, while keeping your authorship.

Of course, please help yourself! :-)

	M.

-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2022-01-07 14:38 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-13 15:23 [PATCH v3 0/4] KVM: arm64: Improve PMU support on heterogeneous systems Alexandru Elisei
2021-12-13 15:23 ` Alexandru Elisei
2021-12-13 15:23 ` [PATCH v3 1/4] perf: Fix wrong name in comment for struct perf_cpu_context Alexandru Elisei
2021-12-13 15:23   ` Alexandru Elisei
2021-12-13 15:23 ` [PATCH v3 2/4] KVM: arm64: Keep a list of probed PMUs Alexandru Elisei
2021-12-13 15:23   ` Alexandru Elisei
2021-12-14  7:23   ` Reiji Watanabe
2021-12-14  7:23     ` Reiji Watanabe
2021-12-14 12:30   ` Marc Zyngier
2021-12-14 12:30     ` Marc Zyngier
2022-01-06 11:46     ` Alexandru Elisei
2022-01-06 11:46       ` Alexandru Elisei
2021-12-13 15:23 ` [PATCH v3 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute Alexandru Elisei
2021-12-13 15:23   ` Alexandru Elisei
2021-12-14 12:28   ` Marc Zyngier
2021-12-14 12:28     ` Marc Zyngier
2022-01-06 11:54     ` Alexandru Elisei
2022-01-06 11:54       ` Alexandru Elisei
2022-01-06 18:16       ` Marc Zyngier
2022-01-06 18:16         ` Marc Zyngier
2022-01-07 11:08         ` Alexandru Elisei
2022-01-07 11:08           ` Alexandru Elisei
2022-01-07 14:35           ` Marc Zyngier
2022-01-07 14:35             ` Marc Zyngier
2021-12-13 15:23 ` [PATCH v3 4/4] KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU Alexandru Elisei
2021-12-13 15:23   ` Alexandru Elisei
2021-12-30 20:01 ` [PATCH v3 0/4] KVM: arm64: Improve PMU support on heterogeneous systems Marc Zyngier
2021-12-30 20:01   ` Marc Zyngier
2022-01-06 12:07   ` Alexandru Elisei
2022-01-06 12:07     ` Alexandru Elisei
2022-01-06 18:21     ` Marc Zyngier
2022-01-06 18:21       ` Marc Zyngier

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.