* [PATCH v2 0/4] KVM: arm64: Improve PMU support on heterogeneous systems
@ 2021-12-06 17:02 ` Alexandru Elisei
  0 siblings, 0 replies; 52+ messages in thread
From: Alexandru Elisei @ 2021-12-06 17:02 UTC (permalink / raw)
  To: maz, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm
  Cc: tglx, mingo

(CC'ing Peter Maydell in case this might be of interest to qemu)

The series can be found on a branch at [1], and the kvmtool support at [2].
The kvmtool patches are also on the mailing list [3] and haven't changed
since v1.

A detailed explanation of the issue and the symptoms that the patches
attempt to correct can be found in the cover letter for v1 [4].

A brief summary of the problem is that on heterogeneous systems KVM will
always use the same PMU for creating the VCPU events for *all* VCPUs
regardless of the physical CPU on which the VCPU is running, leading to
events suddenly stopping and resuming in the guest as the VCPU thread gets
migrated across different CPUs.

This series proposes to fix this behaviour by allowing the user to specify
which physical PMU is used when creating the VCPU events needed for guest
PMU emulation. When the PMU is set, KVM will refuse to run the VCPU on a
physical CPU which is not part of the supported CPUs for the specified PMU.

The default behaviour stays the same - without userspace setting the PMU,
events will stop counting if the VCPU is scheduled on the wrong CPU.
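
For illustration only (this is not kvmtool code), setting the PMU from
userspace could look roughly like the sketch below, assuming the new
KVM_ARM_VCPU_PMU_V3_SET_PMU attribute added by this series; the helper name
is hypothetical and error handling is omitted:

#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical sketch: assign a specific CPU PMU to a VCPU before
 * KVM_ARM_VCPU_PMU_V3_INIT. pmu_id is the value of the perf "type"
 * file for the desired PMU instance in sysfs.
 */
static int vcpu_set_pmu(int vcpu_fd, int pmu_id)
{
	struct kvm_device_attr attr = {
		.group	= KVM_ARM_VCPU_PMU_V3_CTRL,
		.attr	= KVM_ARM_VCPU_PMU_V3_SET_PMU,
		.addr	= (__u64)(unsigned long)&pmu_id,
	};

	return ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);
}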

Changes since v1:

- Rebased on top of v5.16-rc4

- Implemented review comments: protect iterating through the list of PMUs
  with a mutex, documentation changes, initialize vcpu->arch.supported_cpus
  to cpu_possible_mask, change vcpu->arch.cpu_not_supported to a VCPU
  flag, and set the exit reason to KVM_EXIT_FAIL_ENTRY and populate
  fail_entry when the VCPU is run on a CPU not in the PMU's supported
  cpumask. Many thanks for the review!

[1] https://gitlab.arm.com/linux-arm/linux-ae/-/tree/pmu-big-little-fix-v2
[2] https://gitlab.arm.com/linux-arm/kvmtool-ae/-/tree/pmu-big-little-fix-v1
[3] https://www.spinics.net/lists/arm-kernel/msg933584.html
[4] https://www.spinics.net/lists/arm-kernel/msg933579.html

Alexandru Elisei (4):
  perf: Fix wrong name in comment for struct perf_cpu_context
  KVM: arm64: Keep a list of probed PMUs
  KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
  KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical
    CPU

 Documentation/virt/kvm/devices/vcpu.rst | 29 +++++++++++
 arch/arm64/include/asm/kvm_host.h       | 12 +++++
 arch/arm64/include/uapi/asm/kvm.h       |  4 ++
 arch/arm64/kvm/arm.c                    | 19 ++++++++
 arch/arm64/kvm/pmu-emul.c               | 64 +++++++++++++++++++++++--
 include/kvm/arm_pmu.h                   |  6 +++
 include/linux/perf_event.h              |  2 +-
 tools/arch/arm64/include/uapi/asm/kvm.h |  1 +
 8 files changed, 132 insertions(+), 5 deletions(-)

-- 
2.34.1

* [PATCH v2 1/4] perf: Fix wrong name in comment for struct perf_cpu_context
  2021-12-06 17:02 ` Alexandru Elisei
@ 2021-12-06 17:02   ` Alexandru Elisei
  -1 siblings, 0 replies; 52+ messages in thread
From: Alexandru Elisei @ 2021-12-06 17:02 UTC (permalink / raw)
  To: maz, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm
  Cc: tglx, mingo

Commit 0793a61d4df8 ("performance counters: core code") added the perf
subsystem (then called Performance Counters) to Linux, creating the struct
perf_cpu_context. The comment for the struct referred to it as a "struct
perf_counter_cpu_context".

Commit cdd6c482c9ff ("perf: Do the big rename: Performance Counters ->
Performance Events") changed the comment to refer to a "struct
perf_event_cpu_context", which was still the wrong name for the struct.

Change the comment to say "struct perf_cpu_context".

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 include/linux/perf_event.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 0dcfd265beed..14132570ea5d 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -862,7 +862,7 @@ struct perf_event_context {
 #define PERF_NR_CONTEXTS	4
 
 /**
- * struct perf_event_cpu_context - per cpu event context structure
+ * struct perf_cpu_context - per cpu event context structure
  */
 struct perf_cpu_context {
 	struct perf_event_context	ctx;
-- 
2.34.1

* [PATCH v2 2/4] KVM: arm64: Keep a list of probed PMUs
  2021-12-06 17:02 ` Alexandru Elisei
@ 2021-12-06 17:02   ` Alexandru Elisei
  -1 siblings, 0 replies; 52+ messages in thread
From: Alexandru Elisei @ 2021-12-06 17:02 UTC (permalink / raw)
  To: maz, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm
  Cc: tglx, mingo

The ARM PMU driver calls kvm_host_pmu_init() after probing to tell KVM that
a hardware PMU is available for guest emulation. Heterogeneous systems can
have more than one PMU present, and the callback gets called multiple
times, once for each of them. Keep track of all the PMUs available to KVM,
as they're going to be needed later.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arch/arm64/kvm/pmu-emul.c | 26 ++++++++++++++++++++++++--
 include/kvm/arm_pmu.h     |  5 +++++
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index a5e4bbf5e68f..eaaad4c06561 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -7,13 +7,18 @@
 #include <linux/cpu.h>
 #include <linux/kvm.h>
 #include <linux/kvm_host.h>
+#include <linux/list.h>
 #include <linux/perf_event.h>
 #include <linux/perf/arm_pmu.h>
+#include <linux/rwsem.h>
 #include <linux/uaccess.h>
 #include <asm/kvm_emulate.h>
 #include <kvm/arm_pmu.h>
 #include <kvm/arm_vgic.h>
 
+static LIST_HEAD(arm_pmus);
+static DEFINE_MUTEX(arm_pmus_lock);
+
 static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx);
 static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64 select_idx);
 static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, struct kvm_pmc *pmc);
@@ -742,9 +747,26 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
 
 void kvm_host_pmu_init(struct arm_pmu *pmu)
 {
-	if (pmu->pmuver != 0 && pmu->pmuver != ID_AA64DFR0_PMUVER_IMP_DEF &&
-	    !kvm_arm_support_pmu_v3() && !is_protected_kvm_enabled())
+	struct arm_pmu_entry *entry;
+
+	if (pmu->pmuver == 0 || pmu->pmuver == ID_AA64DFR0_PMUVER_IMP_DEF ||
+	    is_protected_kvm_enabled())
+		return;
+
+	mutex_lock(&arm_pmus_lock);
+
+	entry = kmalloc(sizeof(*entry), GFP_KERNEL);
+	if (!entry)
+		goto out_unlock;
+
+	if (list_empty(&arm_pmus))
 		static_branch_enable(&kvm_arm_pmu_available);
+
+	entry->arm_pmu = pmu;
+	list_add_tail(&entry->entry, &arm_pmus);
+
+out_unlock:
+	mutex_unlock(&arm_pmus_lock);
 }
 
 static int kvm_pmu_probe_pmuver(void)
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index 90f21898aad8..e249c5f172aa 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -36,6 +36,11 @@ struct kvm_pmu {
 	struct irq_work overflow_work;
 };
 
+struct arm_pmu_entry {
+	struct list_head entry;
+	struct arm_pmu *arm_pmu;
+};
+
 #define kvm_arm_pmu_irq_initialized(v)	((v)->arch.pmu.irq_num >= VGIC_NR_SGIS)
 u64 kvm_pmu_get_counter_value(struct kvm_vcpu *vcpu, u64 select_idx);
 void kvm_pmu_set_counter_value(struct kvm_vcpu *vcpu, u64 select_idx, u64 val);
-- 
2.34.1

* [PATCH v2 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
  2021-12-06 17:02 ` Alexandru Elisei
@ 2021-12-06 17:02   ` Alexandru Elisei
  -1 siblings, 0 replies; 52+ messages in thread
From: Alexandru Elisei @ 2021-12-06 17:02 UTC (permalink / raw)
  To: maz, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm
  Cc: tglx, mingo

When KVM creates an event and there is more than one PMU present on the
system, perf_init_event() will go through the list of available PMUs and
will choose the first one that can create the event. The order of the PMUs
in the PMU list depends on the probe order, which can change under various
circumstances, for example if the order of the PMU nodes changes in the DTB
or if asynchronous driver probing is enabled on the kernel command line
(with the driver_async_probe=armv8-pmu option).

Another consequence of this approach is that, on heterogeneous systems,
all virtual machines that KVM creates will use the same PMU. This might
cause unexpected behaviour for userspace: when a VCPU is executing on
the physical CPU that uses this PMU, PMU events in the guest work
correctly; but when the same VCPU executes on another CPU, PMU events in
the guest will suddenly stop counting.

Fortunately, perf core allows the user to specify on which PMU to create an
event by using the perf_event_attr->type field, which is used by
perf_init_event() as an index into the radix tree of available PMUs.

Add the KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU
attribute to allow userspace to specify the arm_pmu that KVM will use when
creating events for that VCPU. KVM will make no attempt to run the VCPU on
the physical CPUs that share this PMU, leaving it up to userspace to
manage the VCPU threads' affinity accordingly.
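
As a rough sketch (not part of this patch), userspace could discover the PMU
identifier from sysfs as shown below; the PMU instance name used here (for
example "armv8_pmuv3_0") is platform dependent and only an assumption:

#include <stdio.h>

/* Hypothetical helper: read the perf "type" id of a CPU PMU instance. */
static int read_pmu_type(const char *pmu_name)
{
	char path[128];
	int type = -1;
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/bus/event_source/devices/%s/type", pmu_name);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%d", &type) != 1)
		type = -1;
	fclose(f);
	return type;
}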

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 Documentation/virt/kvm/devices/vcpu.rst | 25 +++++++++++++++++
 arch/arm64/include/uapi/asm/kvm.h       |  1 +
 arch/arm64/kvm/pmu-emul.c               | 37 +++++++++++++++++++++++--
 include/kvm/arm_pmu.h                   |  1 +
 tools/arch/arm64/include/uapi/asm/kvm.h |  1 +
 5 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
index 60a29972d3f1..c82be5cbc268 100644
--- a/Documentation/virt/kvm/devices/vcpu.rst
+++ b/Documentation/virt/kvm/devices/vcpu.rst
@@ -104,6 +104,31 @@ hardware event. Filtering event 0x1E (CHAIN) has no effect either, as it
 isn't strictly speaking an event. Filtering the cycle counter is possible
 using event 0x11 (CPU_CYCLES).
 
+1.4 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_SET_PMU
+------------------------------------------
+
+:Parameters: in kvm_device_attr.addr the address of an int representing the PMU
+             identifier.
+
+:Returns:
+
+	 =======  ===============================================
+	 -EBUSY   PMUv3 already initialized
+	 -EFAULT  Error accessing the PMU identifier
+	 -ENXIO   PMU not found
+	 -ENODEV  PMUv3 not supported or GIC not initialized
+	 -ENOMEM  Could not allocate memory
+	 =======  ===============================================
+
+Request that the VCPU uses the specified hardware PMU when creating guest events
+for the purpose of PMU emulation. The PMU identifier can be read from the "type"
+file for the desired PMU instance under /sys/devices (or, equivalently,
+/sys/bus/event_source). This attribute is particularly useful on heterogeneous
+systems where there are at least two CPU PMUs on the system.
+
+Note that KVM will not make any attempts to run the VCPU on the physical CPUs
+associated with the PMU specified by this attribute. This is entirely left to
+userspace.
 
 2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
 =================================
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index b3edde68bc3e..1d0a0a2a9711 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -362,6 +362,7 @@ struct kvm_arm_copy_mte_tags {
 #define   KVM_ARM_VCPU_PMU_V3_IRQ	0
 #define   KVM_ARM_VCPU_PMU_V3_INIT	1
 #define   KVM_ARM_VCPU_PMU_V3_FILTER	2
+#define   KVM_ARM_VCPU_PMU_V3_SET_PMU	3
 #define KVM_ARM_VCPU_TIMER_CTRL		1
 #define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER		0
 #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER		1
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index eaaad4c06561..618138c5f792 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -603,6 +603,7 @@ static bool kvm_pmu_counter_is_enabled(struct kvm_vcpu *vcpu, u64 select_idx)
 static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
 {
 	struct kvm_pmu *pmu = &vcpu->arch.pmu;
+	struct arm_pmu *arm_pmu = pmu->arm_pmu;
 	struct kvm_pmc *pmc;
 	struct perf_event *event;
 	struct perf_event_attr attr;
@@ -638,8 +639,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
 		return;
 
 	memset(&attr, 0, sizeof(struct perf_event_attr));
-	attr.type = PERF_TYPE_RAW;
-	attr.size = sizeof(attr);
+	attr.type = arm_pmu ? arm_pmu->pmu.type : PERF_TYPE_RAW;
 	attr.pinned = 1;
 	attr.disabled = !kvm_pmu_counter_is_enabled(vcpu, pmc->idx);
 	attr.exclude_user = data & ARMV8_PMU_EXCLUDE_EL0 ? 1 : 0;
@@ -941,6 +941,29 @@ static bool pmu_irq_is_valid(struct kvm *kvm, int irq)
 	return true;
 }
 
+static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
+{
+	struct kvm_pmu *kvm_pmu = &vcpu->arch.pmu;
+	struct arm_pmu_entry *entry;
+	struct arm_pmu *arm_pmu;
+	int ret = -ENXIO;
+
+	mutex_lock(&arm_pmus_lock);
+
+	list_for_each_entry(entry, &arm_pmus, entry) {
+		arm_pmu = entry->arm_pmu;
+		if (arm_pmu->pmu.type == pmu_id) {
+			kvm_pmu->arm_pmu = arm_pmu;
+			ret = 0;
+			goto out_unlock;
+		}
+	}
+
+out_unlock:
+	mutex_unlock(&arm_pmus_lock);
+	return ret;
+}
+
 int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
 {
 	if (!kvm_vcpu_has_pmu(vcpu))
@@ -1027,6 +1050,15 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
 
 		return 0;
 	}
+	case KVM_ARM_VCPU_PMU_V3_SET_PMU: {
+		int __user *uaddr = (int __user *)(long)attr->addr;
+		int pmu_id;
+
+		if (get_user(pmu_id, uaddr))
+			return -EFAULT;
+
+		return kvm_arm_pmu_v3_set_pmu(vcpu, pmu_id);
+	}
 	case KVM_ARM_VCPU_PMU_V3_INIT:
 		return kvm_arm_pmu_v3_init(vcpu);
 	}
@@ -1064,6 +1096,7 @@ int kvm_arm_pmu_v3_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
 	case KVM_ARM_VCPU_PMU_V3_IRQ:
 	case KVM_ARM_VCPU_PMU_V3_INIT:
 	case KVM_ARM_VCPU_PMU_V3_FILTER:
+	case KVM_ARM_VCPU_PMU_V3_SET_PMU:
 		if (kvm_vcpu_has_pmu(vcpu))
 			return 0;
 	}
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index e249c5f172aa..ab3046a8f9bb 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -34,6 +34,7 @@ struct kvm_pmu {
 	bool created;
 	bool irq_level;
 	struct irq_work overflow_work;
+	struct arm_pmu *arm_pmu;
 };
 
 struct arm_pmu_entry {
diff --git a/tools/arch/arm64/include/uapi/asm/kvm.h b/tools/arch/arm64/include/uapi/asm/kvm.h
index b3edde68bc3e..1d0a0a2a9711 100644
--- a/tools/arch/arm64/include/uapi/asm/kvm.h
+++ b/tools/arch/arm64/include/uapi/asm/kvm.h
@@ -362,6 +362,7 @@ struct kvm_arm_copy_mte_tags {
 #define   KVM_ARM_VCPU_PMU_V3_IRQ	0
 #define   KVM_ARM_VCPU_PMU_V3_INIT	1
 #define   KVM_ARM_VCPU_PMU_V3_FILTER	2
+#define   KVM_ARM_VCPU_PMU_V3_SET_PMU	3
 #define KVM_ARM_VCPU_TIMER_CTRL		1
 #define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER		0
 #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER		1
-- 
2.34.1

* [PATCH v2 4/4] KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU
  2021-12-06 17:02 ` Alexandru Elisei
@ 2021-12-06 17:02   ` Alexandru Elisei
  -1 siblings, 0 replies; 52+ messages in thread
From: Alexandru Elisei @ 2021-12-06 17:02 UTC (permalink / raw)
  To: maz, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm
  Cc: tglx, mingo

Userspace can assign a PMU to a VCPU with the KVM_ARM_VCPU_PMU_V3_SET_PMU
device ioctl. If the VCPU is scheduled on a physical CPU which has a
different PMU, the perf events needed to emulate a guest PMU won't be
scheduled in and the guest performance counters will stop counting. Treat
this as a userspace error and refuse to run the VCPU in this situation.

The VCPU is flagged as being scheduled on the wrong CPU in vcpu_load(), but
the flag is cleared when KVM_RUN enters the non-preemptible section instead
of in vcpu_put(); this is done on purpose so the error condition is
communicated to userspace as soon as possible, otherwise a vcpu_load() on
the wrong CPU followed by a vcpu_put() would clear the flag.
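
For illustration only (not part of this patch), userspace could detect the
new failure mode in its run loop roughly as follows, using the constants
introduced here; the handle_run() helper is hypothetical and 'run' is the
mmap'ed struct kvm_run for the VCPU:

#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical run-loop check; error handling kept to a minimum. */
static int handle_run(int vcpu_fd, struct kvm_run *run)
{
	if (ioctl(vcpu_fd, KVM_RUN, 0) < 0)
		return -1;

	if (run->exit_reason == KVM_EXIT_FAIL_ENTRY &&
	    run->fail_entry.hardware_entry_failure_reason ==
				KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED) {
		fprintf(stderr, "VCPU ran on unsupported CPU %u\n",
			run->fail_entry.cpu);
		return -1;
	}

	return 0;
}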

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
I agonized for hours about the best name for the VCPU flag and the
accessors. If someone has a better idea, please tell me and I'll change
them.

 Documentation/virt/kvm/devices/vcpu.rst |  6 +++++-
 arch/arm64/include/asm/kvm_host.h       | 12 ++++++++++++
 arch/arm64/include/uapi/asm/kvm.h       |  3 +++
 arch/arm64/kvm/arm.c                    | 19 +++++++++++++++++++
 arch/arm64/kvm/pmu-emul.c               |  1 +
 5 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
index c82be5cbc268..9ae47b7c3652 100644
--- a/Documentation/virt/kvm/devices/vcpu.rst
+++ b/Documentation/virt/kvm/devices/vcpu.rst
@@ -128,7 +128,11 @@ systems where there are at least two CPU PMUs on the system.
 
 Note that KVM will not make any attempts to run the VCPU on the physical CPUs
 associated with the PMU specified by this attribute. This is entirely left to
-userspace.
+userspace. However, attempting to run the VCPU on a physical CPU not supported
+by the PMU will fail: KVM_RUN will return with
+exit_reason = KVM_EXIT_FAIL_ENTRY and populate the fail_entry struct by setting
+the hardware_entry_failure_reason field to KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED
+and the cpu field to the processor id.
 
 2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
 =================================
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 2a5f7f38006f..0c453f2e48b6 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -385,6 +385,8 @@ struct kvm_vcpu_arch {
 		u64 last_steal;
 		gpa_t base;
 	} steal;
+
+	cpumask_var_t supported_cpus;
 };
 
 /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
@@ -420,6 +422,7 @@ struct kvm_vcpu_arch {
 #define KVM_ARM64_EXCEPT_MASK		(7 << 9) /* Target EL/MODE */
 #define KVM_ARM64_DEBUG_STATE_SAVE_SPE	(1 << 12) /* Save SPE context if active  */
 #define KVM_ARM64_DEBUG_STATE_SAVE_TRBE	(1 << 13) /* Save TRBE context if active  */
+#define KVM_ARM64_ON_UNSUPPORTED_CPU	(1 << 14) /* Physical CPU not in supported_cpus */
 
 #define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \
 				 KVM_GUESTDBG_USE_SW_BP | \
@@ -460,6 +463,15 @@ struct kvm_vcpu_arch {
 #define vcpu_has_ptrauth(vcpu)		false
 #endif
 
+#define vcpu_on_unsupported_cpu(vcpu)					\
+	((vcpu)->arch.flags & KVM_ARM64_ON_UNSUPPORTED_CPU)
+
+#define vcpu_set_on_unsupported_cpu(vcpu)				\
+	((vcpu)->arch.flags |= KVM_ARM64_ON_UNSUPPORTED_CPU)
+
+#define vcpu_clear_on_unsupported_cpu(vcpu)				\
+	((vcpu)->arch.flags &= ~KVM_ARM64_ON_UNSUPPORTED_CPU)
+
 #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.regs)
 
 /*
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 1d0a0a2a9711..d49f714f48e6 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -414,6 +414,9 @@ struct kvm_arm_copy_mte_tags {
 #define KVM_PSCI_RET_INVAL		PSCI_RET_INVALID_PARAMS
 #define KVM_PSCI_RET_DENIED		PSCI_RET_DENIED
 
+/* run->fail_entry.hardware_entry_failure_reason codes. */
+#define KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED	(1ULL << 0)
+
 #endif
 
 #endif /* __ARM_KVM_H__ */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index e4727dc771bf..1124c3efdd94 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -327,6 +327,10 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 
 	vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
 
+	if (!zalloc_cpumask_var(&vcpu->arch.supported_cpus, GFP_KERNEL))
+		return -ENOMEM;
+	cpumask_copy(vcpu->arch.supported_cpus, cpu_possible_mask);
+
 	/* Set up the timer */
 	kvm_timer_vcpu_init(vcpu);
 
@@ -354,6 +358,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
 	if (vcpu->arch.has_run_once && unlikely(!irqchip_in_kernel(vcpu->kvm)))
 		static_branch_dec(&userspace_irqchip_in_use);
 
+	free_cpumask_var(vcpu->arch.supported_cpus);
 	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
 	kvm_timer_vcpu_terminate(vcpu);
 	kvm_pmu_vcpu_destroy(vcpu);
@@ -432,6 +437,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	if (vcpu_has_ptrauth(vcpu))
 		vcpu_ptrauth_disable(vcpu);
 	kvm_arch_vcpu_load_debug_state_flags(vcpu);
+
+	if (!cpumask_test_cpu(smp_processor_id(), vcpu->arch.supported_cpus))
+		vcpu_set_on_unsupported_cpu(vcpu);
 }
 
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
@@ -822,6 +830,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 		 */
 		preempt_disable();
 
+		if (unlikely(vcpu_on_unsupported_cpu(vcpu))) {
+			vcpu_clear_on_unsupported_cpu(vcpu);
+			run->exit_reason = KVM_EXIT_FAIL_ENTRY;
+			run->fail_entry.hardware_entry_failure_reason
+				= KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED;
+			run->fail_entry.cpu = smp_processor_id();
+			ret = 0;
+			preempt_enable();
+			break;
+		}
+
 		kvm_pmu_flush_hwstate(vcpu);
 
 		local_irq_disable();
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 618138c5f792..471fe0f734ed 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -954,6 +954,7 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
 		arm_pmu = entry->arm_pmu;
 		if (arm_pmu->pmu.type == pmu_id) {
 			kvm_pmu->arm_pmu = arm_pmu;
+			cpumask_copy(vcpu->arch.supported_cpus, &arm_pmu->supported_cpus);
 			ret = 0;
 			goto out_unlock;
 		}
-- 
2.34.1

* Re: [PATCH v2 4/4] KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU
  2021-12-06 17:02   ` Alexandru Elisei
@ 2021-12-07 14:17     ` Alexandru Elisei
  -1 siblings, 0 replies; 52+ messages in thread
From: Alexandru Elisei @ 2021-12-07 14:17 UTC (permalink / raw)
  To: maz, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm
  Cc: tglx, mingo

Hi,

On Mon, Dec 06, 2021 at 05:02:23PM +0000, Alexandru Elisei wrote:
> Userspace can assign a PMU to a VCPU with the KVM_ARM_VCPU_PMU_V3_SET_PMU
> device ioctl. If the VCPU is scheduled on a physical CPU which has a
> different PMU, the perf events needed to emulate a guest PMU won't be
> scheduled in and the guest performance counters will stop counting. Treat
> it as an userspace error and refuse to run the VCPU in this situation.
> 
> The VCPU is flagged as being scheduled on the wrong CPU in vcpu_load(), but
> the flag is cleared when the KVM_RUN enters the non-preemptible section
> instead of in vcpu_put(); this has been done on purpose so the error
> condition is communicated as soon as possible to userspace, otherwise
> vcpu_load() on the wrong CPU followed by a vcpu_put() would clear the flag.
> 
> Suggested-by: Marc Zyngier <maz@kernel.org>
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
> I agonized for hours about the best name for the VCPU flag and the
> accessors. If someone has a better idea, please tell me and I'll change
> them.
> 
>  Documentation/virt/kvm/devices/vcpu.rst |  6 +++++-
>  arch/arm64/include/asm/kvm_host.h       | 12 ++++++++++++
>  arch/arm64/include/uapi/asm/kvm.h       |  3 +++
>  arch/arm64/kvm/arm.c                    | 19 +++++++++++++++++++
>  arch/arm64/kvm/pmu-emul.c               |  1 +
>  5 files changed, 40 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
> index c82be5cbc268..9ae47b7c3652 100644
> --- a/Documentation/virt/kvm/devices/vcpu.rst
> +++ b/Documentation/virt/kvm/devices/vcpu.rst
> @@ -128,7 +128,11 @@ systems where there are at least two CPU PMUs on the system.
>  
>  Note that KVM will not make any attempts to run the VCPU on the physical CPUs
>  associated with the PMU specified by this attribute. This is entirely left to
> -userspace.
> +userspace. However, attempting to run the VCPU on a physical CPU not supported
> +by the PMU will fail and KVM_RUN will return with
> +exit_reason = KVM_EXIT_FAIL_ENTRY and populate the fail_entry struct by setting
> +hardare_entry_failure_reason field to KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED and
> +the cpu field to the processor id.
>  
>  2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
>  =================================
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 2a5f7f38006f..0c453f2e48b6 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -385,6 +385,8 @@ struct kvm_vcpu_arch {
>  		u64 last_steal;
>  		gpa_t base;
>  	} steal;
> +
> +	cpumask_var_t supported_cpus;
>  };
>  
>  /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
> @@ -420,6 +422,7 @@ struct kvm_vcpu_arch {
>  #define KVM_ARM64_EXCEPT_MASK		(7 << 9) /* Target EL/MODE */
>  #define KVM_ARM64_DEBUG_STATE_SAVE_SPE	(1 << 12) /* Save SPE context if active  */
>  #define KVM_ARM64_DEBUG_STATE_SAVE_TRBE	(1 << 13) /* Save TRBE context if active  */
> +#define KVM_ARM64_ON_UNSUPPORTED_CPU	(1 << 14) /* Physical CPU not in supported_cpus */
>  
>  #define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \
>  				 KVM_GUESTDBG_USE_SW_BP | \
> @@ -460,6 +463,15 @@ struct kvm_vcpu_arch {
>  #define vcpu_has_ptrauth(vcpu)		false
>  #endif
>  
> +#define vcpu_on_unsupported_cpu(vcpu)					\
> +	((vcpu)->arch.flags & KVM_ARM64_ON_UNSUPPORTED_CPU)
> +
> +#define vcpu_set_on_unsupported_cpu(vcpu)				\
> +	((vcpu)->arch.flags |= KVM_ARM64_ON_UNSUPPORTED_CPU)
> +
> +#define vcpu_clear_on_unsupported_cpu(vcpu)				\
> +	((vcpu)->arch.flags &= ~KVM_ARM64_ON_UNSUPPORTED_CPU)
> +
>  #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.regs)
>  
>  /*
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 1d0a0a2a9711..d49f714f48e6 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -414,6 +414,9 @@ struct kvm_arm_copy_mte_tags {
>  #define KVM_PSCI_RET_INVAL		PSCI_RET_INVALID_PARAMS
>  #define KVM_PSCI_RET_DENIED		PSCI_RET_DENIED
>  
> +/* run->fail_entry.hardware_entry_failure_reason codes. */
> +#define KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED	(1ULL << 0)
> +
>  #endif
>  
>  #endif /* __ARM_KVM_H__ */
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index e4727dc771bf..1124c3efdd94 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -327,6 +327,10 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>  
>  	vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
>  
> +	if (!zalloc_cpumask_var(&vcpu->arch.supported_cpus, GFP_KERNEL))
> +		return -ENOMEM;
> +	cpumask_copy(vcpu->arch.supported_cpus, cpu_possible_mask);
> +
>  	/* Set up the timer */
>  	kvm_timer_vcpu_init(vcpu);
>  
> @@ -354,6 +358,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
>  	if (vcpu->arch.has_run_once && unlikely(!irqchip_in_kernel(vcpu->kvm)))
>  		static_branch_dec(&userspace_irqchip_in_use);
>  
> +	free_cpumask_var(vcpu->arch.supported_cpus);
>  	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
>  	kvm_timer_vcpu_terminate(vcpu);
>  	kvm_pmu_vcpu_destroy(vcpu);
> @@ -432,6 +437,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  	if (vcpu_has_ptrauth(vcpu))
>  		vcpu_ptrauth_disable(vcpu);
>  	kvm_arch_vcpu_load_debug_state_flags(vcpu);
> +
> +	if (!cpumask_test_cpu(smp_processor_id(), vcpu->arch.supported_cpus))
> +		vcpu_set_on_unsupported_cpu(vcpu);
>  }
>  
>  void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> @@ -822,6 +830,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>  		 */
>  		preempt_disable();
>  
> +		if (unlikely(vcpu_on_unsupported_cpu(vcpu))) {
> +			vcpu_clear_on_unsupported_cpu(vcpu);
> +			run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> +			run->fail_entry.hardware_entry_failure_reason
> +				= KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED;
> +			run->fail_entry.cpu = smp_processor_id();

I just realised that this is wrong for the same reason that KVM doesn't
clear the unsupported CPU flag on vcpu_put: a vcpu_put/load that happened
after the vcpu_load that set the flag and before preemption is disabled
could mean that the thread is now executing on a different physical CPU
than the one that caused the flag to be set. To make things worse, this CPU
might even be in supported_cpus, which would be extremely confusing for
someone trying to decipher what went wrong.

I see three solutions here:

1. Drop setting the fail_entry.cpu field.

2. Make vcpu_put clear the flag, which means that if the flag is set here
then the VCPU is definitely executing on the wrong physical CPU and
smp_processor_id() will be useful.

3. Carry the unsupported CPU ID information in a new field in struct
kvm_vcpu_arch.

I honestly don't have a strong preference, though I lean slightly towards
solution number 2, as it makes the code symmetrical and removes the
subtlety around when the VCPU flag is cleared. But this would come at the
expense of userspace possibly finding out much later (or never) that
something went wrong.

Thoughts?

Thanks,
Alex

> +			ret = 0;
> +			preempt_enable();
> +			break;
> +		}
> +
>  		kvm_pmu_flush_hwstate(vcpu);
>  
>  		local_irq_disable();
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index 618138c5f792..471fe0f734ed 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -954,6 +954,7 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
>  		arm_pmu = entry->arm_pmu;
>  		if (arm_pmu->pmu.type == pmu_id) {
>  			kvm_pmu->arm_pmu = arm_pmu;
> +			cpumask_copy(vcpu->arch.supported_cpus, &arm_pmu->supported_cpus);
>  			ret = 0;
>  			goto out_unlock;
>  		}
> -- 
> 2.34.1
than the physical CPU that caused the flag to be set. To make things worse,
this CPU might even be in supported_cpus, which would be extremely
confusing for someone trying to decipher what went wrong.

I see three solutions here:

1. Drop setting the fail_entry.cpu field.

2. Make vcpu_put clear the flag, which means that if the flag is set here
then the VCPU is definitely executing on the wrong physical CPU and
smp_processor_id() will be useful.

3. Carry the unsupported CPU ID information in a new field in struct
kvm_vcpu_arch.

I honestly don't have a preference. Maybe slightly towards solution number
2, as it makes the code symmetrical and removes the subtlety around when
the VCPU flag is cleared. But this would be done at the expense of
userspace possibly finding out a lot later (or never) that something went
wrong.
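
For reference, a rough sketch of what option 2 would look like (untested,
reusing the vcpu_clear_on_unsupported_cpu() helper added by this patch):

        void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
        {
                /* ... existing put path left unchanged ... */

                /*
                 * Clearing the flag on put means that, by the time KVM_RUN
                 * reaches the non-preemptible section, a set flag can only
                 * mean "currently running on an unsupported CPU".
                 */
                vcpu_clear_on_unsupported_cpu(vcpu);
        }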

Thoughts?

Thanks,
Alex

> +			ret = 0;
> +			preempt_enable();
> +			break;
> +		}
> +
>  		kvm_pmu_flush_hwstate(vcpu);
>  
>  		local_irq_disable();
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index 618138c5f792..471fe0f734ed 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -954,6 +954,7 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
>  		arm_pmu = entry->arm_pmu;
>  		if (arm_pmu->pmu.type == pmu_id) {
>  			kvm_pmu->arm_pmu = arm_pmu;
> +			cpumask_copy(vcpu->arch.supported_cpus, &arm_pmu->supported_cpus);
>  			ret = 0;
>  			goto out_unlock;
>  		}
> -- 
> 2.34.1
> 
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 0/4] KVM: arm64: Improve PMU support on heterogeneous systems
  2021-12-06 17:02 ` Alexandru Elisei
@ 2021-12-08  2:36   ` Reiji Watanabe
  -1 siblings, 0 replies; 52+ messages in thread
From: Reiji Watanabe @ 2021-12-08  2:36 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: maz, mingo, tglx, will, kvmarm, linux-arm-kernel

Hi Alex,

On Mon, Dec 6, 2021 at 9:02 AM Alexandru Elisei
<alexandru.elisei@arm.com> wrote:
>
> (CC'ing Peter Maydell in case this might be of interest to qemu)
>
> The series can be found on a branch at [1], and the kvmtool support at [2].
> The kvmtool patches are also on the mailing list [3] and haven't changed
> since v1.
>
> Detailed explanation of the issue and symptoms that the patches attempt to
> correct can be found in the cover letter for v1 [4].
>
> A brief summary of the problem is that on heterogeneous systems KVM will
> always use the same PMU for creating the VCPU events for *all* VCPUs
> regardless of the physical CPU on which the VCPU is running, leading to
> events suddenly stopping and resuming in the guest as the VCPU thread gets
> migrated across different CPUs.
>
> This series proposes to fix this behaviour by allowing the user to specify
> which physical PMU is used when creating the VCPU events needed for guest
> PMU emulation. When the PMU is set, KVM will refuse to run the VCPU on a
> physical CPU which is not part of the supported CPUs for the specified PMU.

Just to confirm, this series provides an API for userspace to request
KVM to detect a wrong affinity setting due to a userspace bug so that
userspace can get an error at KVM_RUN instead of leading to events
suddenly stopping, correct ?
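
i.e. with this series a wrong affinity setting would surface to the VMM
roughly like the below -- purely illustrative, error handling omitted,
"run" being the VCPU's mmap'ed struct kvm_run:

        if (!ioctl(vcpu_fd, KVM_RUN, 0) &&
            run->exit_reason == KVM_EXIT_FAIL_ENTRY &&
            run->fail_entry.hardware_entry_failure_reason ==
                                KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED)
                fprintf(stderr, "VCPU ran on unsupported CPU %u\n",
                        run->fail_entry.cpu);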


> The default behaviour stays the same - without userspace setting the PMU,
> events will stop counting if the VCPU is scheduled on the wrong CPU.

Can't we fix the default behavior (in addition to the current fix) ?
(Do we need to maintain the default behavior ??)
IMHO it is better to prevent userspace from configuring a PMU
for guests on such heterogeneous systems than to let events
suddenly stop counting, even as the default behavior.

Thanks,
Reiji


>
> Changes since v1:
>
> - Rebased on top of v5.16-rc4
>
> - Implemented review comments: protect iterating through the list of PMUs
>   with a mutex, documentation changes, initialize vcpu-arch.supported_cpus
>   to cpu_possible_mask, changed vcpu->arch.cpu_not_supported to a VCPU
>   flag, set exit reason to KVM_EXIT_FAIL_ENTRY and populate fail_entry when
>   the VCPU is run on a CPU not in the PMU's supported cpumask. Many thanks
>   for the review!
>
> [1] https://gitlab.arm.com/linux-arm/linux-ae/-/tree/pmu-big-little-fix-v2
> [2] https://gitlab.arm.com/linux-arm/kvmtool-ae/-/tree/pmu-big-little-fix-v1
> [3] https://www.spinics.net/lists/arm-kernel/msg933584.html
> [4] https://www.spinics.net/lists/arm-kernel/msg933579.html
>
> Alexandru Elisei (4):
>   perf: Fix wrong name in comment for struct perf_cpu_context
>   KVM: arm64: Keep a list of probed PMUs
>   KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
>   KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical
>     CPU
>
>  Documentation/virt/kvm/devices/vcpu.rst | 29 +++++++++++
>  arch/arm64/include/asm/kvm_host.h       | 12 +++++
>  arch/arm64/include/uapi/asm/kvm.h       |  4 ++
>  arch/arm64/kvm/arm.c                    | 19 ++++++++
>  arch/arm64/kvm/pmu-emul.c               | 64 +++++++++++++++++++++++--
>  include/kvm/arm_pmu.h                   |  6 +++
>  include/linux/perf_event.h              |  2 +-
>  tools/arch/arm64/include/uapi/asm/kvm.h |  1 +
>  8 files changed, 132 insertions(+), 5 deletions(-)
>
> --
> 2.34.1
>
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
  2021-12-06 17:02   ` Alexandru Elisei
@ 2021-12-08  3:13     ` Reiji Watanabe
  -1 siblings, 0 replies; 52+ messages in thread
From: Reiji Watanabe @ 2021-12-08  3:13 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: maz, mingo, tglx, will, kvmarm, linux-arm-kernel

Hi Alex,

On Mon, Dec 6, 2021 at 9:02 AM Alexandru Elisei
<alexandru.elisei@arm.com> wrote:
>
> When KVM creates an event and there is more than one PMU present on the
> system, perf_init_event() will go through the list of available PMUs and
> will choose the first one that can create the event. The order of the PMUs
> in the PMU list depends on the probe order, which can change under various
> circumstances, for example if the order of the PMU nodes change in the DTB
> or if asynchronous driver probing is enabled on the kernel command line
> (with the driver_async_probe=armv8-pmu option).
>
> Another consequence of this approach is that, on heterogeneous systems,
> all virtual machines that KVM creates will use the same PMU. This might
> cause unexpected behaviour for userspace: when a VCPU is executing on
> the physical CPU that uses this PMU, PMU events in the guest work
> correctly; but when the same VCPU executes on another CPU, PMU events in
> the guest will suddenly stop counting.
>
> Fortunately, perf core allows the user to specify on which PMU to create an
> event by using the perf_event_attr->type field, which is used by
> perf_init_event() as an index in the radix tree of available PMUs.
>
> Add the KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU
> attribute to allow userspace to specify the arm_pmu that KVM will use when
> creating events for that VCPU. KVM will make no attempt to run the VCPU on
> the physical CPUs that share this PMU, leaving it up to userspace to
> manage the VCPU threads' affinity accordingly.
>
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
>  Documentation/virt/kvm/devices/vcpu.rst | 25 +++++++++++++++++
>  arch/arm64/include/uapi/asm/kvm.h       |  1 +
>  arch/arm64/kvm/pmu-emul.c               | 37 +++++++++++++++++++++++--
>  include/kvm/arm_pmu.h                   |  1 +
>  tools/arch/arm64/include/uapi/asm/kvm.h |  1 +
>  5 files changed, 63 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
> index 60a29972d3f1..c82be5cbc268 100644
> --- a/Documentation/virt/kvm/devices/vcpu.rst
> +++ b/Documentation/virt/kvm/devices/vcpu.rst
> @@ -104,6 +104,31 @@ hardware event. Filtering event 0x1E (CHAIN) has no effect either, as it
>  isn't strictly speaking an event. Filtering the cycle counter is possible
>  using event 0x11 (CPU_CYCLES).
>
> +1.4 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_SET_PMU
> +------------------------------------------
> +
> +:Parameters: in kvm_device_attr.addr the address to an int representing the PMU
> +             identifier.
> +
> +:Returns:
> +
> +        =======  ===============================================
> +        -EBUSY   PMUv3 already initialized
> +        -EFAULT  Error accessing the PMU identifier
> +        -ENXIO   PMU not found
> +        -ENODEV  PMUv3 not supported or GIC not initialized
> +        -ENOMEM  Could not allocate memory
> +        =======  ===============================================
> +
> +Request that the VCPU uses the specified hardware PMU when creating guest events
> +for the purpose of PMU emulation. The PMU identifier can be read from the "type"
> +file for the desired PMU instance under /sys/devices (or, equivalently,
> +/sys/bus/event_source). This attribute is particularly useful on heterogeneous
> +systems where there are at least two CPU PMUs on the system.
> +
> +Note that KVM will not make any attempts to run the VCPU on the physical CPUs
> +associated with the PMU specified by this attribute. This is entirely left to
> +userspace.
>
>  2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
>  =================================
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index b3edde68bc3e..1d0a0a2a9711 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -362,6 +362,7 @@ struct kvm_arm_copy_mte_tags {
>  #define   KVM_ARM_VCPU_PMU_V3_IRQ      0
>  #define   KVM_ARM_VCPU_PMU_V3_INIT     1
>  #define   KVM_ARM_VCPU_PMU_V3_FILTER   2
> +#define   KVM_ARM_VCPU_PMU_V3_SET_PMU  3
>  #define KVM_ARM_VCPU_TIMER_CTRL                1
>  #define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER                0
>  #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER                1
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index eaaad4c06561..618138c5f792 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -603,6 +603,7 @@ static bool kvm_pmu_counter_is_enabled(struct kvm_vcpu *vcpu, u64 select_idx)
>  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
>  {
>         struct kvm_pmu *pmu = &vcpu->arch.pmu;
> +       struct arm_pmu *arm_pmu = pmu->arm_pmu;
>         struct kvm_pmc *pmc;
>         struct perf_event *event;
>         struct perf_event_attr attr;
> @@ -638,8 +639,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
>                 return;
>
>         memset(&attr, 0, sizeof(struct perf_event_attr));
> -       attr.type = PERF_TYPE_RAW;
> -       attr.size = sizeof(attr);
> +       attr.type = arm_pmu ? arm_pmu->pmu.type : PERF_TYPE_RAW;
>         attr.pinned = 1;
>         attr.disabled = !kvm_pmu_counter_is_enabled(vcpu, pmc->idx);
>         attr.exclude_user = data & ARMV8_PMU_EXCLUDE_EL0 ? 1 : 0;
> @@ -941,6 +941,29 @@ static bool pmu_irq_is_valid(struct kvm *kvm, int irq)
>         return true;
>  }
>
> +static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
> +{
> +       struct kvm_pmu *kvm_pmu = &vcpu->arch.pmu;
> +       struct arm_pmu_entry *entry;
> +       struct arm_pmu *arm_pmu;
> +       int ret = -ENXIO;
> +
> +       mutex_lock(&arm_pmus_lock);
> +
> +       list_for_each_entry(entry, &arm_pmus, entry) {
> +               arm_pmu = entry->arm_pmu;
> +               if (arm_pmu->pmu.type == pmu_id) {
> +                       kvm_pmu->arm_pmu = arm_pmu;

Shouldn't kvm->arch.pmuver be updated based on the pmu that
is used for the guest ?
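
Something like the below, maybe (untested, and assuming arm_pmu->pmuver is
the right value to propagate):

        if (arm_pmu->pmu.type == pmu_id) {
                kvm_pmu->arm_pmu = arm_pmu;
                /* Keep the guest's PMU version in sync with the chosen PMU. */
                vcpu->kvm->arch.pmuver = arm_pmu->pmuver;
                ret = 0;
                goto out_unlock;
        }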

Thanks,
Reiji


> +                       ret = 0;
> +                       goto out_unlock;
> +               }
> +       }
> +
> +out_unlock:
> +       mutex_unlock(&arm_pmus_lock);
> +       return ret;
> +}
> +
>  int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
>  {
>         if (!kvm_vcpu_has_pmu(vcpu))
> @@ -1027,6 +1050,15 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
>
>                 return 0;
>         }
> +       case KVM_ARM_VCPU_PMU_V3_SET_PMU: {
> +               int __user *uaddr = (int __user *)(long)attr->addr;
> +               int pmu_id;
> +
> +               if (get_user(pmu_id, uaddr))
> +                       return -EFAULT;
> +
> +               return kvm_arm_pmu_v3_set_pmu(vcpu, pmu_id);
> +       }
>         case KVM_ARM_VCPU_PMU_V3_INIT:
>                 return kvm_arm_pmu_v3_init(vcpu);
>         }
> @@ -1064,6 +1096,7 @@ int kvm_arm_pmu_v3_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
>         case KVM_ARM_VCPU_PMU_V3_IRQ:
>         case KVM_ARM_VCPU_PMU_V3_INIT:
>         case KVM_ARM_VCPU_PMU_V3_FILTER:
> +       case KVM_ARM_VCPU_PMU_V3_SET_PMU:
>                 if (kvm_vcpu_has_pmu(vcpu))
>                         return 0;
>         }
> diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
> index e249c5f172aa..ab3046a8f9bb 100644
> --- a/include/kvm/arm_pmu.h
> +++ b/include/kvm/arm_pmu.h
> @@ -34,6 +34,7 @@ struct kvm_pmu {
>         bool created;
>         bool irq_level;
>         struct irq_work overflow_work;
> +       struct arm_pmu *arm_pmu;
>  };
>
>  struct arm_pmu_entry {
> diff --git a/tools/arch/arm64/include/uapi/asm/kvm.h b/tools/arch/arm64/include/uapi/asm/kvm.h
> index b3edde68bc3e..1d0a0a2a9711 100644
> --- a/tools/arch/arm64/include/uapi/asm/kvm.h
> +++ b/tools/arch/arm64/include/uapi/asm/kvm.h
> @@ -362,6 +362,7 @@ struct kvm_arm_copy_mte_tags {
>  #define   KVM_ARM_VCPU_PMU_V3_IRQ      0
>  #define   KVM_ARM_VCPU_PMU_V3_INIT     1
>  #define   KVM_ARM_VCPU_PMU_V3_FILTER   2
> +#define   KVM_ARM_VCPU_PMU_V3_SET_PMU  3
>  #define KVM_ARM_VCPU_TIMER_CTRL                1
>  #define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER                0
>  #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER                1
> --
> 2.34.1
>
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 4/4] KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU
  2021-12-07 14:17     ` Alexandru Elisei
@ 2021-12-08  7:54       ` Reiji Watanabe
  -1 siblings, 0 replies; 52+ messages in thread
From: Reiji Watanabe @ 2021-12-08  7:54 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: maz, mingo, tglx, will, kvmarm, linux-arm-kernel

Hi Alex,

On Tue, Dec 7, 2021 at 6:18 AM Alexandru Elisei
<alexandru.elisei@arm.com> wrote:
>
> Hi,
>
> On Mon, Dec 06, 2021 at 05:02:23PM +0000, Alexandru Elisei wrote:
> > Userspace can assign a PMU to a VCPU with the KVM_ARM_VCPU_PMU_V3_SET_PMU
> > device ioctl. If the VCPU is scheduled on a physical CPU which has a
> > different PMU, the perf events needed to emulate a guest PMU won't be
> > scheduled in and the guest performance counters will stop counting. Treat
> > it as a userspace error and refuse to run the VCPU in this situation.
> >
> > The VCPU is flagged as being scheduled on the wrong CPU in vcpu_load(), but
> > the flag is cleared when the KVM_RUN enters the non-preemptible section
> > instead of in vcpu_put(); this has been done on purpose so the error
> > condition is communicated as soon as possible to userspace, otherwise
> > vcpu_load() on the wrong CPU followed by a vcpu_put() would clear the flag.
> >
> > Suggested-by: Marc Zyngier <maz@kernel.org>
> > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > ---
> > I agonized for hours about the best name for the VCPU flag and the
> > accessors. If someone has a better idea, please tell me and I'll change
> > them.
> >
> >  Documentation/virt/kvm/devices/vcpu.rst |  6 +++++-
> >  arch/arm64/include/asm/kvm_host.h       | 12 ++++++++++++
> >  arch/arm64/include/uapi/asm/kvm.h       |  3 +++
> >  arch/arm64/kvm/arm.c                    | 19 +++++++++++++++++++
> >  arch/arm64/kvm/pmu-emul.c               |  1 +
> >  5 files changed, 40 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
> > index c82be5cbc268..9ae47b7c3652 100644
> > --- a/Documentation/virt/kvm/devices/vcpu.rst
> > +++ b/Documentation/virt/kvm/devices/vcpu.rst
> > @@ -128,7 +128,11 @@ systems where there are at least two CPU PMUs on the system.
> >
> >  Note that KVM will not make any attempts to run the VCPU on the physical CPUs
> >  associated with the PMU specified by this attribute. This is entirely left to
> > -userspace.
> > +userspace. However, attempting to run the VCPU on a physical CPU not supported
> > +by the PMU will fail and KVM_RUN will return with
> > +exit_reason = KVM_EXIT_FAIL_ENTRY and populate the fail_entry struct by setting
> > +the hardware_entry_failure_reason field to KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED and
> > +the cpu field to the processor id.
> >
> >  2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
> >  =================================
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 2a5f7f38006f..0c453f2e48b6 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -385,6 +385,8 @@ struct kvm_vcpu_arch {
> >               u64 last_steal;
> >               gpa_t base;
> >       } steal;
> > +
> > +     cpumask_var_t supported_cpus;
> >  };
> >
> >  /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
> > @@ -420,6 +422,7 @@ struct kvm_vcpu_arch {
> >  #define KVM_ARM64_EXCEPT_MASK                (7 << 9) /* Target EL/MODE */
> >  #define KVM_ARM64_DEBUG_STATE_SAVE_SPE       (1 << 12) /* Save SPE context if active  */
> >  #define KVM_ARM64_DEBUG_STATE_SAVE_TRBE      (1 << 13) /* Save TRBE context if active  */
> > +#define KVM_ARM64_ON_UNSUPPORTED_CPU (1 << 14) /* Physical CPU not in supported_cpus */
> >
> >  #define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \
> >                                KVM_GUESTDBG_USE_SW_BP | \
> > @@ -460,6 +463,15 @@ struct kvm_vcpu_arch {
> >  #define vcpu_has_ptrauth(vcpu)               false
> >  #endif
> >
> > +#define vcpu_on_unsupported_cpu(vcpu)                                        \
> > +     ((vcpu)->arch.flags & KVM_ARM64_ON_UNSUPPORTED_CPU)
> > +
> > +#define vcpu_set_on_unsupported_cpu(vcpu)                            \
> > +     ((vcpu)->arch.flags |= KVM_ARM64_ON_UNSUPPORTED_CPU)
> > +
> > +#define vcpu_clear_on_unsupported_cpu(vcpu)                          \
> > +     ((vcpu)->arch.flags &= ~KVM_ARM64_ON_UNSUPPORTED_CPU)
> > +
> >  #define vcpu_gp_regs(v)              (&(v)->arch.ctxt.regs)
> >
> >  /*
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > index 1d0a0a2a9711..d49f714f48e6 100644
> > --- a/arch/arm64/include/uapi/asm/kvm.h
> > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > @@ -414,6 +414,9 @@ struct kvm_arm_copy_mte_tags {
> >  #define KVM_PSCI_RET_INVAL           PSCI_RET_INVALID_PARAMS
> >  #define KVM_PSCI_RET_DENIED          PSCI_RET_DENIED
> >
> > +/* run->fail_entry.hardware_entry_failure_reason codes. */
> > +#define KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED  (1ULL << 0)
> > +
> >  #endif
> >
> >  #endif /* __ARM_KVM_H__ */
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index e4727dc771bf..1124c3efdd94 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -327,6 +327,10 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> >
> >       vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
> >
> > +     if (!zalloc_cpumask_var(&vcpu->arch.supported_cpus, GFP_KERNEL))
> > +             return -ENOMEM;

It appears that vcpu->arch.supported_cpus needs to be freed
if kvm_arch_vcpu_create() fails after it is allocated.
(kvm_vgic_vcpu_init() or create_hyp_mappings() might fail)
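
i.e. something along these lines (untested sketch, the label name is made
up):

        err = kvm_vgic_vcpu_init(vcpu);
        if (err)
                goto out_free_cpumask;

        err = create_hyp_mappings(vcpu, vcpu + 1, PAGE_HYP);
        if (err)
                goto out_free_cpumask;

        return 0;

out_free_cpumask:
        free_cpumask_var(vcpu->arch.supported_cpus);
        return err;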


> > +     cpumask_copy(vcpu->arch.supported_cpus, cpu_possible_mask);
> > +
> >       /* Set up the timer */
> >       kvm_timer_vcpu_init(vcpu);
> >
> > @@ -354,6 +358,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
> >       if (vcpu->arch.has_run_once && unlikely(!irqchip_in_kernel(vcpu->kvm)))
> >               static_branch_dec(&userspace_irqchip_in_use);
> >
> > +     free_cpumask_var(vcpu->arch.supported_cpus);
> >       kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
> >       kvm_timer_vcpu_terminate(vcpu);
> >       kvm_pmu_vcpu_destroy(vcpu);
> > @@ -432,6 +437,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> >       if (vcpu_has_ptrauth(vcpu))
> >               vcpu_ptrauth_disable(vcpu);
> >       kvm_arch_vcpu_load_debug_state_flags(vcpu);
> > +
> > +     if (!cpumask_test_cpu(smp_processor_id(), vcpu->arch.supported_cpus))
> > +             vcpu_set_on_unsupported_cpu(vcpu);
> >  }
> >
> >  void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> > @@ -822,6 +830,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> >                */
> >               preempt_disable();
> >
> > +             if (unlikely(vcpu_on_unsupported_cpu(vcpu))) {
> > +                     vcpu_clear_on_unsupported_cpu(vcpu);
> > +                     run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> > +                     run->fail_entry.hardware_entry_failure_reason
> > +                             = KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED;
> > +                     run->fail_entry.cpu = smp_processor_id();
>
> I just realised that this is wrong for the same reason that KVM doesn't
> clear the unsupported CPU flag on vcpu_put: a vcpu_put/load that happened
> after the vcpu_load that set the flag and before preemption is disabled
> could mean that now the thread is executing on a different physical CPU
> than the physical CPU that caused the flag to be set. To make things worse,
> this CPU might even be in supported_cpus, which would be extremely
> confusing for someone trying to decipher what went wrong.
>
> I see three solutions here:
>
> 1. Drop setting the fail_entry.cpu field.
>
> 2. Make vcpu_put clear the flag, which means that if the flag is set here
> then the VCPU is definitely executing on the wrong physical CPU and
> smp_processor_id() will be useful.
>
> 3. Carry the unsupported CPU ID information in a new field in struct
> kvm_vcpu_arch.
>
> I honestly don't have a preference. Maybe slightly towards solution number
> 2, as it makes the code symmetrical and removes the subtlety around when
> the VCPU flag is cleared. But this would be done at the expense of
> userspace possibly finding out a lot later (or never) that something went
> wrong.
>
> Thoughts?

IMHO, I would prefer 2, which is symmetrical and straightforward,
out of those three options.  Unless KVM checks the thread's CPU
affinity, userspace may find that out a lot later anyway.

BTW, kvm_vcpu_pmu_restore_guest/kvm_vcpu_pmu_restore_host, which
are (indirectly) called from vcpu_load/vcpu_put, seem to attempt
to read/write pmccfiltr_el0, which is present only when FEAT_PMUv3
is implemented, even if the current CPU does not support FEAT_PMUv3.
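
A possible way to avoid that would be to bail out early when the current
CPU does not implement an architected PMU, e.g. something like the sketch
below at the top of kvm_vcpu_pmu_restore_guest()/kvm_vcpu_pmu_restore_host()
(not tested, and the exact check is only a guess):

        u64 dfr0 = read_sysreg(id_aa64dfr0_el1);

        if (!cpuid_feature_extract_unsigned_field(dfr0,
                                                  ID_AA64DFR0_PMUVER_SHIFT))
                return;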

Thanks,
Reiji


>
> > +                     ret = 0;
> > +                     preempt_enable();
> > +                     break;
> > +             }
> > +
> >               kvm_pmu_flush_hwstate(vcpu);
> >
> >               local_irq_disable();
> > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> > index 618138c5f792..471fe0f734ed 100644
> > --- a/arch/arm64/kvm/pmu-emul.c
> > +++ b/arch/arm64/kvm/pmu-emul.c
> > @@ -954,6 +954,7 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
> >               arm_pmu = entry->arm_pmu;
> >               if (arm_pmu->pmu.type == pmu_id) {
> >                       kvm_pmu->arm_pmu = arm_pmu;
> > +                     cpumask_copy(vcpu->arch.supported_cpus, &arm_pmu->supported_cpus);
> >                       ret = 0;
> >                       goto out_unlock;
> >               }
> > --
> > 2.34.1
> >
> > _______________________________________________
> > kvmarm mailing list
> > kvmarm@lists.cs.columbia.edu
> > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 0/4] KVM: arm64: Improve PMU support on heterogeneous systems
  2021-12-08  2:36   ` Reiji Watanabe
@ 2021-12-08  8:05     ` Marc Zyngier
  -1 siblings, 0 replies; 52+ messages in thread
From: Marc Zyngier @ 2021-12-08  8:05 UTC (permalink / raw)
  To: Reiji Watanabe; +Cc: will, mingo, tglx, kvmarm, linux-arm-kernel

Reiji,

On 2021-12-08 02:36, Reiji Watanabe wrote:
> Hi Alex,
> 
> On Mon, Dec 6, 2021 at 9:02 AM Alexandru Elisei
> <alexandru.elisei@arm.com> wrote:
>> 
>> (CC'ing Peter Maydell in case this might be of interest to qemu)
>> 
>> The series can be found on a branch at [1], and the kvmtool support at 
>> [2].
>> The kvmtool patches are also on the mailing list [3] and haven't 
>> changed
>> since v1.
>> 
>> Detailed explanation of the issue and symptoms that the patches 
>> attempt to
>> correct can be found in the cover letter for v1 [4].
>> 
>> A brief summary of the problem is that on heterogeneous systems KVM 
>> will
>> always use the same PMU for creating the VCPU events for *all* VCPUs
>> regardless of the physical CPU on which the VCPU is running, leading 
>> to
>> events suddenly stopping and resuming in the guest as the VCPU thread 
>> gets
>> migrated across different CPUs.
>> 
>> This series proposes to fix this behaviour by allowing the user to 
>> specify
>> which physical PMU is used when creating the VCPU events needed for 
>> guest
>> PMU emulation. When the PMU is set, KVM will refuse to run the VCPU on a
>> physical CPU which is not part of the supported CPUs for the specified
>> PMU.
> 
> Just to confirm, this series provides an API for userspace to request
> KVM to detect a wrong affinity setting due to a userspace bug so that
> userspace can get an error at KVM_RUN instead of leading to events
> suddenly stopping, correct ?

More than that, it allows userspace to select which PMU will be used
for their guest. The affinity setting is a byproduct of the PMU's own
affinity.
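
In VMM terms that boils down to something like the below (purely
illustrative, error handling omitted, and the sysfs path depends on the
name of the PMU instance; needs <fcntl.h>, <stdlib.h>, <sys/ioctl.h> and
<linux/kvm.h>):

        int type, fd;
        char buf[16] = { };
        struct kvm_device_attr attr = {
                .group  = KVM_ARM_VCPU_PMU_V3_CTRL,
                .attr   = KVM_ARM_VCPU_PMU_V3_SET_PMU,
        };

        /* PMU identifier: the "type" file of the chosen PMU instance. */
        fd = open("/sys/devices/armv8_pmuv3_0/type", O_RDONLY);
        read(fd, buf, sizeof(buf) - 1);
        close(fd);
        type = atoi(buf);

        attr.addr = (__u64)(unsigned long)&type;
        ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);

followed by pinning the VCPU thread to the CPUs that PMU advertises in
sysfs.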

> 
>> The default behaviour stays the same - without userspace setting the 
>> PMU,
>> events will stop counting if the VCPU is scheduled on the wrong CPU.
> 
> Can't we fix the default behavior (in addition to the current fix) ?
> (Do we need to maintain the default behavior ??)

Of course we do. This is a behaviour that has been exposed to userspace
for years, and *we don't break userspace*.

> IMHO it is better to prevent userspace from configuring a PMU
> for guests on such heterogeneous systems than to let events
> suddenly stop counting, even as the default behavior.

People running KVM on asymmetric systems *strongly* disagree with you.

         M.
-- 
Jazz is not dead. It just smells funny...
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 4/4] KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU
  2021-12-07 14:17     ` Alexandru Elisei
@ 2021-12-08  9:56       ` Marc Zyngier
  -1 siblings, 0 replies; 52+ messages in thread
From: Marc Zyngier @ 2021-12-08  9:56 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: mingo, tglx, will, kvmarm, linux-arm-kernel

On Tue, 07 Dec 2021 14:17:56 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> Hi,
> 
> On Mon, Dec 06, 2021 at 05:02:23PM +0000, Alexandru Elisei wrote:
> > Userspace can assign a PMU to a VCPU with the KVM_ARM_VCPU_PMU_V3_SET_PMU
> > device ioctl. If the VCPU is scheduled on a physical CPU which has a
> > different PMU, the perf events needed to emulate a guest PMU won't be
> > scheduled in and the guest performance counters will stop counting. Treat
> > it as an userspace error and refuse to run the VCPU in this situation.
> > 
> > The VCPU is flagged as being scheduled on the wrong CPU in vcpu_load(), but
> > the flag is cleared when the KVM_RUN enters the non-preemptible section
> > instead of in vcpu_put(); this has been done on purpose so the error
> > condition is communicated as soon as possible to userspace, otherwise
> > vcpu_load() on the wrong CPU followed by a vcpu_put() would clear the flag.
> > 
> > Suggested-by: Marc Zyngier <maz@kernel.org>
> > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > ---
> > I agonized for hours about the best name for the VCPU flag and the
> > accessors. If someone has a better idea, please tell me and I'll change
> > them.
> > 
> >  Documentation/virt/kvm/devices/vcpu.rst |  6 +++++-
> >  arch/arm64/include/asm/kvm_host.h       | 12 ++++++++++++
> >  arch/arm64/include/uapi/asm/kvm.h       |  3 +++
> >  arch/arm64/kvm/arm.c                    | 19 +++++++++++++++++++
> >  arch/arm64/kvm/pmu-emul.c               |  1 +
> >  5 files changed, 40 insertions(+), 1 deletion(-)
> > 
> > diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
> > index c82be5cbc268..9ae47b7c3652 100644
> > --- a/Documentation/virt/kvm/devices/vcpu.rst
> > +++ b/Documentation/virt/kvm/devices/vcpu.rst
> > @@ -128,7 +128,11 @@ systems where there are at least two CPU PMUs on the system.
> >  
> >  Note that KVM will not make any attempts to run the VCPU on the physical CPUs
> >  associated with the PMU specified by this attribute. This is entirely left to
> > -userspace.
> > +userspace. However, attempting to run the VCPU on a physical CPU not supported
> > +by the PMU will fail and KVM_RUN will return with
> > +exit_reason = KVM_EXIT_FAIL_ENTRY and populate the fail_entry struct by setting
> > +hardare_entry_failure_reason field to KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED and
> > +the cpu field to the processor id.
> >  
> >  2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
> >  =================================
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 2a5f7f38006f..0c453f2e48b6 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -385,6 +385,8 @@ struct kvm_vcpu_arch {
> >  		u64 last_steal;
> >  		gpa_t base;
> >  	} steal;
> > +
> > +	cpumask_var_t supported_cpus;
> >  };
> >  
> >  /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
> > @@ -420,6 +422,7 @@ struct kvm_vcpu_arch {
> >  #define KVM_ARM64_EXCEPT_MASK		(7 << 9) /* Target EL/MODE */
> >  #define KVM_ARM64_DEBUG_STATE_SAVE_SPE	(1 << 12) /* Save SPE context if active  */
> >  #define KVM_ARM64_DEBUG_STATE_SAVE_TRBE	(1 << 13) /* Save TRBE context if active  */
> > +#define KVM_ARM64_ON_UNSUPPORTED_CPU	(1 << 14) /* Physical CPU not in supported_cpus */
> >  
> >  #define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \
> >  				 KVM_GUESTDBG_USE_SW_BP | \
> > @@ -460,6 +463,15 @@ struct kvm_vcpu_arch {
> >  #define vcpu_has_ptrauth(vcpu)		false
> >  #endif
> >  
> > +#define vcpu_on_unsupported_cpu(vcpu)					\
> > +	((vcpu)->arch.flags & KVM_ARM64_ON_UNSUPPORTED_CPU)
> > +
> > +#define vcpu_set_on_unsupported_cpu(vcpu)				\
> > +	((vcpu)->arch.flags |= KVM_ARM64_ON_UNSUPPORTED_CPU)
> > +
> > +#define vcpu_clear_on_unsupported_cpu(vcpu)				\
> > +	((vcpu)->arch.flags &= ~KVM_ARM64_ON_UNSUPPORTED_CPU)
> > +
> >  #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.regs)
> >  
> >  /*
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > index 1d0a0a2a9711..d49f714f48e6 100644
> > --- a/arch/arm64/include/uapi/asm/kvm.h
> > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > @@ -414,6 +414,9 @@ struct kvm_arm_copy_mte_tags {
> >  #define KVM_PSCI_RET_INVAL		PSCI_RET_INVALID_PARAMS
> >  #define KVM_PSCI_RET_DENIED		PSCI_RET_DENIED
> >  
> > +/* run->fail_entry.hardware_entry_failure_reason codes. */
> > +#define KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED	(1ULL << 0)
> > +
> >  #endif
> >  
> >  #endif /* __ARM_KVM_H__ */
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index e4727dc771bf..1124c3efdd94 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -327,6 +327,10 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> >  
> >  	vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
> >  
> > +	if (!zalloc_cpumask_var(&vcpu->arch.supported_cpus, GFP_KERNEL))
> > +		return -ENOMEM;
> > +	cpumask_copy(vcpu->arch.supported_cpus, cpu_possible_mask);

Nit: can we just assign the cpu_possible_mask pointer instead, and
only perform the allocation when assigning a specific PMU?
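
Something along the lines of the below (untested sketch of the suggestion;
it assumes the arm_pmu, and therefore its supported_cpus mask, outlives the
VM so it can be referenced directly):

        /* kvm_host.h: a pointer instead of a cpumask_var_t */
        const cpumask_t *supported_cpus;

        /* kvm_arch_vcpu_create(): no allocation, no error path */
        vcpu->arch.supported_cpus = cpu_possible_mask;

        /* kvm_arm_pmu_v3_set_pmu(): borrow the PMU's own mask */
        vcpu->arch.supported_cpus = &arm_pmu->supported_cpus;

and the free_cpumask_var() call in kvm_arch_vcpu_destroy() goes away.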

> > +
> >  	/* Set up the timer */
> >  	kvm_timer_vcpu_init(vcpu);
> >  
> > @@ -354,6 +358,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
> >  	if (vcpu->arch.has_run_once && unlikely(!irqchip_in_kernel(vcpu->kvm)))
> >  		static_branch_dec(&userspace_irqchip_in_use);
> >  
> > +	free_cpumask_var(vcpu->arch.supported_cpus);
> >  	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
> >  	kvm_timer_vcpu_terminate(vcpu);
> >  	kvm_pmu_vcpu_destroy(vcpu);
> > @@ -432,6 +437,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> >  	if (vcpu_has_ptrauth(vcpu))
> >  		vcpu_ptrauth_disable(vcpu);
> >  	kvm_arch_vcpu_load_debug_state_flags(vcpu);
> > +
> > +	if (!cpumask_test_cpu(smp_processor_id(), vcpu->arch.supported_cpus))
> > +		vcpu_set_on_unsupported_cpu(vcpu);
> >  }
> >  
> >  void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> > @@ -822,6 +830,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> >  		 */
> >  		preempt_disable();
> >  
> > +		if (unlikely(vcpu_on_unsupported_cpu(vcpu))) {
> > +			vcpu_clear_on_unsupported_cpu(vcpu);
> > +			run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> > +			run->fail_entry.hardware_entry_failure_reason
> > +				= KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED;
> > +			run->fail_entry.cpu = smp_processor_id();
>

Can you move this hunk to kvm_vcpu_exit_request()? It certainly would
fit better there, as we have checks for other exit reasons to
userspace.
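
Roughly (sketch only, on top of the existing checks in that function):

static bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu, int *ret)
{
        struct kvm_run *run = vcpu->run;

        /* ... existing signal/request handling elided ... */

        if (unlikely(vcpu_on_unsupported_cpu(vcpu))) {
                run->exit_reason = KVM_EXIT_FAIL_ENTRY;
                run->fail_entry.hardware_entry_failure_reason =
                        KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED;
                run->fail_entry.cpu = smp_processor_id();
                *ret = 0;
                return true;
        }

        /* ... */
        return false;
}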

> I just realised that this is wrong for the same reason that KVM doesn't
> clear the unsupported CPU flag on vcpu_put: a vcpu_put/load that happened
> after the vcpu_load that set the flag and before preemption is disabled
> could mean that now the thread is executing on a different physical CPU
> than the physical CPU that caused the flag to be set. To make things worse,
> this CPU might even be in supported_cpus, which would be extremely
> confusing for someone trying to decipher what went wrong.
> 
> I see three solutions here:
> 
> 1. Drop setting the fail_entry.cpu field.
> 
> 2. Make vcpu_put clear the flag, which means that if the flag is set here
> then the VCPU is definitely executing on the wrong physical CPU and
> smp_processor_id() will be useful.

This looks reasonable to me.
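
i.e. something like the below (sketch only), with the clearing in
kvm_arch_vcpu_ioctl_run() dropped:

void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
{
        /* ... existing put path elided ... */

        /*
         * Mirror the check in kvm_arch_vcpu_load(), so the flag is only
         * ever observed while the vcpu is still loaded on the offending
         * physical CPU.
         */
        vcpu_clear_on_unsupported_cpu(vcpu);
}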

> 
> 3. Carry the unsupported CPU ID information in a new field in struct
> kvm_vcpu_arch.
> 
> I honestly don't have a preference. Maybe slightly towards solution number
> 2, as it makes the code symmetrical and removes the subtlety around when
> the VCPU flag is cleared. But this would be done at the expense of
> userspace possibly finding out a lot later (or never) that something went
> wrong.

I don't really get your argument about "userspace possibly finding out
a lot later...". Yes, if the vcpu gets migrated to a 'good' CPU after
a sequence of put/load, userspace will be lucky. But that's the rule
of the game. If userspace pins the vcpu to the wrong CPU type, then
the information will be consistent.

	M.

-- 
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 4/4] KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU
  2021-12-08  7:54       ` Reiji Watanabe
@ 2021-12-08 10:38         ` Alexandru Elisei
  -1 siblings, 0 replies; 52+ messages in thread
From: Alexandru Elisei @ 2021-12-08 10:38 UTC (permalink / raw)
  To: Reiji Watanabe; +Cc: maz, mingo, tglx, will, kvmarm, linux-arm-kernel

Hi Reiji,

Thank you for the review!

On Tue, Dec 07, 2021 at 11:54:51PM -0800, Reiji Watanabe wrote:
> Hi Alex,
> 
> On Tue, Dec 7, 2021 at 6:18 AM Alexandru Elisei
> <alexandru.elisei@arm.com> wrote:
> >
> > Hi,
> >
> > On Mon, Dec 06, 2021 at 05:02:23PM +0000, Alexandru Elisei wrote:
> > > Userspace can assign a PMU to a VCPU with the KVM_ARM_VCPU_PMU_V3_SET_PMU
> > > device ioctl. If the VCPU is scheduled on a physical CPU which has a
> > > different PMU, the perf events needed to emulate a guest PMU won't be
> > > scheduled in and the guest performance counters will stop counting. Treat
> > > it as an userspace error and refuse to run the VCPU in this situation.
> > >
> > > The VCPU is flagged as being scheduled on the wrong CPU in vcpu_load(), but
> > > the flag is cleared when the KVM_RUN enters the non-preemptible section
> > > instead of in vcpu_put(); this has been done on purpose so the error
> > > condition is communicated as soon as possible to userspace, otherwise
> > > vcpu_load() on the wrong CPU followed by a vcpu_put() would clear the flag.
> > >
> > > Suggested-by: Marc Zyngier <maz@kernel.org>
> > > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > > ---
> > > I agonized for hours about the best name for the VCPU flag and the
> > > accessors. If someone has a better idea, please tell me and I'll change
> > > them.
> > >
> > >  Documentation/virt/kvm/devices/vcpu.rst |  6 +++++-
> > >  arch/arm64/include/asm/kvm_host.h       | 12 ++++++++++++
> > >  arch/arm64/include/uapi/asm/kvm.h       |  3 +++
> > >  arch/arm64/kvm/arm.c                    | 19 +++++++++++++++++++
> > >  arch/arm64/kvm/pmu-emul.c               |  1 +
> > >  5 files changed, 40 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
> > > index c82be5cbc268..9ae47b7c3652 100644
> > > --- a/Documentation/virt/kvm/devices/vcpu.rst
> > > +++ b/Documentation/virt/kvm/devices/vcpu.rst
> > > @@ -128,7 +128,11 @@ systems where there are at least two CPU PMUs on the system.
> > >
> > >  Note that KVM will not make any attempts to run the VCPU on the physical CPUs
> > >  associated with the PMU specified by this attribute. This is entirely left to
> > > -userspace.
> > > +userspace. However, attempting to run the VCPU on a physical CPU not supported
> > > +by the PMU will fail and KVM_RUN will return with
> > > +exit_reason = KVM_EXIT_FAIL_ENTRY and populate the fail_entry struct by setting
> > > +hardare_entry_failure_reason field to KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED and
> > > +the cpu field to the processor id.
> > >
> > >  2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
> > >  =================================
> > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > > index 2a5f7f38006f..0c453f2e48b6 100644
> > > --- a/arch/arm64/include/asm/kvm_host.h
> > > +++ b/arch/arm64/include/asm/kvm_host.h
> > > @@ -385,6 +385,8 @@ struct kvm_vcpu_arch {
> > >               u64 last_steal;
> > >               gpa_t base;
> > >       } steal;
> > > +
> > > +     cpumask_var_t supported_cpus;
> > >  };
> > >
> > >  /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
> > > @@ -420,6 +422,7 @@ struct kvm_vcpu_arch {
> > >  #define KVM_ARM64_EXCEPT_MASK                (7 << 9) /* Target EL/MODE */
> > >  #define KVM_ARM64_DEBUG_STATE_SAVE_SPE       (1 << 12) /* Save SPE context if active  */
> > >  #define KVM_ARM64_DEBUG_STATE_SAVE_TRBE      (1 << 13) /* Save TRBE context if active  */
> > > +#define KVM_ARM64_ON_UNSUPPORTED_CPU (1 << 14) /* Physical CPU not in supported_cpus */
> > >
> > >  #define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \
> > >                                KVM_GUESTDBG_USE_SW_BP | \
> > > @@ -460,6 +463,15 @@ struct kvm_vcpu_arch {
> > >  #define vcpu_has_ptrauth(vcpu)               false
> > >  #endif
> > >
> > > +#define vcpu_on_unsupported_cpu(vcpu)                                        \
> > > +     ((vcpu)->arch.flags & KVM_ARM64_ON_UNSUPPORTED_CPU)
> > > +
> > > +#define vcpu_set_on_unsupported_cpu(vcpu)                            \
> > > +     ((vcpu)->arch.flags |= KVM_ARM64_ON_UNSUPPORTED_CPU)
> > > +
> > > +#define vcpu_clear_on_unsupported_cpu(vcpu)                          \
> > > +     ((vcpu)->arch.flags &= ~KVM_ARM64_ON_UNSUPPORTED_CPU)
> > > +
> > >  #define vcpu_gp_regs(v)              (&(v)->arch.ctxt.regs)
> > >
> > >  /*
> > > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > > index 1d0a0a2a9711..d49f714f48e6 100644
> > > --- a/arch/arm64/include/uapi/asm/kvm.h
> > > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > > @@ -414,6 +414,9 @@ struct kvm_arm_copy_mte_tags {
> > >  #define KVM_PSCI_RET_INVAL           PSCI_RET_INVALID_PARAMS
> > >  #define KVM_PSCI_RET_DENIED          PSCI_RET_DENIED
> > >
> > > +/* run->fail_entry.hardware_entry_failure_reason codes. */
> > > +#define KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED  (1ULL << 0)
> > > +
> > >  #endif
> > >
> > >  #endif /* __ARM_KVM_H__ */
> > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > > index e4727dc771bf..1124c3efdd94 100644
> > > --- a/arch/arm64/kvm/arm.c
> > > +++ b/arch/arm64/kvm/arm.c
> > > @@ -327,6 +327,10 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> > >
> > >       vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
> > >
> > > +     if (!zalloc_cpumask_var(&vcpu->arch.supported_cpus, GFP_KERNEL))
> > > +             return -ENOMEM;
> 
> It appears that vcpu->arch.supported_cpus needs to be freed
> if kvm_arch_vcpu_create() fails after it is allocated.
> (kvm_vgic_vcpu_init() or create_hyp_mappings() might fail)

I missed that, thank you for pointing it out.
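
An unwind path along these lines should do (rough, untested sketch; the
surrounding code in kvm_arch_vcpu_create() is elided/approximate):

        if (!zalloc_cpumask_var(&vcpu->arch.supported_cpus, GFP_KERNEL))
                return -ENOMEM;
        cpumask_copy(vcpu->arch.supported_cpus, cpu_possible_mask);

        /* ... */

        err = kvm_vgic_vcpu_init(vcpu);
        if (err)
                goto out_free_supported_cpus;

        err = create_hyp_mappings(vcpu, vcpu + 1, PAGE_HYP);
        if (err)
                goto out_free_supported_cpus;

        return 0;

out_free_supported_cpus:
        free_cpumask_var(vcpu->arch.supported_cpus);
        return err;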

> 
> 
> > > +     cpumask_copy(vcpu->arch.supported_cpus, cpu_possible_mask);
> > > +
> > >       /* Set up the timer */
> > >       kvm_timer_vcpu_init(vcpu);
> > >
> > > @@ -354,6 +358,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
> > >       if (vcpu->arch.has_run_once && unlikely(!irqchip_in_kernel(vcpu->kvm)))
> > >               static_branch_dec(&userspace_irqchip_in_use);
> > >
> > > +     free_cpumask_var(vcpu->arch.supported_cpus);
> > >       kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
> > >       kvm_timer_vcpu_terminate(vcpu);
> > >       kvm_pmu_vcpu_destroy(vcpu);
> > > @@ -432,6 +437,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> > >       if (vcpu_has_ptrauth(vcpu))
> > >               vcpu_ptrauth_disable(vcpu);
> > >       kvm_arch_vcpu_load_debug_state_flags(vcpu);
> > > +
> > > +     if (!cpumask_test_cpu(smp_processor_id(), vcpu->arch.supported_cpus))
> > > +             vcpu_set_on_unsupported_cpu(vcpu);
> > >  }
> > >
> > >  void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> > > @@ -822,6 +830,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> > >                */
> > >               preempt_disable();
> > >
> > > +             if (unlikely(vcpu_on_unsupported_cpu(vcpu))) {
> > > +                     vcpu_clear_on_unsupported_cpu(vcpu);
> > > +                     run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> > > +                     run->fail_entry.hardware_entry_failure_reason
> > > +                             = KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED;
> > > +                     run->fail_entry.cpu = smp_processor_id();
> >
> > I just realised that this is wrong for the same reason that KVM doesn't
> > clear the unsupported CPU flag on vcpu_put: a vcpu_put/load that happened
> > after the vcpu_load that set the flag and before preemption is disabled
> > could mean that now the thread is executing on a different physical CPU
> > than the physical CPU that caused the flag to be set. To make things worse,
> > this CPU might even be in supported_cpus, which would be extremely
> > confusing for someone trying to decipher what went wrong.
> >
> > I see three solutions here:
> >
> > 1. Drop setting the fail_entry.cpu field.
> >
> > 2. Make vcpu_put clear the flag, which means that if the flag is set here
> > then the VCPU is definitely executing on the wrong physical CPU and
> > smp_processor_id() will be useful.
> >
> > 3. Carry the unsupported CPU ID information in a new field in struct
> > kvm_vcpu_arch.
> >
> > I honestly don't have a preference. Maybe slightly towards solution number
> > 2, as it makes the code symmetrical and removes the subtlety around when
> > the VCPU flag is cleared. But this would be done at the expense of
> > userspace possibly finding out a lot later (or never) that something went
> > wrong.
> >
> > Thoughts?
> 
> IMHO, I would prefer 2, which is symmetrical and straightforward,
> out of those three options.  Unless KVM checks the thread's CPU
> affinity, userspace possibly finds that out a lot later anyway.

Agreed.

> 
> BTW, kvm_vcpu_pmu_restore_guest/kvm_vcpu_pmu_restore_host, which
> are (indirectly) called from vcpu_load/vcpu_put, seem to attempt
> to read/write pmccfiltr_el0, which is present only when FEAT_PMUv3
> is implemented, even if the current CPU does not support FEAT_PMUv3.

I think that's a different problem, independent of this patch. There are
other places where KVM touches the PMU registers based on
kvm_arm_support_pmu_v3() instead of checking that the CPU has a PMU
(__activate_traps_common() comes to mind). As far as I can tell, this
unusual configuration works with perf because perf calls
pmu->filter_match() before scheduling in an event, although I haven't heard
of such a SoC existing (does not mean it doesn't exist!).
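
(For reference, the gatekeeping I'm relying on is the per-PMU filter in
drivers/perf/arm_pmu.c; paraphrased from memory, so double-check the exact
code:)

static int armpmu_filter_match(struct perf_event *event)
{
        struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
        unsigned int cpu = smp_processor_id();
        int ret;

        /* Refuse to schedule the event on CPUs the PMU doesn't cover. */
        ret = cpumask_test_cpu(cpu, &armpmu->supported_cpus);
        if (ret && armpmu->filter_match)
                return armpmu->filter_match(event);

        return ret;
}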

Thanks,
Alex

> 
> Thanks,
> Reiji
> 
> 
> >
> > > +                     ret = 0;
> > > +                     preempt_enable();
> > > +                     break;
> > > +             }
> > > +
> > >               kvm_pmu_flush_hwstate(vcpu);
> > >
> > >               local_irq_disable();
> > > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> > > index 618138c5f792..471fe0f734ed 100644
> > > --- a/arch/arm64/kvm/pmu-emul.c
> > > +++ b/arch/arm64/kvm/pmu-emul.c
> > > @@ -954,6 +954,7 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
> > >               arm_pmu = entry->arm_pmu;
> > >               if (arm_pmu->pmu.type == pmu_id) {
> > >                       kvm_pmu->arm_pmu = arm_pmu;
> > > +                     cpumask_copy(vcpu->arch.supported_cpus, &arm_pmu->supported_cpus);
> > >                       ret = 0;
> > >                       goto out_unlock;
> > >               }
> > > --
> > > 2.34.1
> > >
> > > _______________________________________________
> > > kvmarm mailing list
> > > kvmarm@lists.cs.columbia.edu
> > > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
> > _______________________________________________
> > kvmarm mailing list
> > kvmarm@lists.cs.columbia.edu
> > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 4/4] KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU
  2021-12-08  9:56       ` Marc Zyngier
@ 2021-12-08 11:18         ` Alexandru Elisei
  -1 siblings, 0 replies; 52+ messages in thread
From: Alexandru Elisei @ 2021-12-08 11:18 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: mingo, tglx, will, kvmarm, linux-arm-kernel

Hi Marc,

On Wed, Dec 08, 2021 at 09:56:20AM +0000, Marc Zyngier wrote:
> On Tue, 07 Dec 2021 14:17:56 +0000,
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > 
> > Hi,
> > 
> > On Mon, Dec 06, 2021 at 05:02:23PM +0000, Alexandru Elisei wrote:
> > > Userspace can assign a PMU to a VCPU with the KVM_ARM_VCPU_PMU_V3_SET_PMU
> > > device ioctl. If the VCPU is scheduled on a physical CPU which has a
> > > different PMU, the perf events needed to emulate a guest PMU won't be
> > > scheduled in and the guest performance counters will stop counting. Treat
> > > it as an userspace error and refuse to run the VCPU in this situation.
> > > 
> > > The VCPU is flagged as being scheduled on the wrong CPU in vcpu_load(), but
> > > the flag is cleared when the KVM_RUN enters the non-preemptible section
> > > instead of in vcpu_put(); this has been done on purpose so the error
> > > condition is communicated as soon as possible to userspace, otherwise
> > > vcpu_load() on the wrong CPU followed by a vcpu_put() would clear the flag.
> > > 
> > > Suggested-by: Marc Zyngier <maz@kernel.org>
> > > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > > ---
> > > I agonized for hours about the best name for the VCPU flag and the
> > > accessors. If someone has a better idea, please tell me and I'll change
> > > them.
> > > 
> > >  Documentation/virt/kvm/devices/vcpu.rst |  6 +++++-
> > >  arch/arm64/include/asm/kvm_host.h       | 12 ++++++++++++
> > >  arch/arm64/include/uapi/asm/kvm.h       |  3 +++
> > >  arch/arm64/kvm/arm.c                    | 19 +++++++++++++++++++
> > >  arch/arm64/kvm/pmu-emul.c               |  1 +
> > >  5 files changed, 40 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
> > > index c82be5cbc268..9ae47b7c3652 100644
> > > --- a/Documentation/virt/kvm/devices/vcpu.rst
> > > +++ b/Documentation/virt/kvm/devices/vcpu.rst
> > > @@ -128,7 +128,11 @@ systems where there are at least two CPU PMUs on the system.
> > >  
> > >  Note that KVM will not make any attempts to run the VCPU on the physical CPUs
> > >  associated with the PMU specified by this attribute. This is entirely left to
> > > -userspace.
> > > +userspace. However, attempting to run the VCPU on a physical CPU not supported
> > > +by the PMU will fail and KVM_RUN will return with
> > > +exit_reason = KVM_EXIT_FAIL_ENTRY and populate the fail_entry struct by setting
> > > +hardare_entry_failure_reason field to KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED and
> > > +the cpu field to the processor id.
> > >  
> > >  2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
> > >  =================================
> > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > > index 2a5f7f38006f..0c453f2e48b6 100644
> > > --- a/arch/arm64/include/asm/kvm_host.h
> > > +++ b/arch/arm64/include/asm/kvm_host.h
> > > @@ -385,6 +385,8 @@ struct kvm_vcpu_arch {
> > >  		u64 last_steal;
> > >  		gpa_t base;
> > >  	} steal;
> > > +
> > > +	cpumask_var_t supported_cpus;
> > >  };
> > >  
> > >  /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
> > > @@ -420,6 +422,7 @@ struct kvm_vcpu_arch {
> > >  #define KVM_ARM64_EXCEPT_MASK		(7 << 9) /* Target EL/MODE */
> > >  #define KVM_ARM64_DEBUG_STATE_SAVE_SPE	(1 << 12) /* Save SPE context if active  */
> > >  #define KVM_ARM64_DEBUG_STATE_SAVE_TRBE	(1 << 13) /* Save TRBE context if active  */
> > > +#define KVM_ARM64_ON_UNSUPPORTED_CPU	(1 << 14) /* Physical CPU not in supported_cpus */
> > >  
> > >  #define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \
> > >  				 KVM_GUESTDBG_USE_SW_BP | \
> > > @@ -460,6 +463,15 @@ struct kvm_vcpu_arch {
> > >  #define vcpu_has_ptrauth(vcpu)		false
> > >  #endif
> > >  
> > > +#define vcpu_on_unsupported_cpu(vcpu)					\
> > > +	((vcpu)->arch.flags & KVM_ARM64_ON_UNSUPPORTED_CPU)
> > > +
> > > +#define vcpu_set_on_unsupported_cpu(vcpu)				\
> > > +	((vcpu)->arch.flags |= KVM_ARM64_ON_UNSUPPORTED_CPU)
> > > +
> > > +#define vcpu_clear_on_unsupported_cpu(vcpu)				\
> > > +	((vcpu)->arch.flags &= ~KVM_ARM64_ON_UNSUPPORTED_CPU)
> > > +
> > >  #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.regs)
> > >  
> > >  /*
> > > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > > index 1d0a0a2a9711..d49f714f48e6 100644
> > > --- a/arch/arm64/include/uapi/asm/kvm.h
> > > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > > @@ -414,6 +414,9 @@ struct kvm_arm_copy_mte_tags {
> > >  #define KVM_PSCI_RET_INVAL		PSCI_RET_INVALID_PARAMS
> > >  #define KVM_PSCI_RET_DENIED		PSCI_RET_DENIED
> > >  
> > > +/* run->fail_entry.hardware_entry_failure_reason codes. */
> > > +#define KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED	(1ULL << 0)
> > > +
> > >  #endif
> > >  
> > >  #endif /* __ARM_KVM_H__ */
> > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > > index e4727dc771bf..1124c3efdd94 100644
> > > --- a/arch/arm64/kvm/arm.c
> > > +++ b/arch/arm64/kvm/arm.c
> > > @@ -327,6 +327,10 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> > >  
> > >  	vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
> > >  
> > > +	if (!zalloc_cpumask_var(&vcpu->arch.supported_cpus, GFP_KERNEL))
> > > +		return -ENOMEM;
> > > +	cpumask_copy(vcpu->arch.supported_cpus, cpu_possible_mask);
> 
> Nit: can we just assign the cpu_possible_mask pointer instead, and
> only perform the allocation when assigning a specific PMU?
> 
> > > +
> > >  	/* Set up the timer */
> > >  	kvm_timer_vcpu_init(vcpu);
> > >  
> > > @@ -354,6 +358,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
> > >  	if (vcpu->arch.has_run_once && unlikely(!irqchip_in_kernel(vcpu->kvm)))
> > >  		static_branch_dec(&userspace_irqchip_in_use);
> > >  
> > > +	free_cpumask_var(vcpu->arch.supported_cpus);
> > >  	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
> > >  	kvm_timer_vcpu_terminate(vcpu);
> > >  	kvm_pmu_vcpu_destroy(vcpu);
> > > @@ -432,6 +437,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> > >  	if (vcpu_has_ptrauth(vcpu))
> > >  		vcpu_ptrauth_disable(vcpu);
> > >  	kvm_arch_vcpu_load_debug_state_flags(vcpu);
> > > +
> > > +	if (!cpumask_test_cpu(smp_processor_id(), vcpu->arch.supported_cpus))
> > > +		vcpu_set_on_unsupported_cpu(vcpu);
> > >  }
> > >  
> > >  void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> > > @@ -822,6 +830,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> > >  		 */
> > >  		preempt_disable();
> > >  
> > > +		if (unlikely(vcpu_on_unsupported_cpu(vcpu))) {
> > > +			vcpu_clear_on_unsupported_cpu(vcpu);
> > > +			run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> > > +			run->fail_entry.hardware_entry_failure_reason
> > > +				= KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED;
> > > +			run->fail_entry.cpu = smp_processor_id();
> >
> 
> Can you move this hunk to kvm_vcpu_exit_request()? It certainly would
> fit better there, as we have checks for other exit reasons to
> userspace.

That's a great idea, I'll move it there.

> 
> > I just realised that this is wrong for the same reason that KVM doesn't
> > clear the unsupported CPU flag on vcpu_put: a vcpu_put/load that happened
> > after the vcpu_load that set the flag and before preemption is disabled
> > could mean that now the thread is executing on a different physical CPU
> > than the physical CPU that caused the flag to be set. To make things worse,
> > this CPU might even be in supported_cpus, which would be extremely
> > confusing for someone trying to decipher what went wrong.
> > 
> > I see three solutions here:
> > 
> > 1. Drop setting the fail_entry.cpu field.
> > 
> > 2. Make vcpu_put clear the flag, which means that if the flag is set here
> > then the VCPU is definitely executing on the wrong physical CPU and
> > smp_processor_id() will be useful.
> 
> This looks reasonable to me.

Yep, already answered to Reiji, I'm going to take this approach.

> 
> > 
> > 3. Carry the unsupported CPU ID information in a new field in struct
> > kvm_vcpu_arch.
> > 
> > I honestly don't have a preference. Maybe slightly towards solution number
> > 2, as it makes the code symmetrical and removes the subtlety around when
> > the VCPU flag is cleared. But this would be done at the expense of
> > userspace possibly finding out a lot later (or never) that something went
> > wrong.
> 
> I don't really get your argument about "userspace possibly finding out
> a lot later...". Yes, if the vcpu gets migrated to a 'good' CPU after
> a sequence of put/load, userspace will be lucky. But that's the rule
> of the game. If userspace pins the vcpu to the wrong CPU type, then
> the information will be consistent.

Yes, I agree.

Thanks,
Alex

> 
> 	M.
> 
> -- 
> Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 4/4] KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU
@ 2021-12-08 11:18         ` Alexandru Elisei
  0 siblings, 0 replies; 52+ messages in thread
From: Alexandru Elisei @ 2021-12-08 11:18 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm, tglx, mingo

Hi Marc,

On Wed, Dec 08, 2021 at 09:56:20AM +0000, Marc Zyngier wrote:
> On Tue, 07 Dec 2021 14:17:56 +0000,
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > 
> > Hi,
> > 
> > On Mon, Dec 06, 2021 at 05:02:23PM +0000, Alexandru Elisei wrote:
> > > Userspace can assign a PMU to a VCPU with the KVM_ARM_VCPU_PMU_V3_SET_PMU
> > > device ioctl. If the VCPU is scheduled on a physical CPU which has a
> > > different PMU, the perf events needed to emulate a guest PMU won't be
> > > scheduled in and the guest performance counters will stop counting. Treat
> > > it as an userspace error and refuse to run the VCPU in this situation.
> > > 
> > > The VCPU is flagged as being scheduled on the wrong CPU in vcpu_load(), but
> > > the flag is cleared when the KVM_RUN enters the non-preemptible section
> > > instead of in vcpu_put(); this has been done on purpose so the error
> > > condition is communicated as soon as possible to userspace, otherwise
> > > vcpu_load() on the wrong CPU followed by a vcpu_put() would clear the flag.
> > > 
> > > Suggested-by: Marc Zyngier <maz@kernel.org>
> > > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > > ---
> > > I agonized for hours about the best name for the VCPU flag and the
> > > accessors. If someone has a better idea, please tell me and I'll change
> > > them.
> > > 
> > >  Documentation/virt/kvm/devices/vcpu.rst |  6 +++++-
> > >  arch/arm64/include/asm/kvm_host.h       | 12 ++++++++++++
> > >  arch/arm64/include/uapi/asm/kvm.h       |  3 +++
> > >  arch/arm64/kvm/arm.c                    | 19 +++++++++++++++++++
> > >  arch/arm64/kvm/pmu-emul.c               |  1 +
> > >  5 files changed, 40 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
> > > index c82be5cbc268..9ae47b7c3652 100644
> > > --- a/Documentation/virt/kvm/devices/vcpu.rst
> > > +++ b/Documentation/virt/kvm/devices/vcpu.rst
> > > @@ -128,7 +128,11 @@ systems where there are at least two CPU PMUs on the system.
> > >  
> > >  Note that KVM will not make any attempts to run the VCPU on the physical CPUs
> > >  associated with the PMU specified by this attribute. This is entirely left to
> > > -userspace.
> > > +userspace. However, attempting to run the VCPU on a physical CPU not supported
> > > +by the PMU will fail and KVM_RUN will return with
> > > +exit_reason = KVM_EXIT_FAIL_ENTRY and populate the fail_entry struct by setting
> > > +hardare_entry_failure_reason field to KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED and
> > > +the cpu field to the processor id.
> > >  
> > >  2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
> > >  =================================
> > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > > index 2a5f7f38006f..0c453f2e48b6 100644
> > > --- a/arch/arm64/include/asm/kvm_host.h
> > > +++ b/arch/arm64/include/asm/kvm_host.h
> > > @@ -385,6 +385,8 @@ struct kvm_vcpu_arch {
> > >  		u64 last_steal;
> > >  		gpa_t base;
> > >  	} steal;
> > > +
> > > +	cpumask_var_t supported_cpus;
> > >  };
> > >  
> > >  /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
> > > @@ -420,6 +422,7 @@ struct kvm_vcpu_arch {
> > >  #define KVM_ARM64_EXCEPT_MASK		(7 << 9) /* Target EL/MODE */
> > >  #define KVM_ARM64_DEBUG_STATE_SAVE_SPE	(1 << 12) /* Save SPE context if active  */
> > >  #define KVM_ARM64_DEBUG_STATE_SAVE_TRBE	(1 << 13) /* Save TRBE context if active  */
> > > +#define KVM_ARM64_ON_UNSUPPORTED_CPU	(1 << 14) /* Physical CPU not in supported_cpus */
> > >  
> > >  #define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \
> > >  				 KVM_GUESTDBG_USE_SW_BP | \
> > > @@ -460,6 +463,15 @@ struct kvm_vcpu_arch {
> > >  #define vcpu_has_ptrauth(vcpu)		false
> > >  #endif
> > >  
> > > +#define vcpu_on_unsupported_cpu(vcpu)					\
> > > +	((vcpu)->arch.flags & KVM_ARM64_ON_UNSUPPORTED_CPU)
> > > +
> > > +#define vcpu_set_on_unsupported_cpu(vcpu)				\
> > > +	((vcpu)->arch.flags |= KVM_ARM64_ON_UNSUPPORTED_CPU)
> > > +
> > > +#define vcpu_clear_on_unsupported_cpu(vcpu)				\
> > > +	((vcpu)->arch.flags &= ~KVM_ARM64_ON_UNSUPPORTED_CPU)
> > > +
> > >  #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.regs)
> > >  
> > >  /*
> > > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > > index 1d0a0a2a9711..d49f714f48e6 100644
> > > --- a/arch/arm64/include/uapi/asm/kvm.h
> > > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > > @@ -414,6 +414,9 @@ struct kvm_arm_copy_mte_tags {
> > >  #define KVM_PSCI_RET_INVAL		PSCI_RET_INVALID_PARAMS
> > >  #define KVM_PSCI_RET_DENIED		PSCI_RET_DENIED
> > >  
> > > +/* run->fail_entry.hardware_entry_failure_reason codes. */
> > > +#define KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED	(1ULL << 0)
> > > +
> > >  #endif
> > >  
> > >  #endif /* __ARM_KVM_H__ */
> > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > > index e4727dc771bf..1124c3efdd94 100644
> > > --- a/arch/arm64/kvm/arm.c
> > > +++ b/arch/arm64/kvm/arm.c
> > > @@ -327,6 +327,10 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> > >  
> > >  	vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
> > >  
> > > +	if (!zalloc_cpumask_var(&vcpu->arch.supported_cpus, GFP_KERNEL))
> > > +		return -ENOMEM;
> > > +	cpumask_copy(vcpu->arch.supported_cpus, cpu_possible_mask);
> 
> Nit: can we just assign the cpu_possible_mask pointer instead, and
> only perform the allocation when assigning a specific PMU?
> 
> > > +
> > >  	/* Set up the timer */
> > >  	kvm_timer_vcpu_init(vcpu);
> > >  
> > > @@ -354,6 +358,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
> > >  	if (vcpu->arch.has_run_once && unlikely(!irqchip_in_kernel(vcpu->kvm)))
> > >  		static_branch_dec(&userspace_irqchip_in_use);
> > >  
> > > +	free_cpumask_var(vcpu->arch.supported_cpus);
> > >  	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
> > >  	kvm_timer_vcpu_terminate(vcpu);
> > >  	kvm_pmu_vcpu_destroy(vcpu);
> > > @@ -432,6 +437,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> > >  	if (vcpu_has_ptrauth(vcpu))
> > >  		vcpu_ptrauth_disable(vcpu);
> > >  	kvm_arch_vcpu_load_debug_state_flags(vcpu);
> > > +
> > > +	if (!cpumask_test_cpu(smp_processor_id(), vcpu->arch.supported_cpus))
> > > +		vcpu_set_on_unsupported_cpu(vcpu);
> > >  }
> > >  
> > >  void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> > > @@ -822,6 +830,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> > >  		 */
> > >  		preempt_disable();
> > >  
> > > +		if (unlikely(vcpu_on_unsupported_cpu(vcpu))) {
> > > +			vcpu_clear_on_unsupported_cpu(vcpu);
> > > +			run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> > > +			run->fail_entry.hardware_entry_failure_reason
> > > +				= KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED;
> > > +			run->fail_entry.cpu = smp_processor_id();
> >
> 
> Can you move this hunk to kvm_vcpu_exit_request()? It certainly would
> fit better there, as we have checks for other exit reasons to
> userspace.

That's a great idea, I'll move it there.
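
For reference, a rough sketch of that move, assuming kvm_vcpu_exit_request()
keeps its current bool return with the int *ret out-parameter (the exact
shape may end up different in v3):

	static bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu, int *ret)
	{
		struct kvm_run *run = vcpu->run;

		if (unlikely(vcpu_on_unsupported_cpu(vcpu))) {
			/* Report the offending physical CPU to userspace */
			run->exit_reason = KVM_EXIT_FAIL_ENTRY;
			run->fail_entry.hardware_entry_failure_reason =
				KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED;
			run->fail_entry.cpu = smp_processor_id();
			*ret = 0;
			return true;
		}

		/* ... the existing signal/request checks, unchanged ... */
	}

That check should also keep smp_processor_id() meaningful, since it is
reached after preempt_disable() in the current run loop.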

> 
> > I just realised that this is wrong for the same reason that KVM doesn't
> > clear the unsupported CPU flag on vcpu_put: a vcpu_put/load that happened
> > after the vcpu_load that set the flag and before preemption is disabled
> > could mean that now the thread is executing on a different physical CPU
> > than the physical CPU that caused the flag to be set. To make things worse,
> > this CPU might even be in supported_cpus, which would be extremely
> > confusing for someone trying to decipher what went wrong.
> > 
> > I see three solutions here:
> > 
> > 1. Drop setting the fail_entry.cpu field.
> > 
> > 2. Make vcpu_put clear the flag, which means that if the flag is set here
> > then the VCPU is definitely executing on the wrong physical CPU and
> > smp_processor_id() will be useful.
> 
> This looks reasonable to me.

Yep, I already answered Reiji; I'm going to take this approach.
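
In other words, something like the below (a minimal sketch, reusing the flag
helpers this series introduces):

	void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
	{
		/* ... existing put path unchanged ... */

		/* Only the most recent vcpu_load() can leave the flag set */
		vcpu_clear_on_unsupported_cpu(vcpu);
	}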

> 
> > 
> > 3. Carry the unsupported CPU ID information in a new field in struct
> > kvm_vcpu_arch.
> > 
> > I honestly don't have a preference. Maybe slightly towards solution number
> > 2, as it makes the code symmetrical and removes the subtlety around when
> > the VCPU flag is cleared. But this would be done at the expense of
> > userspace possibly finding out a lot later (or never) that something went
> > wrong.
> 
> I don't really get your argument about "userspace possibly finding out
> a lot later...". Yes, if the vcpu gets migrated to a 'good' CPU after
> a sequence of put/load, userspace will be lucky. But that's the rule
> of the game. If userspace pins the vcpu to the wrong CPU type, then
> the information will be consistent.

Yes, I agree.
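
For completeness, on the userspace side this would surface roughly as below.
The field names are the ones from this patch; the rest is just a sketch of a
VMM's exit handling:

	/* back from KVM_RUN: check why the vcpu stopped */
	if (run->exit_reason == KVM_EXIT_FAIL_ENTRY &&
	    (run->fail_entry.hardware_entry_failure_reason &
	     KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED))
		fprintf(stderr, "vcpu ran on unsupported physical CPU %u\n",
			run->fail_entry.cpu);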

Thanks,
Alex

> 
> 	M.
> 
> -- 
> Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
  2021-12-08  3:13     ` Reiji Watanabe
@ 2021-12-08 12:23       ` Alexandru Elisei
  -1 siblings, 0 replies; 52+ messages in thread
From: Alexandru Elisei @ 2021-12-08 12:23 UTC (permalink / raw)
  To: Reiji Watanabe; +Cc: maz, mingo, tglx, will, kvmarm, linux-arm-kernel

Hi Reiji,

On Tue, Dec 07, 2021 at 07:13:17PM -0800, Reiji Watanabe wrote:
> Hi Alex,
> 
> On Mon, Dec 6, 2021 at 9:02 AM Alexandru Elisei
> <alexandru.elisei@arm.com> wrote:
> >
> > When KVM creates an event and there is more than one PMU present on the
> > system, perf_init_event() will go through the list of available PMUs and
> > will choose the first one that can create the event. The order of the PMUs
> > in the PMU list depends on the probe order, which can change under various
> > circumstances, for example if the order of the PMU nodes change in the DTB
> > or if asynchronous driver probing is enabled on the kernel command line
> > (with the driver_async_probe=armv8-pmu option).
> >
> > Another consequence of this approach is that, on heterogeneous systems,
> > all virtual machines that KVM creates will use the same PMU. This might
> > cause unexpected behaviour for userspace: when a VCPU is executing on
> > the physical CPU that uses this PMU, PMU events in the guest work
> > correctly; but when the same VCPU executes on another CPU, PMU events in
> > the guest will suddenly stop counting.
> >
> > Fortunately, perf core allows the user to specify on which PMU to create an
> > event by using the perf_event_attr->type field, which is used by
> > perf_init_event() as an index in the radix tree of available PMUs.
> >
> > Add the KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU
> > attribute to allow userspace to specify the arm_pmu that KVM will use when
> > creating events for that VCPU. KVM will make no attempt to run the VCPU on
> > the physical CPUs that share this PMU, leaving it up to userspace to
> > manage the VCPU threads' affinity accordingly.
> >
> > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > ---
> >  Documentation/virt/kvm/devices/vcpu.rst | 25 +++++++++++++++++
> >  arch/arm64/include/uapi/asm/kvm.h       |  1 +
> >  arch/arm64/kvm/pmu-emul.c               | 37 +++++++++++++++++++++++--
> >  include/kvm/arm_pmu.h                   |  1 +
> >  tools/arch/arm64/include/uapi/asm/kvm.h |  1 +
> >  5 files changed, 63 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
> > index 60a29972d3f1..c82be5cbc268 100644
> > --- a/Documentation/virt/kvm/devices/vcpu.rst
> > +++ b/Documentation/virt/kvm/devices/vcpu.rst
> > @@ -104,6 +104,31 @@ hardware event. Filtering event 0x1E (CHAIN) has no effect either, as it
> >  isn't strictly speaking an event. Filtering the cycle counter is possible
> >  using event 0x11 (CPU_CYCLES).
> >
> > +1.4 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_SET_PMU
> > +------------------------------------------
> > +
> > +:Parameters: in kvm_device_attr.addr the address to an int representing the PMU
> > +             identifier.
> > +
> > +:Returns:
> > +
> > +        =======  ===============================================
> > +        -EBUSY   PMUv3 already initialized
> > +        -EFAULT  Error accessing the PMU identifier
> > +        -ENXIO   PMU not found
> > +        -ENODEV  PMUv3 not supported or GIC not initialized
> > +        -ENOMEM  Could not allocate memory
> > +        =======  ===============================================
> > +
> > +Request that the VCPU uses the specified hardware PMU when creating guest events
> > +for the purpose of PMU emulation. The PMU identifier can be read from the "type"
> > +file for the desired PMU instance under /sys/devices (or, equivalently,
> > +/sys/bus/event_source). This attribute is particularly useful on heterogeneous
> > +systems where there are at least two CPU PMUs on the system.
> > +
> > +Note that KVM will not make any attempts to run the VCPU on the physical CPUs
> > +associated with the PMU specified by this attribute. This is entirely left to
> > +userspace.
> >
> >  2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
> >  =================================
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > index b3edde68bc3e..1d0a0a2a9711 100644
> > --- a/arch/arm64/include/uapi/asm/kvm.h
> > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > @@ -362,6 +362,7 @@ struct kvm_arm_copy_mte_tags {
> >  #define   KVM_ARM_VCPU_PMU_V3_IRQ      0
> >  #define   KVM_ARM_VCPU_PMU_V3_INIT     1
> >  #define   KVM_ARM_VCPU_PMU_V3_FILTER   2
> > +#define   KVM_ARM_VCPU_PMU_V3_SET_PMU  3
> >  #define KVM_ARM_VCPU_TIMER_CTRL                1
> >  #define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER                0
> >  #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER                1
> > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> > index eaaad4c06561..618138c5f792 100644
> > --- a/arch/arm64/kvm/pmu-emul.c
> > +++ b/arch/arm64/kvm/pmu-emul.c
> > @@ -603,6 +603,7 @@ static bool kvm_pmu_counter_is_enabled(struct kvm_vcpu *vcpu, u64 select_idx)
> >  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
> >  {
> >         struct kvm_pmu *pmu = &vcpu->arch.pmu;
> > +       struct arm_pmu *arm_pmu = pmu->arm_pmu;
> >         struct kvm_pmc *pmc;
> >         struct perf_event *event;
> >         struct perf_event_attr attr;
> > @@ -638,8 +639,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
> >                 return;
> >
> >         memset(&attr, 0, sizeof(struct perf_event_attr));
> > -       attr.type = PERF_TYPE_RAW;
> > -       attr.size = sizeof(attr);
> > +       attr.type = arm_pmu ? arm_pmu->pmu.type : PERF_TYPE_RAW;
> >         attr.pinned = 1;
> >         attr.disabled = !kvm_pmu_counter_is_enabled(vcpu, pmc->idx);
> >         attr.exclude_user = data & ARMV8_PMU_EXCLUDE_EL0 ? 1 : 0;
> > @@ -941,6 +941,29 @@ static bool pmu_irq_is_valid(struct kvm *kvm, int irq)
> >         return true;
> >  }
> >
> > +static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
> > +{
> > +       struct kvm_pmu *kvm_pmu = &vcpu->arch.pmu;
> > +       struct arm_pmu_entry *entry;
> > +       struct arm_pmu *arm_pmu;
> > +       int ret = -ENXIO;
> > +
> > +       mutex_lock(&arm_pmus_lock);
> > +
> > +       list_for_each_entry(entry, &arm_pmus, entry) {
> > +               arm_pmu = entry->arm_pmu;
> > +               if (arm_pmu->pmu.type == pmu_id) {
> > +                       kvm_pmu->arm_pmu = arm_pmu;
> 
> Shouldn't kvm->arch.pmuver be updated based on the pmu that
> is used for the guest ?

As far as I can tell, kvm->arch.pmuver is used in kvm_pmu_event_mask() to
get the number of available perf events, which is then used for configuring
events (via the PMEVTYPER<n>_EL0 registers) or for masking out events when the
guest reads PMCEID{0,1}_EL0; the events that are masked out are the events
that are unsupported by the PMU that perf will choose for creating events.

This series doesn't forbid userspace from setting the PMU for only a subset
of VCPUs, leaving the other VCPUs with the default PMU, so setting
kvm->arch.pmuver to a particular VCPU's PMU is not correct.

I think the correct fix here would be to have kvm_pmu_event_mask() use the
VCPU's PMU PMUVer, and fall back to kvm->arch.pmuver if that isn't set.
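
Something along these lines, as a sketch only (it assumes the arm_pmu's
pmuver field is the right thing to key off, and __kvm_pmu_event_mask() below
is just a stand-in for today's switch statement refactored to take the
version as a parameter):

	static u32 kvm_pmu_event_mask(struct kvm_vcpu *vcpu)
	{
		struct arm_pmu *arm_pmu = vcpu->arch.pmu.arm_pmu;
		unsigned int pmuver = vcpu->kvm->arch.pmuver;

		if (arm_pmu && arm_pmu->pmuver)
			pmuver = arm_pmu->pmuver;

		return __kvm_pmu_event_mask(pmuver);
	}

with the callers switched from passing the struct kvm to passing the vcpu.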

This makes me wonder. Should KVM enforce having userspace either not
setting the PMU for any VCPU, or setting it for all VCPUs? I think this
would be a good idea and will reduce complexity in the long run. I also
don't see a use case for userspace choosing to set the PMU for a subset of
VCPUs, leaving the other VCPUs with the default behaviour.

Thanks,
Alex

> 
> Thanks,
> Reiji
> 
> 
> > +                       ret = 0;
> > +                       goto out_unlock;
> > +               }
> > +       }
> > +
> > +out_unlock:
> > +       mutex_unlock(&arm_pmus_lock);
> > +       return ret;
> > +}
> > +
> >  int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> >  {
> >         if (!kvm_vcpu_has_pmu(vcpu))
> > @@ -1027,6 +1050,15 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> >
> >                 return 0;
> >         }
> > +       case KVM_ARM_VCPU_PMU_V3_SET_PMU: {
> > +               int __user *uaddr = (int __user *)(long)attr->addr;
> > +               int pmu_id;
> > +
> > +               if (get_user(pmu_id, uaddr))
> > +                       return -EFAULT;
> > +
> > +               return kvm_arm_pmu_v3_set_pmu(vcpu, pmu_id);
> > +       }
> >         case KVM_ARM_VCPU_PMU_V3_INIT:
> >                 return kvm_arm_pmu_v3_init(vcpu);
> >         }
> > @@ -1064,6 +1096,7 @@ int kvm_arm_pmu_v3_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> >         case KVM_ARM_VCPU_PMU_V3_IRQ:
> >         case KVM_ARM_VCPU_PMU_V3_INIT:
> >         case KVM_ARM_VCPU_PMU_V3_FILTER:
> > +       case KVM_ARM_VCPU_PMU_V3_SET_PMU:
> >                 if (kvm_vcpu_has_pmu(vcpu))
> >                         return 0;
> >         }
> > diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
> > index e249c5f172aa..ab3046a8f9bb 100644
> > --- a/include/kvm/arm_pmu.h
> > +++ b/include/kvm/arm_pmu.h
> > @@ -34,6 +34,7 @@ struct kvm_pmu {
> >         bool created;
> >         bool irq_level;
> >         struct irq_work overflow_work;
> > +       struct arm_pmu *arm_pmu;
> >  };
> >
> >  struct arm_pmu_entry {
> > diff --git a/tools/arch/arm64/include/uapi/asm/kvm.h b/tools/arch/arm64/include/uapi/asm/kvm.h
> > index b3edde68bc3e..1d0a0a2a9711 100644
> > --- a/tools/arch/arm64/include/uapi/asm/kvm.h
> > +++ b/tools/arch/arm64/include/uapi/asm/kvm.h
> > @@ -362,6 +362,7 @@ struct kvm_arm_copy_mte_tags {
> >  #define   KVM_ARM_VCPU_PMU_V3_IRQ      0
> >  #define   KVM_ARM_VCPU_PMU_V3_INIT     1
> >  #define   KVM_ARM_VCPU_PMU_V3_FILTER   2
> > +#define   KVM_ARM_VCPU_PMU_V3_SET_PMU  3
> >  #define KVM_ARM_VCPU_TIMER_CTRL                1
> >  #define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER                0
> >  #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER                1
> > --
> > 2.34.1
> >
> > _______________________________________________
> > kvmarm mailing list
> > kvmarm@lists.cs.columbia.edu
> > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
  2021-12-08 12:23       ` Alexandru Elisei
@ 2021-12-08 12:43         ` Alexandru Elisei
  -1 siblings, 0 replies; 52+ messages in thread
From: Alexandru Elisei @ 2021-12-08 12:43 UTC (permalink / raw)
  To: Reiji Watanabe; +Cc: maz, mingo, tglx, will, kvmarm, linux-arm-kernel

Hi,

On Wed, Dec 08, 2021 at 12:23:44PM +0000, Alexandru Elisei wrote:
> Hi Reiji,
> 
> On Tue, Dec 07, 2021 at 07:13:17PM -0800, Reiji Watanabe wrote:
> > Hi Alex,
> > 
> > On Mon, Dec 6, 2021 at 9:02 AM Alexandru Elisei
> > <alexandru.elisei@arm.com> wrote:
> > >
> > > When KVM creates an event and there is more than one PMU present on the
> > > system, perf_init_event() will go through the list of available PMUs and
> > > will choose the first one that can create the event. The order of the PMUs
> > > in the PMU list depends on the probe order, which can change under various
> > > circumstances, for example if the order of the PMU nodes change in the DTB
> > > or if asynchronous driver probing is enabled on the kernel command line
> > > (with the driver_async_probe=armv8-pmu option).
> > >
> > > Another consequence of this approach is that, on heterogeneous systems,
> > > all virtual machines that KVM creates will use the same PMU. This might
> > > cause unexpected behaviour for userspace: when a VCPU is executing on
> > > the physical CPU that uses this PMU, PMU events in the guest work
> > > correctly; but when the same VCPU executes on another CPU, PMU events in
> > > the guest will suddenly stop counting.
> > >
> > > Fortunately, perf core allows the user to specify on which PMU to create an
> > > event by using the perf_event_attr->type field, which is used by
> > > perf_init_event() as an index in the radix tree of available PMUs.
> > >
> > > Add the KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU
> > > attribute to allow userspace to specify the arm_pmu that KVM will use when
> > > creating events for that VCPU. KVM will make no attempt to run the VCPU on
> > > the physical CPUs that share this PMU, leaving it up to userspace to
> > > manage the VCPU threads' affinity accordingly.
> > >
> > > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > > ---
> > >  Documentation/virt/kvm/devices/vcpu.rst | 25 +++++++++++++++++
> > >  arch/arm64/include/uapi/asm/kvm.h       |  1 +
> > >  arch/arm64/kvm/pmu-emul.c               | 37 +++++++++++++++++++++++--
> > >  include/kvm/arm_pmu.h                   |  1 +
> > >  tools/arch/arm64/include/uapi/asm/kvm.h |  1 +
> > >  5 files changed, 63 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
> > > index 60a29972d3f1..c82be5cbc268 100644
> > > --- a/Documentation/virt/kvm/devices/vcpu.rst
> > > +++ b/Documentation/virt/kvm/devices/vcpu.rst
> > > @@ -104,6 +104,31 @@ hardware event. Filtering event 0x1E (CHAIN) has no effect either, as it
> > >  isn't strictly speaking an event. Filtering the cycle counter is possible
> > >  using event 0x11 (CPU_CYCLES).
> > >
> > > +1.4 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_SET_PMU
> > > +------------------------------------------
> > > +
> > > +:Parameters: in kvm_device_attr.addr the address to an int representing the PMU
> > > +             identifier.
> > > +
> > > +:Returns:
> > > +
> > > +        =======  ===============================================
> > > +        -EBUSY   PMUv3 already initialized
> > > +        -EFAULT  Error accessing the PMU identifier
> > > +        -ENXIO   PMU not found
> > > +        -ENODEV  PMUv3 not supported or GIC not initialized
> > > +        -ENOMEM  Could not allocate memory
> > > +        =======  ===============================================
> > > +
> > > +Request that the VCPU uses the specified hardware PMU when creating guest events
> > > +for the purpose of PMU emulation. The PMU identifier can be read from the "type"
> > > +file for the desired PMU instance under /sys/devices (or, equivalently,
> > > +/sys/bus/event_source). This attribute is particularly useful on heterogeneous
> > > +systems where there are at least two CPU PMUs on the system.
> > > +
> > > +Note that KVM will not make any attempts to run the VCPU on the physical CPUs
> > > +associated with the PMU specified by this attribute. This is entirely left to
> > > +userspace.
> > >
> > >  2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
> > >  =================================
> > > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > > index b3edde68bc3e..1d0a0a2a9711 100644
> > > --- a/arch/arm64/include/uapi/asm/kvm.h
> > > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > > @@ -362,6 +362,7 @@ struct kvm_arm_copy_mte_tags {
> > >  #define   KVM_ARM_VCPU_PMU_V3_IRQ      0
> > >  #define   KVM_ARM_VCPU_PMU_V3_INIT     1
> > >  #define   KVM_ARM_VCPU_PMU_V3_FILTER   2
> > > +#define   KVM_ARM_VCPU_PMU_V3_SET_PMU  3
> > >  #define KVM_ARM_VCPU_TIMER_CTRL                1
> > >  #define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER                0
> > >  #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER                1
> > > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> > > index eaaad4c06561..618138c5f792 100644
> > > --- a/arch/arm64/kvm/pmu-emul.c
> > > +++ b/arch/arm64/kvm/pmu-emul.c
> > > @@ -603,6 +603,7 @@ static bool kvm_pmu_counter_is_enabled(struct kvm_vcpu *vcpu, u64 select_idx)
> > >  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
> > >  {
> > >         struct kvm_pmu *pmu = &vcpu->arch.pmu;
> > > +       struct arm_pmu *arm_pmu = pmu->arm_pmu;
> > >         struct kvm_pmc *pmc;
> > >         struct perf_event *event;
> > >         struct perf_event_attr attr;
> > > @@ -638,8 +639,7 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
> > >                 return;
> > >
> > >         memset(&attr, 0, sizeof(struct perf_event_attr));
> > > -       attr.type = PERF_TYPE_RAW;
> > > -       attr.size = sizeof(attr);
> > > +       attr.type = arm_pmu ? arm_pmu->pmu.type : PERF_TYPE_RAW;
> > >         attr.pinned = 1;
> > >         attr.disabled = !kvm_pmu_counter_is_enabled(vcpu, pmc->idx);
> > >         attr.exclude_user = data & ARMV8_PMU_EXCLUDE_EL0 ? 1 : 0;
> > > @@ -941,6 +941,29 @@ static bool pmu_irq_is_valid(struct kvm *kvm, int irq)
> > >         return true;
> > >  }
> > >
> > > +static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
> > > +{
> > > +       struct kvm_pmu *kvm_pmu = &vcpu->arch.pmu;
> > > +       struct arm_pmu_entry *entry;
> > > +       struct arm_pmu *arm_pmu;
> > > +       int ret = -ENXIO;
> > > +
> > > +       mutex_lock(&arm_pmus_lock);
> > > +
> > > +       list_for_each_entry(entry, &arm_pmus, entry) {
> > > +               arm_pmu = entry->arm_pmu;
> > > +               if (arm_pmu->pmu.type == pmu_id) {
> > > +                       kvm_pmu->arm_pmu = arm_pmu;
> > 
> > Shouldn't kvm->arch.pmuver be updated based on the pmu that
> > is used for the guest ?
> 
> As far as I can tell, kvm->arch.pmuver is used in kvm_pmu_event_mask() to
> get the number of available perf events, which is then used for configuring
> events (via the PMEVTYPER<n>_EL0 registers) or for masking out events when the
> guest reads PMCEID{0,1}_EL0; the events that are masked out are the events
> that are unsupported by the PMU that perf will choose for creating events.
> 
> This series doesn't forbid userspace from setting the PMU for only a subset
> of VCPUs, leaving the other VCPUs with the default PMU, so setting
> kvm->arch.pmuver to a particular VCPU's PMU is not correct.
> 
> I think the correct fix here would be to have kvm_pmu_event_mask() use the
> VCPU's PMU PMUVer, and fall back to kvm->arch.pmuver if that isn't set.
> 
> This makes me wonder. Should KVM enforce having userspace either not
> setting the PMU for any VCPU, or setting it for all VCPUs? I think this
> would be a good idea and will reduce complexity in the long run. I also
> don't see a use case for userspace choosing to set the PMU for a subset of
> VCPUs, leaving the other VCPUs with the default behaviour.

I had a look and I don't think there's a way to enforce this, as there are
no restrictions on when a VCPU can be created. KVM must support the case
where only a subset of the VCPUs have a PMU set.
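
For context, the attribute is applied per-VCPU from userspace anyway.
Modulo error handling, setting it looks roughly like the sketch below, with
vcpu_fd and the PMU's sysfs "type" value assumed to be obtained elsewhere:

	int pmu_type;	/* read from /sys/bus/event_source/devices/<pmu>/type */
	struct kvm_device_attr attr = {
		.group	= KVM_ARM_VCPU_PMU_V3_CTRL,
		.attr	= KVM_ARM_VCPU_PMU_V3_SET_PMU,
		.addr	= (__u64)(unsigned long)&pmu_type,
	};

	ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);	/* repeated per VCPU */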

Thanks,
Alex

> 
> Thanks,
> Alex
> 
> > 
> > Thanks,
> > Reiji
> > 
> > 
> > > +                       ret = 0;
> > > +                       goto out_unlock;
> > > +               }
> > > +       }
> > > +
> > > +out_unlock:
> > > +       mutex_unlock(&arm_pmus_lock);
> > > +       return ret;
> > > +}
> > > +
> > >  int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> > >  {
> > >         if (!kvm_vcpu_has_pmu(vcpu))
> > > @@ -1027,6 +1050,15 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> > >
> > >                 return 0;
> > >         }
> > > +       case KVM_ARM_VCPU_PMU_V3_SET_PMU: {
> > > +               int __user *uaddr = (int __user *)(long)attr->addr;
> > > +               int pmu_id;
> > > +
> > > +               if (get_user(pmu_id, uaddr))
> > > +                       return -EFAULT;
> > > +
> > > +               return kvm_arm_pmu_v3_set_pmu(vcpu, pmu_id);
> > > +       }
> > >         case KVM_ARM_VCPU_PMU_V3_INIT:
> > >                 return kvm_arm_pmu_v3_init(vcpu);
> > >         }
> > > @@ -1064,6 +1096,7 @@ int kvm_arm_pmu_v3_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> > >         case KVM_ARM_VCPU_PMU_V3_IRQ:
> > >         case KVM_ARM_VCPU_PMU_V3_INIT:
> > >         case KVM_ARM_VCPU_PMU_V3_FILTER:
> > > +       case KVM_ARM_VCPU_PMU_V3_SET_PMU:
> > >                 if (kvm_vcpu_has_pmu(vcpu))
> > >                         return 0;
> > >         }
> > > diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
> > > index e249c5f172aa..ab3046a8f9bb 100644
> > > --- a/include/kvm/arm_pmu.h
> > > +++ b/include/kvm/arm_pmu.h
> > > @@ -34,6 +34,7 @@ struct kvm_pmu {
> > >         bool created;
> > >         bool irq_level;
> > >         struct irq_work overflow_work;
> > > +       struct arm_pmu *arm_pmu;
> > >  };
> > >
> > >  struct arm_pmu_entry {
> > > diff --git a/tools/arch/arm64/include/uapi/asm/kvm.h b/tools/arch/arm64/include/uapi/asm/kvm.h
> > > index b3edde68bc3e..1d0a0a2a9711 100644
> > > --- a/tools/arch/arm64/include/uapi/asm/kvm.h
> > > +++ b/tools/arch/arm64/include/uapi/asm/kvm.h
> > > @@ -362,6 +362,7 @@ struct kvm_arm_copy_mte_tags {
> > >  #define   KVM_ARM_VCPU_PMU_V3_IRQ      0
> > >  #define   KVM_ARM_VCPU_PMU_V3_INIT     1
> > >  #define   KVM_ARM_VCPU_PMU_V3_FILTER   2
> > > +#define   KVM_ARM_VCPU_PMU_V3_SET_PMU  3
> > >  #define KVM_ARM_VCPU_TIMER_CTRL                1
> > >  #define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER                0
> > >  #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER                1
> > > --
> > > 2.34.1
> > >
> > > _______________________________________________
> > > kvmarm mailing list
> > > kvmarm@lists.cs.columbia.edu
> > > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
  2021-12-08 12:23       ` Alexandru Elisei
@ 2021-12-08 14:25         ` Marc Zyngier
  -1 siblings, 0 replies; 52+ messages in thread
From: Marc Zyngier @ 2021-12-08 14:25 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: mingo, tglx, will, kvmarm, linux-arm-kernel

On Wed, 08 Dec 2021 12:23:44 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> This makes me wonder. Should KVM enforce having userspace either not
> setting the PMU for any VCPU, or setting it for all VCPUs? I think this
> would be a good idea and will reduce complexity in the long run. I also
> don't see a use case for userspace choosing to set the PMU for a subset of
> VCPUs, leaving the other VCPUs with the default behaviour.

Indeed. As much as I'm happy to expose a PMU to a guest on an
asymmetric system, I really do not want the asymmetry in the guest
itself. So this should be an all or nothing behaviour.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
  2021-12-08 14:25         ` Marc Zyngier
@ 2021-12-08 15:20           ` Alexandru Elisei
  -1 siblings, 0 replies; 52+ messages in thread
From: Alexandru Elisei @ 2021-12-08 15:20 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: mingo, tglx, will, kvmarm, linux-arm-kernel

Hi Marc,

On Wed, Dec 08, 2021 at 02:25:58PM +0000, Marc Zyngier wrote:
> On Wed, 08 Dec 2021 12:23:44 +0000,
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > 
> > This makes me wonder. Should KVM enforce having userspace either not
> > setting the PMU for any VCPU, or setting it for all VCPUs? I think this
> > would be a good idea and will reduce complexity in the long run. I also
> > don't see a use case for userspace choosing to set the PMU for a subset of
> > VCPUs, leaving the other VCPUs with the default behaviour.
> 
> Indeed. As much as I'm happy to expose a PMU to a guest on an
> asymmetric system, I really do not want the asymmetry in the guest
> itself. So this should be an all or nothing behaviour.

From what I can tell, the only asymmetry that can be exposed to a guest as
a result of the series is the number of events supported on a VCPU.

I don't like the idea of forcing userspace to set the *same* PMU for all
VCPUs, as that would severely limit running VMs with PMU on asymmetric
systems.

Even if KVM forces userspace to set a PMU (it does not have to be the same
PMU) for all VCPUs, that still does not look like the correct solution to me,
because userspace can set PMUs with different numbers of events.

What I can try is to make kvm->arch.pmuver the minimum version of all the
VCPU PMUs and the implicit PMU. I'll give that a go in the next iteration.
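
Concretely, something along these lines when a PMU is set for a VCPU
(illustrative only; exact placement to be decided):

	if (!kvm->arch.pmuver || arm_pmu->pmuver < kvm->arch.pmuver)
		kvm->arch.pmuver = arm_pmu->pmuver;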

Thanks,
Alex

> 
> Thanks,
> 
> 	M.
> 
> -- 
> Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
  2021-12-08 15:20           ` Alexandru Elisei
@ 2021-12-08 15:44             ` Marc Zyngier
  -1 siblings, 0 replies; 52+ messages in thread
From: Marc Zyngier @ 2021-12-08 15:44 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: mingo, tglx, will, kvmarm, linux-arm-kernel

On Wed, 08 Dec 2021 15:20:30 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> Hi Marc,
> 
> On Wed, Dec 08, 2021 at 02:25:58PM +0000, Marc Zyngier wrote:
> > On Wed, 08 Dec 2021 12:23:44 +0000,
> > Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > > 
> > > This makes me wonder. Should KVM enforce having userspace either not
> > > setting the PMU for any VCPU, or setting it for all VCPUs? I think this
> > > would be a good idea and will reduce complexity in the long run. I also
> > > don't see a use case for userspace choosing to set the PMU for a subset of
> > > VCPUs, leaving the other VCPUs with the default behaviour.
> > 
> > Indeed. As much as I'm happy to expose a PMU to a guest on an
> > asymmetric system, I really do not want the asymmetry in the guest
> > itself. So this should be an all or nothing behaviour.
> 
> From what I can tell, the only asymmetry that can be exposed to a guest as
> a result of the series is the number of events supported on a VCPU.

Not only. It means that the events are counting different things. It
isn't only about pmuver, which is only about the architectural
revision implemented by the PMU. If you start assigning two different
PMUs (in the perf sense) to a guest, you open the Pandora box of
having to deal with all the subtle nonsense that asymmetric systems
bring. What about event filtering, for example?

> I don't like the idea of forcing userspace to set the *same* PMU for all
> VCPUs, as that would severely limit running VMs with PMU on asymmetric
> systems.

On the contrary, I am *very* happy to limit a VM to a single PMU (and
thus CPU) type on these systems. Really.

> Even if KVM forces userspace to set a PMU (it does not have to be the same
> PMU) for all VCPUs, that still does not look like the correct solution to me,
> because userspace can set PMUs with different numbers of events.

I don't understand what you mean. If you associate a single PMU type
to the guest, that's all the guest sees.

> What I can try is to make kvm->arch.pmuver the minimum version of all the
> VCPU PMUs and the implict PMU. I'll give that a go in the next iteration.

I really don't think we need any of this.

	M.

-- 
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
@ 2021-12-08 15:44             ` Marc Zyngier
  0 siblings, 0 replies; 52+ messages in thread
From: Marc Zyngier @ 2021-12-08 15:44 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: Reiji Watanabe, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm, tglx, mingo

On Wed, 08 Dec 2021 15:20:30 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> Hi Marc,
> 
> On Wed, Dec 08, 2021 at 02:25:58PM +0000, Marc Zyngier wrote:
> > On Wed, 08 Dec 2021 12:23:44 +0000,
> > Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > > 
> > > This makes me wonder. Should KVM enforce having userspace either not
> > > setting the PMU for any VCPU, either setting it for all VCPUs? I think this
> > > would be a good idea and will reduce complexity in the long run. I also
> > > don't see a use case for userspace choosing to set the PMU for a subset of
> > > VCPUs, leaving the other VCPUs with the default behaviour.
> > 
> > Indeed. As much as I'm happy to expose a PMU to a guest on an
> > asymmetric system, I really do not want the asymmetry in the guest
> > itself. So this should be an all or nothing behaviour.
> 
> From what I can tell, the only asymmetry that can be exposed to a guest as
> a result of the series is the number of events supported on a VCPU.

Not only. It means that the events are counting different things. It
isn't just about pmuver, which only describes the architectural
revision implemented by the PMU. If you start assigning two different
PMUs (in the perf sense) to a guest, you open Pandora's box of
having to deal with all the subtle nonsense that asymmetric systems
bring. What about event filtering, for example?

> I don't like the idea of forcing userspace to set the *same* PMU for all
> VCPUs, as that would severely limit running VMs with PMU on asymmetric
> systems.

On the contrary, I am *very* happy to limit a VM to a single PMU (and
thus CPU) type on these systems. Really.

> Even if KVM forces to set a PMU (does not have to be the same PMU) for all
> VCPUs, that still does not look like the correct solution for me, because
> userspace can set PMUs with different number of events.

I don't understand what you mean. If you associate a single PMU type
to the guest, that's all the guest sees.

> What I can try is to make kvm->arch.pmuver the minimum version of all the
> VCPU PMUs and the implict PMU. I'll give that a go in the next iteration.

I really don't think we need any of this.

	M.

-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
  2021-12-08 15:44             ` Marc Zyngier
@ 2021-12-08 16:11               ` Alexandru Elisei
  -1 siblings, 0 replies; 52+ messages in thread
From: Alexandru Elisei @ 2021-12-08 16:11 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: mingo, tglx, will, kvmarm, linux-arm-kernel

Hi Marc,

On Wed, Dec 08, 2021 at 03:44:35PM +0000, Marc Zyngier wrote:
> On Wed, 08 Dec 2021 15:20:30 +0000,
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > 
> > Hi Marc,
> > 
> > On Wed, Dec 08, 2021 at 02:25:58PM +0000, Marc Zyngier wrote:
> > > On Wed, 08 Dec 2021 12:23:44 +0000,
> > > Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > > > 
> > > > This makes me wonder. Should KVM enforce having userspace either not
> > > > setting the PMU for any VCPU, either setting it for all VCPUs? I think this
> > > > would be a good idea and will reduce complexity in the long run. I also
> > > > don't see a use case for userspace choosing to set the PMU for a subset of
> > > > VCPUs, leaving the other VCPUs with the default behaviour.
> > > 
> > > Indeed. As much as I'm happy to expose a PMU to a guest on an
> > > asymmetric system, I really do not want the asymmetry in the guest
> > > itself. So this should be an all or nothing behaviour.
> > 
> > From what I can tell, the only asymmetry that can be exposed to a guest as
> > a result of the series is the number of events supported on a VCPU.
> 
> Not only. It means that the events are counting different things. It
> isn't only about pmuver, which is only about the architectural
> revision implemented by the PMU. If you start assigning two different
> PMUs (in the perf sense) to a guest, you open the Pandora box of
> having to deal with all the subtle nonsense that asymmetric systems
> bring. What about event filtering, for example?

kvm_pmu_set_counter_event_type() uses the number of events to mask out the
unsupported events, so it still depends on pmuver.
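
As a side note, here is a simplified sketch of how I see that dependency
(not the actual kernel code, just an illustration of why the mask follows
the PMU version rather than the PMU backing the event):

/*
 * Simplified sketch, not the kernel code: PMUv3 for Armv8.0 has 10-bit
 * event numbers, later revisions have 16-bit ones, so the width of the
 * evtCount mask follows the PMU version.
 */
static u64 guest_event_mask(unsigned int pmuver)
{
	return (pmuver == 1 /* PMUv3 for Armv8.0 */) ?
		GENMASK_ULL(9, 0) : GENMASK_ULL(15, 0);
}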

But I understand what you are saying: there might be differences in what
exactly an event counts, how it increments and how the counter value should
be interpreted, depending on the microarchitecture.

> 
> > I don't like the idea of forcing userspace to set the *same* PMU for all
> > VCPUs, as that would severely limit running VMs with PMU on asymmetric
> > systems.
> 
> On the contrary, I am *very* happy to limit a VM to a single PMU (and
> thus CPU) type on these systems. Really.

Ok, so any kind of asymmetry is unacceptable.

Accepted behaviour:

1. If userspace sets a PMU for one VCPU, then *all* other VCPUs must have a
PMU set, and furthermore, it must be the same PMU as the first VCPU,

or

2. If userspace has initialized a PMU (via
KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_INIT)) without setting a PMU, then
it is forbidden for userspace to set a PMU for the other VCPUs.

Is that what you had in mind?
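
For concreteness, a rough sketch of what 1. could look like from userspace
(illustration only, not part of this series; it assumes the attribute takes
a pointer to an int holding the perf PMU id, e.g. read from
/sys/bus/event_source/devices/<pmu>/type, and that the uapi header from
this series is available):

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Illustration only: called for every VCPU fd, with the same pmu_id. */
static int vcpu_set_pmu(int vcpu_fd, int pmu_id)
{
	struct kvm_device_attr attr = {
		.group	= KVM_ARM_VCPU_PMU_V3_CTRL,
		.attr	= KVM_ARM_VCPU_PMU_V3_SET_PMU,
		.addr	= (uint64_t)(unsigned long)&pmu_id,
	};

	return ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);
}

KVM could then reject an attempt to set a different PMU id on another VCPU
(exact error code to be decided).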

> 
> > Even if KVM forces to set a PMU (does not have to be the same PMU) for all
> > VCPUs, that still does not look like the correct solution for me, because
> > userspace can set PMUs with different number of events.
> 
> I don't understand what you mean. If you associate a single PMU type
> to the guest, that's all the guest sees.

I was talking in the context of allowing userspace to associate different PMUs
with different VCPUs.

Thanks,
Alex

> 
> > What I can try is to make kvm->arch.pmuver the minimum version of all the
> > VCPU PMUs and the implict PMU. I'll give that a go in the next iteration.
> 
> I really don't think we need any of this.
> 
> 	M.
> 
> -- 
> Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
@ 2021-12-08 16:11               ` Alexandru Elisei
  0 siblings, 0 replies; 52+ messages in thread
From: Alexandru Elisei @ 2021-12-08 16:11 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Reiji Watanabe, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm, tglx, mingo

Hi Marc,

On Wed, Dec 08, 2021 at 03:44:35PM +0000, Marc Zyngier wrote:
> On Wed, 08 Dec 2021 15:20:30 +0000,
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > 
> > Hi Marc,
> > 
> > On Wed, Dec 08, 2021 at 02:25:58PM +0000, Marc Zyngier wrote:
> > > On Wed, 08 Dec 2021 12:23:44 +0000,
> > > Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > > > 
> > > > This makes me wonder. Should KVM enforce having userspace either not
> > > > setting the PMU for any VCPU, either setting it for all VCPUs? I think this
> > > > would be a good idea and will reduce complexity in the long run. I also
> > > > don't see a use case for userspace choosing to set the PMU for a subset of
> > > > VCPUs, leaving the other VCPUs with the default behaviour.
> > > 
> > > Indeed. As much as I'm happy to expose a PMU to a guest on an
> > > asymmetric system, I really do not want the asymmetry in the guest
> > > itself. So this should be an all or nothing behaviour.
> > 
> > From what I can tell, the only asymmetry that can be exposed to a guest as
> > a result of the series is the number of events supported on a VCPU.
> 
> Not only. It means that the events are counting different things. It
> isn't only about pmuver, which is only about the architectural
> revision implemented by the PMU. If you start assigning two different
> PMUs (in the perf sense) to a guest, you open the Pandora box of
> having to deal with all the subtle nonsense that asymmetric systems
> bring. What about event filtering, for example?

kvm_pmu_set_counter_event_type() uses the number of events to mask out the
unsupported events, so it still depends on pmuver.

But I understand what you are saying: there might be differences in what
exactly an event counts, how it increments and how the counter value should
be interpreted, depending on the microarchitecture.

> 
> > I don't like the idea of forcing userspace to set the *same* PMU for all
> > VCPUs, as that would severely limit running VMs with PMU on asymmetric
> > systems.
> 
> On the contrary, I am *very* happy to limit a VM to a single PMU (and
> thus CPU) type on these systems. Really.

Ok, so any kind of asymmetry is unacceptable.

Accepted behaviour:

1. If userspace sets a PMU for one VCPU, then *all* other VCPUs must have a
PMU set, and furthermore, it must be the same PMU as the first VCPU,

or

2. If userspace has initialized a PMU (via
KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_INIT)) without setting a PMU, then
it is forbidden for userspace to set a PMU for the other VCPUs.

Is that what you had in mind?

> 
> > Even if KVM forces to set a PMU (does not have to be the same PMU) for all
> > VCPUs, that still does not look like the correct solution for me, because
> > userspace can set PMUs with different number of events.
> 
> I don't understand what you mean. If you associate a single PMU type
> to the guest, that's all the guest sees.

I was talking in the context of allowing userspace to associate different PMUs
with different VCPUs.

Thanks,
Alex

> 
> > What I can try is to make kvm->arch.pmuver the minimum version of all the
> > VCPU PMUs and the implict PMU. I'll give that a go in the next iteration.
> 
> I really don't think we need any of this.
> 
> 	M.
> 
> -- 
> Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
  2021-12-08 16:11               ` Alexandru Elisei
@ 2021-12-08 16:21                 ` Marc Zyngier
  -1 siblings, 0 replies; 52+ messages in thread
From: Marc Zyngier @ 2021-12-08 16:21 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: mingo, tglx, will, kvmarm, linux-arm-kernel

On Wed, 08 Dec 2021 16:11:13 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> > On the contrary, I am *very* happy to limit a VM to a single PMU (and
> > thus CPU) type on these systems. Really.
> 
> Ok, so any kind of asymmetry is unacceptable.
> 
> Accepted behaviour:
> 
> 1. If userspace sets PMU for one VCPU, then *all* other VCPUs must
> have a PMU set, and furthermore, it must be the same PMU as the
> first VCPU,
> 
> or
> 
> 2. If userspace has initialized a PMU (via
> KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_INIT)) without setting
> a PMU, then it is forbidden for userspace to set a PMU for the other
> VCPUs.
> 
> Is that what you had in mind?

Exactly. This sidesteps any sort of odd behaviour by forcing userspace
to pick a side.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
@ 2021-12-08 16:21                 ` Marc Zyngier
  0 siblings, 0 replies; 52+ messages in thread
From: Marc Zyngier @ 2021-12-08 16:21 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: Reiji Watanabe, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm, tglx, mingo

On Wed, 08 Dec 2021 16:11:13 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> > On the contrary, I am *very* happy to limit a VM to a single PMU (and
> > thus CPU) type on these systems. Really.
> 
> Ok, so any kind of asymmetry is unacceptable.
> 
> Accepted behaviour:
> 
> 1. If userspace sets PMU for one VCPU, then *all* other VCPUs must
> have a PMU set, and furthermore, it must be the same PMU as the
> first VCPU,
> 
> or
> 
> 2. If userspace has initialized a PMU (via
> KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_INIT)) without setting
> a PMU, then it is forbidden for userspace to set a PMU for the other
> VCPUs.
> 
> Is that what you had in mind?

Exactly. This sidesteps any sort of odd behaviour by forcing userspace
to pick a side.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 0/4] KVM: arm64: Improve PMU support on heterogeneous systems
  2021-12-08  8:05     ` Marc Zyngier
@ 2021-12-13  6:36       ` Reiji Watanabe
  -1 siblings, 0 replies; 52+ messages in thread
From: Reiji Watanabe @ 2021-12-13  6:36 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: will, mingo, tglx, kvmarm, linux-arm-kernel

On Wed, Dec 8, 2021 at 12:05 AM Marc Zyngier <maz@kernel.org> wrote:
>
> Reji,
>
> On 2021-12-08 02:36, Reiji Watanabe wrote:
> > Hi Alex,
> >
> > On Mon, Dec 6, 2021 at 9:02 AM Alexandru Elisei
> > <alexandru.elisei@arm.com> wrote:
> >>
> >> (CC'ing Peter Maydell in case this might be of interest to qemu)
> >>
> >> The series can be found on a branch at [1], and the kvmtool support at
> >> [2].
> >> The kvmtool patches are also on the mailing list [3] and haven't
> >> changed
> >> since v1.
> >>
> >> Detailed explanation of the issue and symptoms that the patches
> >> attempt to
> >> correct can be found in the cover letter for v1 [4].
> >>
> >> A brief summary of the problem is that on heterogeneous systems KVM
> >> will
> >> always use the same PMU for creating the VCPU events for *all* VCPUs
> >> regardless of the physical CPU on which the VCPU is running, leading
> >> to
> >> events suddenly stopping and resuming in the guest as the VCPU thread
> >> gets
> >> migrated across different CPUs.
> >>
> >> This series proposes to fix this behaviour by allowing the user to
> >> specify
> >> which physical PMU is used when creating the VCPU events needed for
> >> guest
> >> PMU emulation. When the PMU is set, KVM will refuse to the VCPU on a
> >> physical which is not part of the supported CPUs for the specified
> >> PMU.
> >
> > Just to confirm, this series provides an API for userspace to request
> > KVM to detect a wrong affinity setting due to a userspace bug so that
> > userspace can get an error at KVM_RUN instead of leading to events
> > suddenly stopping, correct ?
>
> More than that, it allows userspace to select which PMU will be used
> for their guest. The affinity setting is a byproduct of the PMU's own
> affinity.

Thank you for the clarification.
(I overlooked the change in kvm_pmu_create_perf_event()...)


> >
> >> The default behaviour stays the same - without userspace setting the
> >> PMU,
> >> events will stop counting if the VCPU is scheduled on the wrong CPU.
> >
> > Can't we fix the default behavior (in addition to the current fix) ?
> > (Do we need to maintain the default behavior ??)
>
> Of course we do. This is a behaviour that has been exposed to userspace
> for years, and *we don't break userspace*.

I'm wondering if it might be better to have kvm_pmu_create_perf_event()
set attr.type to a pmu_id based on the current (physical) CPU by default
on such heterogeneous systems (even if userspace doesn't explicitly
specify a pmu_id with the new API).  Then, by setting the CPU affinity,
the PMU in that environment can behave predictably even with existing
userspace (or maybe this won't be helpful at all?).
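
For the affinity part, what I have in mind on the userspace side is
something like the sketch below (illustration only; parsing the PMU's
"cpus" file in sysfs is an assumption on my part):

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Illustration only: pin a vCPU thread to the CPUs of one PMU. */
static int pin_vcpu_thread(pthread_t thread, const int *pmu_cpus, int nr_cpus)
{
	cpu_set_t set;
	int i;

	CPU_ZERO(&set);
	for (i = 0; i < nr_cpus; i++)
		CPU_SET(pmu_cpus[i], &set);

	/* pmu_cpus would come from the PMU's "cpus" file in sysfs. */
	return pthread_setaffinity_np(thread, sizeof(set), &set);
}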

Thanks,
Reiji
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 0/4] KVM: arm64: Improve PMU support on heterogeneous systems
@ 2021-12-13  6:36       ` Reiji Watanabe
  0 siblings, 0 replies; 52+ messages in thread
From: Reiji Watanabe @ 2021-12-13  6:36 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Alexandru Elisei, james.morse, suzuki.poulose, will,
	mark.rutland, linux-arm-kernel, kvmarm, tglx, mingo

On Wed, Dec 8, 2021 at 12:05 AM Marc Zyngier <maz@kernel.org> wrote:
>
> Reji,
>
> On 2021-12-08 02:36, Reiji Watanabe wrote:
> > Hi Alex,
> >
> > On Mon, Dec 6, 2021 at 9:02 AM Alexandru Elisei
> > <alexandru.elisei@arm.com> wrote:
> >>
> >> (CC'ing Peter Maydell in case this might be of interest to qemu)
> >>
> >> The series can be found on a branch at [1], and the kvmtool support at
> >> [2].
> >> The kvmtool patches are also on the mailing list [3] and haven't
> >> changed
> >> since v1.
> >>
> >> Detailed explanation of the issue and symptoms that the patches
> >> attempt to
> >> correct can be found in the cover letter for v1 [4].
> >>
> >> A brief summary of the problem is that on heterogeneous systems KVM
> >> will
> >> always use the same PMU for creating the VCPU events for *all* VCPUs
> >> regardless of the physical CPU on which the VCPU is running, leading
> >> to
> >> events suddenly stopping and resuming in the guest as the VCPU thread
> >> gets
> >> migrated across different CPUs.
> >>
> >> This series proposes to fix this behaviour by allowing the user to
> >> specify
> >> which physical PMU is used when creating the VCPU events needed for
> >> guest
> >> PMU emulation. When the PMU is set, KVM will refuse to the VCPU on a
> >> physical which is not part of the supported CPUs for the specified
> >> PMU.
> >
> > Just to confirm, this series provides an API for userspace to request
> > KVM to detect a wrong affinity setting due to a userspace bug so that
> > userspace can get an error at KVM_RUN instead of leading to events
> > suddenly stopping, correct ?
>
> More than that, it allows userspace to select which PMU will be used
> for their guest. The affinity setting is a byproduct of the PMU's own
> affinity.

Thank you for the clarification.
(I overlooked the change in kvm_pmu_create_perf_event()...)


> >
> >> The default behaviour stays the same - without userspace setting the
> >> PMU,
> >> events will stop counting if the VCPU is scheduled on the wrong CPU.
> >
> > Can't we fix the default behavior (in addition to the current fix) ?
> > (Do we need to maintain the default behavior ??)
>
> Of course we do. This is a behaviour that has been exposed to userspace
> for years, and *we don't break userspace*.

I'm wondering if it might be better to have kvm_pmu_create_perf_event()
set attr.type to a pmu_id based on the current (physical) CPU by default
on such heterogeneous systems (even if userspace doesn't explicitly
specify a pmu_id with the new API).  Then, by setting the CPU affinity,
the PMU in that environment can behave predictably even with existing
userspace (or maybe this won't be helpful at all?).

Thanks,
Reiji

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 4/4] KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU
  2021-12-08 10:38         ` Alexandru Elisei
@ 2021-12-13  7:40           ` Reiji Watanabe
  -1 siblings, 0 replies; 52+ messages in thread
From: Reiji Watanabe @ 2021-12-13  7:40 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: maz, mingo, tglx, will, kvmarm, linux-arm-kernel

Hi Alex,

On Wed, Dec 8, 2021 at 2:38 AM Alexandru Elisei
<alexandru.elisei@arm.com> wrote:
>
> Hi Reiji,
>
> Thank you for the review!
>
> On Tue, Dec 07, 2021 at 11:54:51PM -0800, Reiji Watanabe wrote:
> > Hi Alex,
> >
> > On Tue, Dec 7, 2021 at 6:18 AM Alexandru Elisei
> > <alexandru.elisei@arm.com> wrote:
> > >
> > > Hi,
> > >
> > > On Mon, Dec 06, 2021 at 05:02:23PM +0000, Alexandru Elisei wrote:
> > > > Userspace can assign a PMU to a VCPU with the KVM_ARM_VCPU_PMU_V3_SET_PMU
> > > > device ioctl. If the VCPU is scheduled on a physical CPU which has a
> > > > different PMU, the perf events needed to emulate a guest PMU won't be
> > > > scheduled in and the guest performance counters will stop counting. Treat
> > > > it as an userspace error and refuse to run the VCPU in this situation.
> > > >
> > > > The VCPU is flagged as being scheduled on the wrong CPU in vcpu_load(), but
> > > > the flag is cleared when the KVM_RUN enters the non-preemptible section
> > > > instead of in vcpu_put(); this has been done on purpose so the error
> > > > condition is communicated as soon as possible to userspace, otherwise
> > > > vcpu_load() on the wrong CPU followed by a vcpu_put() would clear the flag.
> > > >
> > > > Suggested-by: Marc Zyngier <maz@kernel.org>
> > > > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > > > ---
> > > > I agonized for hours about the best name for the VCPU flag and the
> > > > accessors. If someone has a better idea, please tell me and I'll change
> > > > them.
> > > >
> > > >  Documentation/virt/kvm/devices/vcpu.rst |  6 +++++-
> > > >  arch/arm64/include/asm/kvm_host.h       | 12 ++++++++++++
> > > >  arch/arm64/include/uapi/asm/kvm.h       |  3 +++
> > > >  arch/arm64/kvm/arm.c                    | 19 +++++++++++++++++++
> > > >  arch/arm64/kvm/pmu-emul.c               |  1 +
> > > >  5 files changed, 40 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
> > > > index c82be5cbc268..9ae47b7c3652 100644
> > > > --- a/Documentation/virt/kvm/devices/vcpu.rst
> > > > +++ b/Documentation/virt/kvm/devices/vcpu.rst
> > > > @@ -128,7 +128,11 @@ systems where there are at least two CPU PMUs on the system.
> > > >
> > > >  Note that KVM will not make any attempts to run the VCPU on the physical CPUs
> > > >  associated with the PMU specified by this attribute. This is entirely left to
> > > > -userspace.
> > > > +userspace. However, attempting to run the VCPU on a physical CPU not supported
> > > > +by the PMU will fail and KVM_RUN will return with
> > > > +exit_reason = KVM_EXIT_FAIL_ENTRY and populate the fail_entry struct by setting
> > > > +hardare_entry_failure_reason field to KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED and
> > > > +the cpu field to the processor id.
> > > >
> > > >  2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
> > > >  =================================
> > > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > > > index 2a5f7f38006f..0c453f2e48b6 100644
> > > > --- a/arch/arm64/include/asm/kvm_host.h
> > > > +++ b/arch/arm64/include/asm/kvm_host.h
> > > > @@ -385,6 +385,8 @@ struct kvm_vcpu_arch {
> > > >               u64 last_steal;
> > > >               gpa_t base;
> > > >       } steal;
> > > > +
> > > > +     cpumask_var_t supported_cpus;
> > > >  };
> > > >
> > > >  /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
> > > > @@ -420,6 +422,7 @@ struct kvm_vcpu_arch {
> > > >  #define KVM_ARM64_EXCEPT_MASK                (7 << 9) /* Target EL/MODE */
> > > >  #define KVM_ARM64_DEBUG_STATE_SAVE_SPE       (1 << 12) /* Save SPE context if active  */
> > > >  #define KVM_ARM64_DEBUG_STATE_SAVE_TRBE      (1 << 13) /* Save TRBE context if active  */
> > > > +#define KVM_ARM64_ON_UNSUPPORTED_CPU (1 << 14) /* Physical CPU not in supported_cpus */
> > > >
> > > >  #define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \
> > > >                                KVM_GUESTDBG_USE_SW_BP | \
> > > > @@ -460,6 +463,15 @@ struct kvm_vcpu_arch {
> > > >  #define vcpu_has_ptrauth(vcpu)               false
> > > >  #endif
> > > >
> > > > +#define vcpu_on_unsupported_cpu(vcpu)                                        \
> > > > +     ((vcpu)->arch.flags & KVM_ARM64_ON_UNSUPPORTED_CPU)
> > > > +
> > > > +#define vcpu_set_on_unsupported_cpu(vcpu)                            \
> > > > +     ((vcpu)->arch.flags |= KVM_ARM64_ON_UNSUPPORTED_CPU)
> > > > +
> > > > +#define vcpu_clear_on_unsupported_cpu(vcpu)                          \
> > > > +     ((vcpu)->arch.flags &= ~KVM_ARM64_ON_UNSUPPORTED_CPU)
> > > > +
> > > >  #define vcpu_gp_regs(v)              (&(v)->arch.ctxt.regs)
> > > >
> > > >  /*
> > > > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > > > index 1d0a0a2a9711..d49f714f48e6 100644
> > > > --- a/arch/arm64/include/uapi/asm/kvm.h
> > > > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > > > @@ -414,6 +414,9 @@ struct kvm_arm_copy_mte_tags {
> > > >  #define KVM_PSCI_RET_INVAL           PSCI_RET_INVALID_PARAMS
> > > >  #define KVM_PSCI_RET_DENIED          PSCI_RET_DENIED
> > > >
> > > > +/* run->fail_entry.hardware_entry_failure_reason codes. */
> > > > +#define KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED  (1ULL << 0)
> > > > +
> > > >  #endif
> > > >
> > > >  #endif /* __ARM_KVM_H__ */
> > > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > > > index e4727dc771bf..1124c3efdd94 100644
> > > > --- a/arch/arm64/kvm/arm.c
> > > > +++ b/arch/arm64/kvm/arm.c
> > > > @@ -327,6 +327,10 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> > > >
> > > >       vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
> > > >
> > > > +     if (!zalloc_cpumask_var(&vcpu->arch.supported_cpus, GFP_KERNEL))
> > > > +             return -ENOMEM;
> >
> > It appears that vcpu->arch.supported_cpus needs to be freed
> > if kvm_arch_vcpu_create() fails after it is allocated.
> > (kvm_vgic_vcpu_init() or create_hyp_mappings() might fail)
>
> I missed that, thank you for pointing it out.
>
> >
> >
> > > > +     cpumask_copy(vcpu->arch.supported_cpus, cpu_possible_mask);
> > > > +
> > > >       /* Set up the timer */
> > > >       kvm_timer_vcpu_init(vcpu);
> > > >
> > > > @@ -354,6 +358,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
> > > >       if (vcpu->arch.has_run_once && unlikely(!irqchip_in_kernel(vcpu->kvm)))
> > > >               static_branch_dec(&userspace_irqchip_in_use);
> > > >
> > > > +     free_cpumask_var(vcpu->arch.supported_cpus);
> > > >       kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
> > > >       kvm_timer_vcpu_terminate(vcpu);
> > > >       kvm_pmu_vcpu_destroy(vcpu);
> > > > @@ -432,6 +437,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> > > >       if (vcpu_has_ptrauth(vcpu))
> > > >               vcpu_ptrauth_disable(vcpu);
> > > >       kvm_arch_vcpu_load_debug_state_flags(vcpu);
> > > > +
> > > > +     if (!cpumask_test_cpu(smp_processor_id(), vcpu->arch.supported_cpus))
> > > > +             vcpu_set_on_unsupported_cpu(vcpu);
> > > >  }
> > > >
> > > >  void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> > > > @@ -822,6 +830,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> > > >                */
> > > >               preempt_disable();
> > > >
> > > > +             if (unlikely(vcpu_on_unsupported_cpu(vcpu))) {
> > > > +                     vcpu_clear_on_unsupported_cpu(vcpu);
> > > > +                     run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> > > > +                     run->fail_entry.hardware_entry_failure_reason
> > > > +                             = KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED;
> > > > +                     run->fail_entry.cpu = smp_processor_id();
> > >
> > > I just realised that this is wrong for the same reason that KVM doesn't
> > > clear the unsupported CPU flag on vcpu_put: a vcpu_put/load that happened
> > > after the vcpu_load that set the flag and before preemption is disabled
> > > could mean that now the thread is executing on a different physical CPU
> > > than the physical CPU that caused the flag to be set. To make things worse,
> > > this CPU might even be in supported_cpus, which would be extremely
> > > confusing for someone trying to descipher what went wrong.
> > >
> > > I see three solutions here:
> > >
> > > 1. Drop setting the fail_entry.cpu field.
> > >
> > > 2. Make vcpu_put clear the flag, which means that if the flag is set here
> > > then the VCPU is definitely executing on the wrong physical CPU and
> > > smp_processor_id() will be useful.
> > >
> > > 3. Carry the unsupported CPU ID information in a new field in struct
> > > kvm_vcpu_arch.
> > >
> > > I honestly don't have a preference. Maybe slightly towards solution number
> > > 2, as it makes the code symmetrical and removes the subtletly around when
> > > the VCPU flag is cleared. But this would be done at the expense of
> > > userspace possibly finding out a lot later (or never) that something went
> > > wrong.
> > >
> > > Thoughts?
> >
> > IMHO, I would prefer 2, which is symmetrical and straightforward,
> > out of those three options.  Unless KVM checks the thread's CPU
> > affinity, userspace possibly finds that out a lot later anyway.
>
> Agreed.
>
> >
> > BTW, kvm_vcpu_pmu_restore_guest/kvm_vcpu_pmu_restore_host, which
> > are (indirectly) called from vcpu_load/vcpu_put, seems to attempt
> > to read/writes pmccfiltr_el0, which is present only when FEAT_PMUv3
> > is implemented, even if the current CPU does not support FEAT_PMUv3.
>
> I think that's a different problem, independent of this patch. There are
> other places where KVM touches the PMU registers based on
> kvm_arm_support_pmu_v3() instead of checking that the CPU has a PMU
> (__activate_traps_common() comes to mind). As far as I can tell, this
> unusual configuration works with perf because perf calls
> pmu->filter_match() before scheduling in an event, although I haven't heard
> of such a SoC existing (does not mean it doesn't exist!).

Yes, I agree that this is a different problem, and I understand
there is other code that has the same problem.
(I just wanted to mention it because, given the changes and purpose of
this series, you might be interested in the problem.)

Thank you for your comment !
Reiji
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 4/4] KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU
@ 2021-12-13  7:40           ` Reiji Watanabe
  0 siblings, 0 replies; 52+ messages in thread
From: Reiji Watanabe @ 2021-12-13  7:40 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: maz, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm, tglx, mingo

Hi Alex,

On Wed, Dec 8, 2021 at 2:38 AM Alexandru Elisei
<alexandru.elisei@arm.com> wrote:
>
> Hi Reiji,
>
> Thank you for the review!
>
> On Tue, Dec 07, 2021 at 11:54:51PM -0800, Reiji Watanabe wrote:
> > Hi Alex,
> >
> > On Tue, Dec 7, 2021 at 6:18 AM Alexandru Elisei
> > <alexandru.elisei@arm.com> wrote:
> > >
> > > Hi,
> > >
> > > On Mon, Dec 06, 2021 at 05:02:23PM +0000, Alexandru Elisei wrote:
> > > > Userspace can assign a PMU to a VCPU with the KVM_ARM_VCPU_PMU_V3_SET_PMU
> > > > device ioctl. If the VCPU is scheduled on a physical CPU which has a
> > > > different PMU, the perf events needed to emulate a guest PMU won't be
> > > > scheduled in and the guest performance counters will stop counting. Treat
> > > > it as an userspace error and refuse to run the VCPU in this situation.
> > > >
> > > > The VCPU is flagged as being scheduled on the wrong CPU in vcpu_load(), but
> > > > the flag is cleared when the KVM_RUN enters the non-preemptible section
> > > > instead of in vcpu_put(); this has been done on purpose so the error
> > > > condition is communicated as soon as possible to userspace, otherwise
> > > > vcpu_load() on the wrong CPU followed by a vcpu_put() would clear the flag.
> > > >
> > > > Suggested-by: Marc Zyngier <maz@kernel.org>
> > > > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > > > ---
> > > > I agonized for hours about the best name for the VCPU flag and the
> > > > accessors. If someone has a better idea, please tell me and I'll change
> > > > them.
> > > >
> > > >  Documentation/virt/kvm/devices/vcpu.rst |  6 +++++-
> > > >  arch/arm64/include/asm/kvm_host.h       | 12 ++++++++++++
> > > >  arch/arm64/include/uapi/asm/kvm.h       |  3 +++
> > > >  arch/arm64/kvm/arm.c                    | 19 +++++++++++++++++++
> > > >  arch/arm64/kvm/pmu-emul.c               |  1 +
> > > >  5 files changed, 40 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
> > > > index c82be5cbc268..9ae47b7c3652 100644
> > > > --- a/Documentation/virt/kvm/devices/vcpu.rst
> > > > +++ b/Documentation/virt/kvm/devices/vcpu.rst
> > > > @@ -128,7 +128,11 @@ systems where there are at least two CPU PMUs on the system.
> > > >
> > > >  Note that KVM will not make any attempts to run the VCPU on the physical CPUs
> > > >  associated with the PMU specified by this attribute. This is entirely left to
> > > > -userspace.
> > > > +userspace. However, attempting to run the VCPU on a physical CPU not supported
> > > > +by the PMU will fail and KVM_RUN will return with
> > > > +exit_reason = KVM_EXIT_FAIL_ENTRY and populate the fail_entry struct by setting
> > > > +hardare_entry_failure_reason field to KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED and
> > > > +the cpu field to the processor id.
> > > >
> > > >  2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
> > > >  =================================
> > > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > > > index 2a5f7f38006f..0c453f2e48b6 100644
> > > > --- a/arch/arm64/include/asm/kvm_host.h
> > > > +++ b/arch/arm64/include/asm/kvm_host.h
> > > > @@ -385,6 +385,8 @@ struct kvm_vcpu_arch {
> > > >               u64 last_steal;
> > > >               gpa_t base;
> > > >       } steal;
> > > > +
> > > > +     cpumask_var_t supported_cpus;
> > > >  };
> > > >
> > > >  /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
> > > > @@ -420,6 +422,7 @@ struct kvm_vcpu_arch {
> > > >  #define KVM_ARM64_EXCEPT_MASK                (7 << 9) /* Target EL/MODE */
> > > >  #define KVM_ARM64_DEBUG_STATE_SAVE_SPE       (1 << 12) /* Save SPE context if active  */
> > > >  #define KVM_ARM64_DEBUG_STATE_SAVE_TRBE      (1 << 13) /* Save TRBE context if active  */
> > > > +#define KVM_ARM64_ON_UNSUPPORTED_CPU (1 << 14) /* Physical CPU not in supported_cpus */
> > > >
> > > >  #define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \
> > > >                                KVM_GUESTDBG_USE_SW_BP | \
> > > > @@ -460,6 +463,15 @@ struct kvm_vcpu_arch {
> > > >  #define vcpu_has_ptrauth(vcpu)               false
> > > >  #endif
> > > >
> > > > +#define vcpu_on_unsupported_cpu(vcpu)                                        \
> > > > +     ((vcpu)->arch.flags & KVM_ARM64_ON_UNSUPPORTED_CPU)
> > > > +
> > > > +#define vcpu_set_on_unsupported_cpu(vcpu)                            \
> > > > +     ((vcpu)->arch.flags |= KVM_ARM64_ON_UNSUPPORTED_CPU)
> > > > +
> > > > +#define vcpu_clear_on_unsupported_cpu(vcpu)                          \
> > > > +     ((vcpu)->arch.flags &= ~KVM_ARM64_ON_UNSUPPORTED_CPU)
> > > > +
> > > >  #define vcpu_gp_regs(v)              (&(v)->arch.ctxt.regs)
> > > >
> > > >  /*
> > > > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > > > index 1d0a0a2a9711..d49f714f48e6 100644
> > > > --- a/arch/arm64/include/uapi/asm/kvm.h
> > > > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > > > @@ -414,6 +414,9 @@ struct kvm_arm_copy_mte_tags {
> > > >  #define KVM_PSCI_RET_INVAL           PSCI_RET_INVALID_PARAMS
> > > >  #define KVM_PSCI_RET_DENIED          PSCI_RET_DENIED
> > > >
> > > > +/* run->fail_entry.hardware_entry_failure_reason codes. */
> > > > +#define KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED  (1ULL << 0)
> > > > +
> > > >  #endif
> > > >
> > > >  #endif /* __ARM_KVM_H__ */
> > > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > > > index e4727dc771bf..1124c3efdd94 100644
> > > > --- a/arch/arm64/kvm/arm.c
> > > > +++ b/arch/arm64/kvm/arm.c
> > > > @@ -327,6 +327,10 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> > > >
> > > >       vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
> > > >
> > > > +     if (!zalloc_cpumask_var(&vcpu->arch.supported_cpus, GFP_KERNEL))
> > > > +             return -ENOMEM;
> >
> > It appears that vcpu->arch.supported_cpus needs to be freed
> > if kvm_arch_vcpu_create() fails after it is allocated.
> > (kvm_vgic_vcpu_init() or create_hyp_mappings() might fail)
>
> I missed that, thank you for pointing it out.
>
> >
> >
> > > > +     cpumask_copy(vcpu->arch.supported_cpus, cpu_possible_mask);
> > > > +
> > > >       /* Set up the timer */
> > > >       kvm_timer_vcpu_init(vcpu);
> > > >
> > > > @@ -354,6 +358,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
> > > >       if (vcpu->arch.has_run_once && unlikely(!irqchip_in_kernel(vcpu->kvm)))
> > > >               static_branch_dec(&userspace_irqchip_in_use);
> > > >
> > > > +     free_cpumask_var(vcpu->arch.supported_cpus);
> > > >       kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
> > > >       kvm_timer_vcpu_terminate(vcpu);
> > > >       kvm_pmu_vcpu_destroy(vcpu);
> > > > @@ -432,6 +437,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> > > >       if (vcpu_has_ptrauth(vcpu))
> > > >               vcpu_ptrauth_disable(vcpu);
> > > >       kvm_arch_vcpu_load_debug_state_flags(vcpu);
> > > > +
> > > > +     if (!cpumask_test_cpu(smp_processor_id(), vcpu->arch.supported_cpus))
> > > > +             vcpu_set_on_unsupported_cpu(vcpu);
> > > >  }
> > > >
> > > >  void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> > > > @@ -822,6 +830,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> > > >                */
> > > >               preempt_disable();
> > > >
> > > > +             if (unlikely(vcpu_on_unsupported_cpu(vcpu))) {
> > > > +                     vcpu_clear_on_unsupported_cpu(vcpu);
> > > > +                     run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> > > > +                     run->fail_entry.hardware_entry_failure_reason
> > > > +                             = KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED;
> > > > +                     run->fail_entry.cpu = smp_processor_id();
> > >
> > > I just realised that this is wrong for the same reason that KVM doesn't
> > > clear the unsupported CPU flag on vcpu_put: a vcpu_put/load that happened
> > > after the vcpu_load that set the flag and before preemption is disabled
> > > could mean that now the thread is executing on a different physical CPU
> > > than the physical CPU that caused the flag to be set. To make things worse,
> > > this CPU might even be in supported_cpus, which would be extremely
> > > confusing for someone trying to descipher what went wrong.
> > >
> > > I see three solutions here:
> > >
> > > 1. Drop setting the fail_entry.cpu field.
> > >
> > > 2. Make vcpu_put clear the flag, which means that if the flag is set here
> > > then the VCPU is definitely executing on the wrong physical CPU and
> > > smp_processor_id() will be useful.
> > >
> > > 3. Carry the unsupported CPU ID information in a new field in struct
> > > kvm_vcpu_arch.
> > >
> > > I honestly don't have a preference. Maybe slightly towards solution number
> > > 2, as it makes the code symmetrical and removes the subtletly around when
> > > the VCPU flag is cleared. But this would be done at the expense of
> > > userspace possibly finding out a lot later (or never) that something went
> > > wrong.
> > >
> > > Thoughts?
> >
> > IMHO, I would prefer 2, which is symmetrical and straightforward,
> > out of those three options.  Unless KVM checks the thread's CPU
> > affinity, userspace possibly finds that out a lot later anyway.
>
> Agreed.
>
> >
> > BTW, kvm_vcpu_pmu_restore_guest/kvm_vcpu_pmu_restore_host, which
> > are (indirectly) called from vcpu_load/vcpu_put, seems to attempt
> > to read/writes pmccfiltr_el0, which is present only when FEAT_PMUv3
> > is implemented, even if the current CPU does not support FEAT_PMUv3.
>
> I think that's a different problem, independent of this patch. There are
> other places where KVM touches the PMU registers based on
> kvm_arm_support_pmu_v3() instead of checking that the CPU has a PMU
> (__activate_traps_common() comes to mind). As far as I can tell, this
> unusual configuration works with perf because perf calls
> pmu->filter_match() before scheduling in an event, although I haven't heard
> of such a SoC existing (does not mean it doesn't exist!).

Yes, I agree that this is a different problem, and I understand
there is other code that has the same problem.
(I just wanted to mention it because, given the changes and purpose of
this series, you might be interested in the problem.)

Thank you for your comment !
Reiji

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 0/4] KVM: arm64: Improve PMU support on heterogeneous systems
  2021-12-13  6:36       ` Reiji Watanabe
@ 2021-12-13 11:14         ` Alexandru Elisei
  -1 siblings, 0 replies; 52+ messages in thread
From: Alexandru Elisei @ 2021-12-13 11:14 UTC (permalink / raw)
  To: Reiji Watanabe; +Cc: Marc Zyngier, mingo, tglx, will, kvmarm, linux-arm-kernel

Hi Reiji,

On Sun, Dec 12, 2021 at 10:36:52PM -0800, Reiji Watanabe wrote:
> On Wed, Dec 8, 2021 at 12:05 AM Marc Zyngier <maz@kernel.org> wrote:
> >
> > Reji,
> >
> > On 2021-12-08 02:36, Reiji Watanabe wrote:
> > > Hi Alex,
> > >
> > > On Mon, Dec 6, 2021 at 9:02 AM Alexandru Elisei
> > > <alexandru.elisei@arm.com> wrote:
> > >>
> > >> (CC'ing Peter Maydell in case this might be of interest to qemu)
> > >>
> > >> The series can be found on a branch at [1], and the kvmtool support at
> > >> [2].
> > >> The kvmtool patches are also on the mailing list [3] and haven't
> > >> changed
> > >> since v1.
> > >>
> > >> Detailed explanation of the issue and symptoms that the patches
> > >> attempt to
> > >> correct can be found in the cover letter for v1 [4].
> > >>
> > >> A brief summary of the problem is that on heterogeneous systems KVM
> > >> will
> > >> always use the same PMU for creating the VCPU events for *all* VCPUs
> > >> regardless of the physical CPU on which the VCPU is running, leading
> > >> to
> > >> events suddenly stopping and resuming in the guest as the VCPU thread
> > >> gets
> > >> migrated across different CPUs.
> > >>
> > >> This series proposes to fix this behaviour by allowing the user to
> > >> specify
> > >> which physical PMU is used when creating the VCPU events needed for
> > >> guest
> > >> PMU emulation. When the PMU is set, KVM will refuse to the VCPU on a
> > >> physical which is not part of the supported CPUs for the specified
> > >> PMU.
> > >
> > > Just to confirm, this series provides an API for userspace to request
> > > KVM to detect a wrong affinity setting due to a userspace bug so that
> > > userspace can get an error at KVM_RUN instead of leading to events
> > > suddenly stopping, correct ?
> >
> > More than that, it allows userspace to select which PMU will be used
> > for their guest. The affinity setting is a byproduct of the PMU's own
> > affinity.
> 
> Thank you for the clarification.
> (I overlooked the change in kvm_pmu_create_perf_event()...)
> 
> 
> > >
> > >> The default behaviour stays the same - without userspace setting the
> > >> PMU,
> > >> events will stop counting if the VCPU is scheduled on the wrong CPU.
> > >
> > > Can't we fix the default behavior (in addition to the current fix) ?
> > > (Do we need to maintain the default behavior ??)
> >
> > Of course we do. This is a behaviour that has been exposed to userspace
> > for years, and *we don't break userspace*.
> 
> I'm wondering if it might be better to have kvm_pmu_create_perf_event()
> set attr.type to pmu_id based on the current (physical) CPU by default
> on such heterogeneous systems (even if userspace don't explicitly
> specify pmu_id with the new API).  Then, by setting the CPU affinity,
> the PMU in that environment can behave predictably even with existing
> userspace (or maybe this won't be helpful at all?).

I think you would then end up with a possible mismatch between
kvm->arch.pmuver and the version of the PMU that is used for creating the
events.

Also, as VCPUs get migrated from one physical CPU to another, the
semantics of the microarchitectural events change, even if the event ID is
the same.

Thanks,
Alex

> 
> Thanks,
> Reiji
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 0/4] KVM: arm64: Improve PMU support on heterogeneous systems
@ 2021-12-13 11:14         ` Alexandru Elisei
  0 siblings, 0 replies; 52+ messages in thread
From: Alexandru Elisei @ 2021-12-13 11:14 UTC (permalink / raw)
  To: Reiji Watanabe
  Cc: Marc Zyngier, james.morse, suzuki.poulose, will, mark.rutland,
	linux-arm-kernel, kvmarm, tglx, mingo

Hi Reiji,

On Sun, Dec 12, 2021 at 10:36:52PM -0800, Reiji Watanabe wrote:
> On Wed, Dec 8, 2021 at 12:05 AM Marc Zyngier <maz@kernel.org> wrote:
> >
> > Reji,
> >
> > On 2021-12-08 02:36, Reiji Watanabe wrote:
> > > Hi Alex,
> > >
> > > On Mon, Dec 6, 2021 at 9:02 AM Alexandru Elisei
> > > <alexandru.elisei@arm.com> wrote:
> > >>
> > >> (CC'ing Peter Maydell in case this might be of interest to qemu)
> > >>
> > >> The series can be found on a branch at [1], and the kvmtool support at
> > >> [2].
> > >> The kvmtool patches are also on the mailing list [3] and haven't
> > >> changed
> > >> since v1.
> > >>
> > >> Detailed explanation of the issue and symptoms that the patches
> > >> attempt to
> > >> correct can be found in the cover letter for v1 [4].
> > >>
> > >> A brief summary of the problem is that on heterogeneous systems KVM
> > >> will
> > >> always use the same PMU for creating the VCPU events for *all* VCPUs
> > >> regardless of the physical CPU on which the VCPU is running, leading
> > >> to
> > >> events suddenly stopping and resuming in the guest as the VCPU thread
> > >> gets
> > >> migrated across different CPUs.
> > >>
> > >> This series proposes to fix this behaviour by allowing the user to
> > >> specify
> > >> which physical PMU is used when creating the VCPU events needed for
> > >> guest
> > >> PMU emulation. When the PMU is set, KVM will refuse to the VCPU on a
> > >> physical which is not part of the supported CPUs for the specified
> > >> PMU.
> > >
> > > Just to confirm, this series provides an API for userspace to request
> > > KVM to detect a wrong affinity setting due to a userspace bug so that
> > > userspace can get an error at KVM_RUN instead of leading to events
> > > suddenly stopping, correct ?
> >
> > More than that, it allows userspace to select which PMU will be used
> > for their guest. The affinity setting is a byproduct of the PMU's own
> > affinity.
> 
> Thank you for the clarification.
> (I overlooked the change in kvm_pmu_create_perf_event()...)
> 
> 
> > >
> > >> The default behaviour stays the same - without userspace setting the
> > >> PMU,
> > >> events will stop counting if the VCPU is scheduled on the wrong CPU.
> > >
> > > Can't we fix the default behavior (in addition to the current fix) ?
> > > (Do we need to maintain the default behavior ??)
> >
> > Of course we do. This is a behaviour that has been exposed to userspace
> > for years, and *we don't break userspace*.
> 
> I'm wondering if it might be better to have kvm_pmu_create_perf_event()
> set attr.type to pmu_id based on the current (physical) CPU by default
> on such heterogeneous systems (even if userspace don't explicitly
> specify pmu_id with the new API).  Then, by setting the CPU affinity,
> the PMU in that environment can behave predictably even with existing
> userspace (or maybe this won't be helpful at all?).

I think you would then end up with a possible mismatch between
kvm->arch.pmuver and the version of the PMU that is used for creating the
events.

Also, as VCPUs get migrated from one physical CPU to another, the
semantics of the microarchitectural events change, even if the event ID is
the same.

Thanks,
Alex

> 
> Thanks,
> Reiji

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 0/4] KVM: arm64: Improve PMU support on heterogeneous systems
  2021-12-13 11:14         ` Alexandru Elisei
@ 2021-12-14  6:24           ` Reiji Watanabe
  -1 siblings, 0 replies; 52+ messages in thread
From: Reiji Watanabe @ 2021-12-14  6:24 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: Marc Zyngier, mingo, tglx, will, kvmarm, linux-arm-kernel

Hi Alex,

On Mon, Dec 13, 2021 at 3:14 AM Alexandru Elisei
<alexandru.elisei@arm.com> wrote:
>
> Hi Reiji,
>
> On Sun, Dec 12, 2021 at 10:36:52PM -0800, Reiji Watanabe wrote:
> > On Wed, Dec 8, 2021 at 12:05 AM Marc Zyngier <maz@kernel.org> wrote:
> > >
> > > Reji,
> > >
> > > On 2021-12-08 02:36, Reiji Watanabe wrote:
> > > > Hi Alex,
> > > >
> > > > On Mon, Dec 6, 2021 at 9:02 AM Alexandru Elisei
> > > > <alexandru.elisei@arm.com> wrote:
> > > >>
> > > >> (CC'ing Peter Maydell in case this might be of interest to qemu)
> > > >>
> > > >> The series can be found on a branch at [1], and the kvmtool support at
> > > >> [2].
> > > >> The kvmtool patches are also on the mailing list [3] and haven't
> > > >> changed
> > > >> since v1.
> > > >>
> > > >> Detailed explanation of the issue and symptoms that the patches
> > > >> attempt to
> > > >> correct can be found in the cover letter for v1 [4].
> > > >>
> > > >> A brief summary of the problem is that on heterogeneous systems KVM
> > > >> will
> > > >> always use the same PMU for creating the VCPU events for *all* VCPUs
> > > >> regardless of the physical CPU on which the VCPU is running, leading
> > > >> to
> > > >> events suddenly stopping and resuming in the guest as the VCPU thread
> > > >> gets
> > > >> migrated across different CPUs.
> > > >>
> > > >> This series proposes to fix this behaviour by allowing the user to
> > > >> specify
> > > >> which physical PMU is used when creating the VCPU events needed for
> > > >> guest
> > > >> PMU emulation. When the PMU is set, KVM will refuse to the VCPU on a
> > > >> physical which is not part of the supported CPUs for the specified
> > > >> PMU.
> > > >
> > > > Just to confirm, this series provides an API for userspace to request
> > > > KVM to detect a wrong affinity setting due to a userspace bug so that
> > > > userspace can get an error at KVM_RUN instead of leading to events
> > > > suddenly stopping, correct ?
> > >
> > > More than that, it allows userspace to select which PMU will be used
> > > for their guest. The affinity setting is a byproduct of the PMU's own
> > > affinity.
> >
> > Thank you for the clarification.
> > (I overlooked the change in kvm_pmu_create_perf_event()...)
> >
> >
> > > >
> > > >> The default behaviour stays the same - without userspace setting the
> > > >> PMU,
> > > >> events will stop counting if the VCPU is scheduled on the wrong CPU.
> > > >
> > > > Can't we fix the default behavior (in addition to the current fix) ?
> > > > (Do we need to maintain the default behavior ??)
> > >
> > > Of course we do. This is a behaviour that has been exposed to userspace
> > > for years, and *we don't break userspace*.
> >
> > I'm wondering if it might be better to have kvm_pmu_create_perf_event()
> > set attr.type to pmu_id based on the current (physical) CPU by default
> > on such heterogeneous systems (even if userspace don't explicitly
> > specify pmu_id with the new API).  Then, by setting the CPU affinity,
> > the PMU in that environment can behave predictably even with existing
> > userspace (or maybe this won't be helpful at all?).
>
> I think then you would end up with the possible mismatch between
> kvm->arch.pmuver and the version of the PMU that is used for creating the
> events.

Yes, but I would think we can have kvm_pmu_create_perf_event()
set vcpu->arch.pmu.arm_pmu based on the current physical CPU
when vcpu->arch.pmu.arm_pmu is NULL (then the pmuver would be handled
as if KVM_ARM_VCPU_PMU_V3_SET_PMU had been done implicitly).
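
Roughly something like this in kvm_pmu_create_perf_event() (just a sketch
to show the idea; arm_pmu_for_cpu() is a hypothetical helper that maps a
CPU to its arm_pmu):

	/* Sketch only: fall back to the PMU of the CPU we are currently
	 * running on when userspace did not pick one explicitly. */
	if (!vcpu->arch.pmu.arm_pmu)
		vcpu->arch.pmu.arm_pmu = arm_pmu_for_cpu(smp_processor_id());

	attr.type = vcpu->arch.pmu.arm_pmu->pmu.type;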


> Also, as VCPUs get migrated from one physical CPU to the other, the
> semantics of the microarchitectural events change, even if the event ID is
> the same.

Yes, I understand.  As mentioned, this can work only when the
CPU affinity is set appropriately for the vCPU threads (which could
be done even without changing userspace).

Thanks,
Reiji
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 0/4] KVM: arm64: Improve PMU support on heterogeneous systems
  2021-12-14  6:24           ` Reiji Watanabe
@ 2021-12-14 11:56             ` Marc Zyngier
  -1 siblings, 0 replies; 52+ messages in thread
From: Marc Zyngier @ 2021-12-14 11:56 UTC (permalink / raw)
  To: Reiji Watanabe; +Cc: will, mingo, tglx, kvmarm, linux-arm-kernel

On Tue, 14 Dec 2021 06:24:38 +0000,
Reiji Watanabe <reijiw@google.com> wrote:
> 
> Hi Alex,
> 
> On Mon, Dec 13, 2021 at 3:14 AM Alexandru Elisei
> <alexandru.elisei@arm.com> wrote:
> >
> > Also, as VCPUs get migrated from one physical CPU to the other, the
> > semantics of the microarchitectural events change, even if the event ID is
> > the same.
> 
> Yes, I understand.  As mentioned, this can work only when the
> CPU affinity is set appropriately for the vCPU threads (which could
> be done even without changing userspace).

Implicit binding to random PMUs based on the scheduling seems a
pretty fragile API to me, and presents no real incentive for userspace
to start doing the right thing.

I'd prefer not counting events at all when on the wrong CPU (for some
definition of 'wrong'), rather than accumulating unrelated events.
Both are admittedly wrong, but between two evils, I'd rather stick
with the one I know (and that doesn't require any change)...

Alex's series brings a way to solve this by allowing userspace to pick
a PMU and make sure userspace is aware of the consequences. It puts
userspace in charge, and doesn't leave space for ambiguous behaviours.

I definitely find value in this approach.
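
For illustration, on the VMM side this boils down to a single vcpu
device attribute set before the first KVM_RUN.  The payload format
below is an assumption (an int holding the PMU identifier, e.g. the
'type' value the PMU exposes in sysfs); KVM_ARM_VCPU_PMU_V3_SET_PMU
itself comes from the headers added in patch 3/4.

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Sketch only: ask KVM to use a specific physical PMU for this vCPU.
 * 'pmu_id' is assumed to be the identifier expected by the new
 * attribute (for illustration, the value read from
 * /sys/devices/<pmu>/type); the exact payload is defined by the
 * series, not here.
 */
static int vcpu_set_pmu(int vcpu_fd, int pmu_id)
{
        struct kvm_device_attr attr = {
                .group = KVM_ARM_VCPU_PMU_V3_CTRL,
                .attr  = KVM_ARM_VCPU_PMU_V3_SET_PMU,
                .addr  = (uint64_t)(unsigned long)&pmu_id,
        };

        return ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);
}

Once the PMU is set, a vCPU scheduled on a CPU outside that PMU's
supported set is refused at KVM_RUN (patch 4/4), so a wrong affinity
becomes an explicit error instead of silently miscounted events.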

	M.

-- 
Without deviation from the norm, progress is not possible.

* Re: [PATCH v2 0/4] KVM: arm64: Improve PMU support on heterogeneous systems
  2021-12-14 11:56             ` Marc Zyngier
@ 2021-12-15  6:47               ` Reiji Watanabe
  -1 siblings, 0 replies; 52+ messages in thread
From: Reiji Watanabe @ 2021-12-15  6:47 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: will, mingo, tglx, kvmarm, linux-arm-kernel

On Tue, Dec 14, 2021 at 3:57 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Tue, 14 Dec 2021 06:24:38 +0000,
> Reiji Watanabe <reijiw@google.com> wrote:
> >
> > Hi Alex,
> >
> > On Mon, Dec 13, 2021 at 3:14 AM Alexandru Elisei
> > <alexandru.elisei@arm.com> wrote:
> > >
> > > Also, as VCPUs get migrated from one physical CPU to the other, the
> > > semantics of the microarchitectural events change, even if the event ID is
> > > the same.
> >
> > Yes, I understand.  As mentioned, this can work only when the
> > CPU affinity is set appropriately for the vCPU threads (which could
> > be done even without changing userspace).
>
> Implicit binding to random PMUs based on the scheduling seems a
> pretty fragile API to me,

Yes, I understand that. I was just looking into the possibility
of improving the default behavior in some way rather than keeping
it unpredictable.

> and presents no real incentive for userspace
> to start doing the right thing.

I see... It makes sense.
I didn't think about that aspect.

> I'd prefer not counting events at all when on the wrong CPU (for some
> definition of 'wrong'), rather than accumulating unrelated events.
> Both are admittedly wrong, but between two evils, I'd rather stick
> with the one I know (and that doesn't require any change)...
>
> Alex's series brings a way to solve this by allowing userspace to pick
> a PMU and make sure userspace is aware of the consequences. It puts
> userspace in charge, and doesn't leave space for ambiguous behaviours.
>
> I definitely find value in this approach.

Yes, I agree with that.
My suggestion wasn't meant to replace Alex's approach; it was only
about the default behavior (i.e. when userspace does not specify a
PMU ID with the new API).

Anyway, thank you so much for sharing your thoughts on it!

Regards,
Reiji

end of thread

Thread overview: 52+ messages
2021-12-06 17:02 [PATCH v2 0/4] KVM: arm64: Improve PMU support on heterogeneous systems Alexandru Elisei
2021-12-06 17:02 ` Alexandru Elisei
2021-12-06 17:02 ` [PATCH v2 1/4] perf: Fix wrong name in comment for struct perf_cpu_context Alexandru Elisei
2021-12-06 17:02   ` Alexandru Elisei
2021-12-06 17:02 ` [PATCH v2 2/4] KVM: arm64: Keep a list of probed PMUs Alexandru Elisei
2021-12-06 17:02   ` Alexandru Elisei
2021-12-06 17:02 ` [PATCH v2 3/4] KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute Alexandru Elisei
2021-12-06 17:02   ` Alexandru Elisei
2021-12-08  3:13   ` Reiji Watanabe
2021-12-08  3:13     ` Reiji Watanabe
2021-12-08 12:23     ` Alexandru Elisei
2021-12-08 12:23       ` Alexandru Elisei
2021-12-08 12:43       ` Alexandru Elisei
2021-12-08 12:43         ` Alexandru Elisei
2021-12-08 14:25       ` Marc Zyngier
2021-12-08 14:25         ` Marc Zyngier
2021-12-08 15:20         ` Alexandru Elisei
2021-12-08 15:20           ` Alexandru Elisei
2021-12-08 15:44           ` Marc Zyngier
2021-12-08 15:44             ` Marc Zyngier
2021-12-08 16:11             ` Alexandru Elisei
2021-12-08 16:11               ` Alexandru Elisei
2021-12-08 16:21               ` Marc Zyngier
2021-12-08 16:21                 ` Marc Zyngier
2021-12-06 17:02 ` [PATCH v2 4/4] KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU Alexandru Elisei
2021-12-06 17:02   ` Alexandru Elisei
2021-12-07 14:17   ` Alexandru Elisei
2021-12-07 14:17     ` Alexandru Elisei
2021-12-08  7:54     ` Reiji Watanabe
2021-12-08  7:54       ` Reiji Watanabe
2021-12-08 10:38       ` Alexandru Elisei
2021-12-08 10:38         ` Alexandru Elisei
2021-12-13  7:40         ` Reiji Watanabe
2021-12-13  7:40           ` Reiji Watanabe
2021-12-08  9:56     ` Marc Zyngier
2021-12-08  9:56       ` Marc Zyngier
2021-12-08 11:18       ` Alexandru Elisei
2021-12-08 11:18         ` Alexandru Elisei
2021-12-08  2:36 ` [PATCH v2 0/4] KVM: arm64: Improve PMU support on heterogeneous systems Reiji Watanabe
2021-12-08  2:36   ` Reiji Watanabe
2021-12-08  8:05   ` Marc Zyngier
2021-12-08  8:05     ` Marc Zyngier
2021-12-13  6:36     ` Reiji Watanabe
2021-12-13  6:36       ` Reiji Watanabe
2021-12-13 11:14       ` Alexandru Elisei
2021-12-13 11:14         ` Alexandru Elisei
2021-12-14  6:24         ` Reiji Watanabe
2021-12-14  6:24           ` Reiji Watanabe
2021-12-14 11:56           ` Marc Zyngier
2021-12-14 11:56             ` Marc Zyngier
2021-12-15  6:47             ` Reiji Watanabe
2021-12-15  6:47               ` Reiji Watanabe
