linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v10 0/5] arm64: Support perf event modifiers :G and :H
@ 2019-01-14 16:11 Andrew Murray
  2019-01-14 16:11 ` [PATCH v10 1/5] arm64: arm_pmu: remove unnecessary isb instruction Andrew Murray
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Andrew Murray @ 2019-01-14 16:11 UTC (permalink / raw)
  To: Christoffer Dall, Marc Zyngier, Catalin Marinas, Will Deacon,
	Mark Rutland
  Cc: Julien Thierry, kvmarm, linux-arm-kernel, Suzuki K Poulose

This patchset adds support for the perf event modifiers :G and :H, which
allow PMU events to be filtered between host and guest when used with
KVM.

As the underlying hardware cannot distinguish between guest and host
context, the performance counters must be stopped and started upon
entry to and exit from the guest. This is performed at EL2 in a way that
minimizes overhead and improves the accuracy of recording events that
only occur in the requested context.

This has been tested with VHE and non-VHE kernels with a KVM guest.

Changes from v9:

 - Rebased to v5.0-rc2
 - Ensure get_host_ctxt considers host_ctxt offset

Changes from v8:

 - Added additional comments
 - Fixed bisect build failure
 - Renamed cpu_ctxt variable to cpu_data

Changes from v7:

 - Added additional patch to encapsulate kvm_cpu_context in kvm_host_data

Changes from v6:

 - Move events_host/events_guest out of kvm_cpu_context

Changes from v5:

 - Tweak logic in use of kvm_set_pmu_events

Changes from v4:

 - Prevent unnecessary write_sysreg calls by improving
   __pmu_switch_to_xxx logic.

Changes from v3:

 - Remove confusing _only suffix from bitfields in kvm_cpu_context
 - Remove unnecessary condition when clearing event bits in disable
 - Simplify API of KVM accessors
 - Prevent unnecessary setting of pmcnten when guest/host events are
   the same.

Changes from v2:

 - Ensured that exclude_kernel works for guest
 - Removed unnecessary exclusion of EL2 with exclude_host on !VHE
 - Renamed kvm_clr_set_host_pmu_events to reflect args order
 - Added additional information to isb patch

Changes from v1:

 - Removed unnecessary exclusion of EL1 with exclude_guest on VHE
 - Removed unnecessary isb from existing perf_event.c driver
 - Folded perf_event.c patches together
 - Added additional information to last patch commit message

Andrew Murray (5):
  arm64: arm_pmu: remove unnecessary isb instruction
  arm64: KVM: encapsulate kvm_cpu_context in kvm_host_data
  arm64: KVM: add accessors to track guest/host only counters
  arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  arm64: KVM: Enable support for :G/:H perf event modifiers

 arch/arm/include/asm/kvm_host.h   |  8 ++++--
 arch/arm64/include/asm/kvm_asm.h  |  3 +-
 arch/arm64/include/asm/kvm_host.h | 42 +++++++++++++++++++++++----
 arch/arm64/kernel/asm-offsets.c   |  1 +
 arch/arm64/kernel/perf_event.c    | 54 +++++++++++++++++++++++++++++------
 arch/arm64/kvm/hyp/switch.c       | 60 +++++++++++++++++++++++++++++++++++++++
 virt/kvm/arm/arm.c                | 12 ++++----
 7 files changed, 159 insertions(+), 21 deletions(-)

-- 
2.7.4


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v10 1/5] arm64: arm_pmu: remove unnecessary isb instruction
  2019-01-14 16:11 [PATCH v10 0/5] arm64: Support perf event modifiers :G and :H Andrew Murray
@ 2019-01-14 16:11 ` Andrew Murray
  2019-01-14 16:11 ` [PATCH v10 2/5] arm64: KVM: encapsulate kvm_cpu_context in kvm_host_data Andrew Murray
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Andrew Murray @ 2019-01-14 16:11 UTC (permalink / raw)
  To: Christoffer Dall, Marc Zyngier, Catalin Marinas, Will Deacon,
	Mark Rutland
  Cc: Julien Thierry, kvmarm, linux-arm-kernel, Suzuki K Poulose

The armv8pmu_enable_event_counter function issues an isb instruction
after enabling a pair of counters; this doesn't provide any value
and is inconsistent with armv8pmu_disable_event_counter.

In any case, armv8pmu_enable_event_counter is always called with the
PMU stopped. Starting the PMU with armv8pmu_start results in an isb
instruction being issued prior to writing to PMCR_EL0.

Let's remove the unnecessary isb instruction.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
---
 arch/arm64/kernel/perf_event.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 1620a37..1c71796 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -533,7 +533,6 @@ static inline void armv8pmu_enable_event_counter(struct perf_event *event)
 	armv8pmu_enable_counter(idx);
 	if (armv8pmu_event_is_chained(event))
 		armv8pmu_enable_counter(idx - 1);
-	isb();
 }
 
 static inline int armv8pmu_disable_counter(int idx)
-- 
2.7.4



* [PATCH v10 2/5] arm64: KVM: encapsulate kvm_cpu_context in kvm_host_data
  2019-01-14 16:11 [PATCH v10 0/5] arm64: Support perf event modifiers :G and :H Andrew Murray
  2019-01-14 16:11 ` [PATCH v10 1/5] arm64: arm_pmu: remove unnecessary isb instruction Andrew Murray
@ 2019-01-14 16:11 ` Andrew Murray
  2019-01-14 16:11 ` [PATCH v10 3/5] arm64: KVM: add accessors to track guest/host only counters Andrew Murray
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Andrew Murray @ 2019-01-14 16:11 UTC (permalink / raw)
  To: Christoffer Dall, Marc Zyngier, Catalin Marinas, Will Deacon,
	Mark Rutland
  Cc: Julien Thierry, kvmarm, linux-arm-kernel, Suzuki K Poulose

The virt/arm core allocates a kvm_cpu_context_t percpu variable; at
present this is a typedef to kvm_cpu_context and is used to store the
host CPU context. The kvm_cpu_context structure is also used elsewhere
to hold vcpu context. In order to use the percpu variable to hold
additional future host information, we encapsulate kvm_cpu_context in a
new structure and rename the typedef and percpu variable to match.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm/include/asm/kvm_host.h   |  8 ++++++--
 arch/arm64/include/asm/kvm_asm.h  |  3 ++-
 arch/arm64/include/asm/kvm_host.h | 14 +++++++++-----
 arch/arm64/kernel/asm-offsets.c   |  1 +
 virt/kvm/arm/arm.c                | 12 +++++++-----
 5 files changed, 25 insertions(+), 13 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index ca56537..6420b40 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -145,7 +145,11 @@ struct kvm_cpu_context {
 	u32 cp15[NR_CP15_REGS];
 };
 
-typedef struct kvm_cpu_context kvm_cpu_context_t;
+struct kvm_host_data {
+	struct kvm_cpu_context host_ctxt;
+};
+
+typedef struct kvm_host_data kvm_host_data_t;
 
 struct kvm_vcpu_arch {
 	struct kvm_cpu_context ctxt;
@@ -163,7 +167,7 @@ struct kvm_vcpu_arch {
 	struct kvm_vcpu_fault_info fault;
 
 	/* Host FP context */
-	kvm_cpu_context_t *host_cpu_context;
+	struct kvm_cpu_context *host_cpu_context;
 
 	/* VGIC state */
 	struct vgic_cpu vgic_cpu;
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index f5b79e9..ff73f54 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -108,7 +108,8 @@ extern u32 __kvm_get_mdcr_el2(void);
 .endm
 
 .macro get_host_ctxt reg, tmp
-	hyp_adr_this_cpu \reg, kvm_host_cpu_state, \tmp
+	hyp_adr_this_cpu \reg, kvm_host_data, \tmp
+	add	\reg, \reg, #HOST_DATA_CONTEXT
 .endm
 
 .macro get_vcpu_ptr vcpu, ctxt
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 7732d0b..26fd935 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -206,7 +206,11 @@ struct kvm_cpu_context {
 	struct kvm_vcpu *__hyp_running_vcpu;
 };
 
-typedef struct kvm_cpu_context kvm_cpu_context_t;
+struct kvm_host_data {
+	struct kvm_cpu_context host_ctxt;
+};
+
+typedef struct kvm_host_data kvm_host_data_t;
 
 struct kvm_vcpu_arch {
 	struct kvm_cpu_context ctxt;
@@ -242,7 +246,7 @@ struct kvm_vcpu_arch {
 	struct kvm_guest_debug_arch external_debug_state;
 
 	/* Pointer to host CPU context */
-	kvm_cpu_context_t *host_cpu_context;
+	struct kvm_cpu_context *host_cpu_context;
 
 	struct thread_info *host_thread_info;	/* hyp VA */
 	struct user_fpsimd_state *host_fpsimd_state;	/* hyp VA */
@@ -387,7 +391,7 @@ void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome);
 
 struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
 
-DECLARE_PER_CPU(kvm_cpu_context_t, kvm_host_cpu_state);
+DECLARE_PER_CPU(kvm_host_data_t, kvm_host_data);
 
 void __kvm_enable_ssbs(void);
 
@@ -400,8 +404,8 @@ static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
 	 * kernel's mapping to the linear mapping, and store it in tpidr_el2
 	 * so that we can use adr_l to access per-cpu variables in EL2.
 	 */
-	u64 tpidr_el2 = ((u64)this_cpu_ptr(&kvm_host_cpu_state) -
-			 (u64)kvm_ksym_ref(kvm_host_cpu_state));
+	u64 tpidr_el2 = ((u64)this_cpu_ptr(&kvm_host_data) -
+			 (u64)kvm_ksym_ref(kvm_host_data));
 
 	/*
 	 * Call initialization code, and switch to the full blown HYP code.
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 65b8afc..9b03554 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -146,6 +146,7 @@ int main(void)
   DEFINE(VCPU_FPEXC32_EL2,	offsetof(struct kvm_vcpu, arch.ctxt.sys_regs[FPEXC32_EL2]));
   DEFINE(VCPU_HOST_CONTEXT,	offsetof(struct kvm_vcpu, arch.host_cpu_context));
   DEFINE(HOST_CONTEXT_VCPU,	offsetof(struct kvm_cpu_context, __hyp_running_vcpu));
+  DEFINE(HOST_DATA_CONTEXT,	offsetof(struct kvm_host_data, host_ctxt));
 #endif
 #ifdef CONFIG_CPU_PM
   DEFINE(CPU_SUSPEND_SZ,	sizeof(struct cpu_suspend_ctx));
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 9e350fd3..89acb7f 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -56,7 +56,7 @@
 __asm__(".arch_extension	virt");
 #endif
 
-DEFINE_PER_CPU(kvm_cpu_context_t, kvm_host_cpu_state);
+DEFINE_PER_CPU(kvm_host_data_t, kvm_host_data);
 static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
 
 /* Per-CPU variable containing the currently running vcpu. */
@@ -363,8 +363,10 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
 	int *last_ran;
+	kvm_host_data_t *cpu_data;
 
 	last_ran = this_cpu_ptr(vcpu->kvm->arch.last_vcpu_ran);
+	cpu_data = this_cpu_ptr(&kvm_host_data);
 
 	/*
 	 * We might get preempted before the vCPU actually runs, but
@@ -376,7 +378,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	}
 
 	vcpu->cpu = cpu;
-	vcpu->arch.host_cpu_context = this_cpu_ptr(&kvm_host_cpu_state);
+	vcpu->arch.host_cpu_context = &cpu_data->host_ctxt;
 
 	kvm_arm_set_running_vcpu(vcpu);
 	kvm_vgic_load(vcpu);
@@ -1558,10 +1560,10 @@ static int init_hyp_mode(void)
 	}
 
 	for_each_possible_cpu(cpu) {
-		kvm_cpu_context_t *cpu_ctxt;
+		kvm_host_data_t *cpu_data;
 
-		cpu_ctxt = per_cpu_ptr(&kvm_host_cpu_state, cpu);
-		err = create_hyp_mappings(cpu_ctxt, cpu_ctxt + 1, PAGE_HYP);
+		cpu_data = per_cpu_ptr(&kvm_host_data, cpu);
+		err = create_hyp_mappings(cpu_data, cpu_data + 1, PAGE_HYP);
 
 		if (err) {
 			kvm_err("Cannot map host CPU state: %d\n", err);
-- 
2.7.4



* [PATCH v10 3/5] arm64: KVM: add accessors to track guest/host only counters
  2019-01-14 16:11 [PATCH v10 0/5] arm64: Support perf event modifiers :G and :H Andrew Murray
  2019-01-14 16:11 ` [PATCH v10 1/5] arm64: arm_pmu: remove unnecessary isb instruction Andrew Murray
  2019-01-14 16:11 ` [PATCH v10 2/5] arm64: KVM: encapsulate kvm_cpu_context in kvm_host_data Andrew Murray
@ 2019-01-14 16:11 ` Andrew Murray
  2019-01-14 16:11 ` [PATCH v10 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray
  2019-01-14 16:11 ` [PATCH v10 5/5] arm64: KVM: Enable support for :G/:H perf event modifiers Andrew Murray
  4 siblings, 0 replies; 15+ messages in thread
From: Andrew Murray @ 2019-01-14 16:11 UTC (permalink / raw)
  To: Christoffer Dall, Marc Zyngier, Catalin Marinas, Will Deacon,
	Mark Rutland
  Cc: Julien Thierry, kvmarm, linux-arm-kernel, Suzuki K Poulose

In order to efficiently enable/disable guest/host-only perf counters
at guest entry/exit, we add bitfields to kvm_host_data for guest and
host events, as well as accessors for updating them.
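
The semantics of the two accessors can be modelled outside the kernel as
plain bit operations on the per-cpu masks. The following is a standalone
sketch for illustration only: the struct layout and flag values mirror
the patch below, but the this_cpu_ptr plumbing is replaced by an
explicit argument and the function names are simplified.

```c
#include <assert.h>
#include <stdint.h>

#define KVM_PMU_EVENTS_HOST	1
#define KVM_PMU_EVENTS_GUEST	2

/* Standalone model of the per-cpu host/guest counter masks. */
struct kvm_pmu_events {
	uint32_t events_host;
	uint32_t events_guest;
};

/* Mirror of kvm_set_pmu_events: OR the counter bits into the
 * mask(s) selected by flags. */
static void set_pmu_events(struct kvm_pmu_events *pmu, uint32_t set, int flags)
{
	if (flags & KVM_PMU_EVENTS_HOST)
		pmu->events_host |= set;
	if (flags & KVM_PMU_EVENTS_GUEST)
		pmu->events_guest |= set;
}

/* Mirror of kvm_clr_pmu_events: always clear from both masks. */
static void clr_pmu_events(struct kvm_pmu_events *pmu, uint32_t clr)
{
	pmu->events_host &= ~clr;
	pmu->events_guest &= ~clr;
}
```

Note the asymmetry: setting is qualified by flags derived from the
exclude_host/exclude_guest attributes, while clearing unconditionally
removes the bits from both masks.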

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/include/asm/kvm_host.h | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 26fd935..cbfe3d1 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -206,8 +206,14 @@ struct kvm_cpu_context {
 	struct kvm_vcpu *__hyp_running_vcpu;
 };
 
+struct kvm_pmu_events {
+	u32 events_host;
+	u32 events_guest;
+};
+
 struct kvm_host_data {
 	struct kvm_cpu_context host_ctxt;
+	struct kvm_pmu_events pmu_events;
 };
 
 typedef struct kvm_host_data kvm_host_data_t;
@@ -468,11 +474,33 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
 
+#define KVM_PMU_EVENTS_HOST	1
+#define KVM_PMU_EVENTS_GUEST	2
+
 #ifdef CONFIG_KVM /* Avoid conflicts with core headers if CONFIG_KVM=n */
 static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
 {
 	return kvm_arch_vcpu_run_map_fp(vcpu);
 }
+static inline void kvm_set_pmu_events(u32 set, int flags)
+{
+	struct kvm_host_data *ctx = this_cpu_ptr(&kvm_host_data);
+
+	if (flags & KVM_PMU_EVENTS_HOST)
+		ctx->pmu_events.events_host |= set;
+	if (flags & KVM_PMU_EVENTS_GUEST)
+		ctx->pmu_events.events_guest |= set;
+}
+static inline void kvm_clr_pmu_events(u32 clr)
+{
+	struct kvm_host_data *ctx = this_cpu_ptr(&kvm_host_data);
+
+	ctx->pmu_events.events_host &= ~clr;
+	ctx->pmu_events.events_guest &= ~clr;
+}
+#else
+static inline void kvm_set_pmu_events(u32 set, int flags) {}
+static inline void kvm_clr_pmu_events(u32 clr) {}
 #endif
 
 static inline void kvm_arm_vhe_guest_enter(void)
-- 
2.7.4



* [PATCH v10 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2019-01-14 16:11 [PATCH v10 0/5] arm64: Support perf event modifiers :G and :H Andrew Murray
                   ` (2 preceding siblings ...)
  2019-01-14 16:11 ` [PATCH v10 3/5] arm64: KVM: add accessors to track guest/host only counters Andrew Murray
@ 2019-01-14 16:11 ` Andrew Murray
  2019-02-11 11:26   ` Will Deacon
  2019-02-18 21:53   ` Christoffer Dall
  2019-01-14 16:11 ` [PATCH v10 5/5] arm64: KVM: Enable support for :G/:H perf event modifiers Andrew Murray
  4 siblings, 2 replies; 15+ messages in thread
From: Andrew Murray @ 2019-01-14 16:11 UTC (permalink / raw)
  To: Christoffer Dall, Marc Zyngier, Catalin Marinas, Will Deacon,
	Mark Rutland
  Cc: Julien Thierry, kvmarm, linux-arm-kernel, Suzuki K Poulose

Add support for the :G and :H attributes in perf by handling the
exclude_host/exclude_guest event attributes.

We notify KVM of counters that we wish to be enabled or disabled on
guest entry/exit and thus defer starting or stopping :G events, as
per the event's exclude_host attribute.

With both VHE and non-VHE we switch the counters between host/guest
at EL2. We are able to eliminate counters counting host events on
the boundaries of guest entry/exit when using :G by filtering out
EL2 for exclude_host. However, when using :H, unless exclude_hv is
set on non-VHE, there is a small blackout window at guest entry/exit
where host events are not captured.
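
The counter mask handed to KVM for a (possibly chained) event can be
sketched in isolation. This is illustrative only: the constant values
are copied from the driver at this point in the series, and
event_counter_bits is a hypothetical helper standing in for the inline
computation done in armv8pmu_enable/disable_event_counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified copies of the driver's index helpers: software index 0 is
 * the cycle counter, event counters start at software index 1. */
#define ARMV8_IDX_COUNTER0	1
#define ARMV8_PMU_COUNTER_MASK	0x1f
#define ARMV8_IDX_TO_COUNTER(x)	(((x) - ARMV8_IDX_COUNTER0) & ARMV8_PMU_COUNTER_MASK)
#define BIT(n)			(1u << (n))

/* Build the PMCNTEN bit mask passed to kvm_set_pmu_events /
 * kvm_clr_pmu_events: the event's own counter, plus the preceding
 * counter when the event is chained across a pair. */
static uint32_t event_counter_bits(int idx, bool chained)
{
	uint32_t bits = BIT(ARMV8_IDX_TO_COUNTER(idx));

	if (chained)
		bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
	return bits;
}
```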

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/kernel/perf_event.c | 53 ++++++++++++++++++++++++++++++++++++------
 1 file changed, 46 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 1c71796..21c6831 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -26,6 +26,7 @@
 
 #include <linux/acpi.h>
 #include <linux/clocksource.h>
+#include <linux/kvm_host.h>
 #include <linux/of.h>
 #include <linux/perf/arm_pmu.h>
 #include <linux/platform_device.h>
@@ -528,11 +529,27 @@ static inline int armv8pmu_enable_counter(int idx)
 
 static inline void armv8pmu_enable_event_counter(struct perf_event *event)
 {
+	struct perf_event_attr *attr = &event->attr;
 	int idx = event->hw.idx;
+	int flags = 0;
+	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
 
-	armv8pmu_enable_counter(idx);
 	if (armv8pmu_event_is_chained(event))
-		armv8pmu_enable_counter(idx - 1);
+		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
+
+	if (!attr->exclude_host)
+		flags |= KVM_PMU_EVENTS_HOST;
+	if (!attr->exclude_guest)
+		flags |= KVM_PMU_EVENTS_GUEST;
+
+	kvm_set_pmu_events(counter_bits, flags);
+
+	/* We rely on the hypervisor switch code to enable guest counters */
+	if (!attr->exclude_host) {
+		armv8pmu_enable_counter(idx);
+		if (armv8pmu_event_is_chained(event))
+			armv8pmu_enable_counter(idx - 1);
+	}
 }
 
 static inline int armv8pmu_disable_counter(int idx)
@@ -545,11 +562,21 @@ static inline int armv8pmu_disable_counter(int idx)
 static inline void armv8pmu_disable_event_counter(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
+	struct perf_event_attr *attr = &event->attr;
 	int idx = hwc->idx;
+	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
 
 	if (armv8pmu_event_is_chained(event))
-		armv8pmu_disable_counter(idx - 1);
-	armv8pmu_disable_counter(idx);
+		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
+
+	kvm_clr_pmu_events(counter_bits);
+
+	/* We rely on the hypervisor switch code to disable guest counters */
+	if (!attr->exclude_host) {
+		if (armv8pmu_event_is_chained(event))
+			armv8pmu_disable_counter(idx - 1);
+		armv8pmu_disable_counter(idx);
+	}
 }
 
 static inline int armv8pmu_enable_intens(int idx)
@@ -824,16 +851,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
 	 * Therefore we ignore exclude_hv in this configuration, since
 	 * there's no hypervisor to sample anyway. This is consistent
 	 * with other architectures (x86 and Power).
+	 *
+	 * To eliminate counting host events on the boundaries of
+	 * guest entry/exit we ensure EL2 is not included in hyp mode
+	 * with !exclude_host.
 	 */
 	if (is_kernel_in_hyp_mode()) {
-		if (!attr->exclude_kernel)
+		if (!attr->exclude_kernel && !attr->exclude_host)
 			config_base |= ARMV8_PMU_INCLUDE_EL2;
 	} else {
-		if (attr->exclude_kernel)
-			config_base |= ARMV8_PMU_EXCLUDE_EL1;
 		if (!attr->exclude_hv)
 			config_base |= ARMV8_PMU_INCLUDE_EL2;
 	}
+
+	/*
+	 * Filter out !VHE kernels and guest kernels
+	 */
+	if (attr->exclude_kernel)
+		config_base |= ARMV8_PMU_EXCLUDE_EL1;
+
 	if (attr->exclude_user)
 		config_base |= ARMV8_PMU_EXCLUDE_EL0;
 
@@ -863,6 +899,9 @@ static void armv8pmu_reset(void *info)
 		armv8pmu_disable_intens(idx);
 	}
 
+	/* Clear the counters we flip at guest entry/exit */
+	kvm_clr_pmu_events(U32_MAX);
+
 	/*
 	 * Initialize & Reset PMNC. Request overflow interrupt for
 	 * 64 bit cycle counter but cheat in armv8pmu_write_counter().
-- 
2.7.4



* [PATCH v10 5/5] arm64: KVM: Enable support for :G/:H perf event modifiers
  2019-01-14 16:11 [PATCH v10 0/5] arm64: Support perf event modifiers :G and :H Andrew Murray
                   ` (3 preceding siblings ...)
  2019-01-14 16:11 ` [PATCH v10 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray
@ 2019-01-14 16:11 ` Andrew Murray
  2019-02-18 22:00   ` Christoffer Dall
  4 siblings, 1 reply; 15+ messages in thread
From: Andrew Murray @ 2019-01-14 16:11 UTC (permalink / raw)
  To: Christoffer Dall, Marc Zyngier, Catalin Marinas, Will Deacon,
	Mark Rutland
  Cc: Julien Thierry, kvmarm, linux-arm-kernel, Suzuki K Poulose

Enable/disable event counters as appropriate when entering and exiting
the guest to enable support for guest-only or host-only event counting.

For both VHE and non-VHE we switch the counters between host/guest at
EL2. EL2 is filtered out by the PMU when we are using the :G modifier.

The PMU may be on when we change which counters are enabled; however,
we avoid adding an isb and instead rely on existing context
synchronisation events: the isb in kvm_arm_vhe_guest_exit for VHE, and
the eret from the hvc in kvm_call_hyp.
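
The switch performed at EL2 reduces to a simple reconciliation of the
two masks: only bits that differ between host and guest need a sysreg
write. Below is a standalone sketch of that arithmetic (not the hyp
code itself, which must live in __hyp_text and write
pmcntenclr_el0/pmcntenset_el0); struct pmu_delta is an illustrative
name, not from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Model of __pmu_switch_to_guest: the bits that would be written to
 * pmcntenclr_el0 and pmcntenset_el0 on guest entry. */
struct pmu_delta {
	uint32_t clr;	/* host-only counters: disable for the guest */
	uint32_t set;	/* guest-only counters: enable for the guest */
};

static struct pmu_delta switch_to_guest(uint32_t events_host,
					uint32_t events_guest)
{
	struct pmu_delta d;

	/* Counters enabled in both contexts are left untouched,
	 * avoiding unnecessary sysreg writes. */
	d.clr = events_host & ~events_guest;
	d.set = events_guest & ~events_host;
	return d;
}
```

__pmu_switch_to_host is the same computation with the two masks
swapped, which is why a round trip through the guest restores the host
enable set exactly.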

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/kvm/hyp/switch.c | 60 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index b0b1478..9018fb3 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -357,6 +357,54 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
 	return true;
 }
 
+static bool __hyp_text __pmu_switch_to_guest(struct kvm_cpu_context *host_ctxt)
+{
+	struct kvm_host_data *host;
+	struct kvm_pmu_events *pmu;
+	u32 clr, set;
+
+	host = container_of(host_ctxt, struct kvm_host_data, host_ctxt);
+	pmu = &host->pmu_events;
+
+	/* We can potentially avoid a sysreg write by only changing bits that
+	 * differ between the guest/host. E.g. where events are enabled in
+	 * both guest and host
+	 */
+	clr = pmu->events_host & ~pmu->events_guest;
+	set = pmu->events_guest & ~pmu->events_host;
+
+	if (clr)
+		write_sysreg(clr, pmcntenclr_el0);
+
+	if (set)
+		write_sysreg(set, pmcntenset_el0);
+
+	return (clr || set);
+}
+
+static void __hyp_text __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt)
+{
+	struct kvm_host_data *host;
+	struct kvm_pmu_events *pmu;
+	u32 clr, set;
+
+	host = container_of(host_ctxt, struct kvm_host_data, host_ctxt);
+	pmu = &host->pmu_events;
+
+	/* We can potentially avoid a sysreg write by only changing bits that
+	 * differ between the guest/host. E.g. where events are enabled in
+	 * both guest and host
+	 */
+	clr = pmu->events_guest & ~pmu->events_host;
+	set = pmu->events_host & ~pmu->events_guest;
+
+	if (clr)
+		write_sysreg(clr, pmcntenclr_el0);
+
+	if (set)
+		write_sysreg(set, pmcntenset_el0);
+}
+
 /*
  * Return true when we were able to fixup the guest exit and should return to
  * the guest, false when we should restore the host state and return to the
@@ -464,12 +512,15 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpu_context *host_ctxt;
 	struct kvm_cpu_context *guest_ctxt;
+	bool pmu_switch_needed;
 	u64 exit_code;
 
 	host_ctxt = vcpu->arch.host_cpu_context;
 	host_ctxt->__hyp_running_vcpu = vcpu;
 	guest_ctxt = &vcpu->arch.ctxt;
 
+	pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
+
 	sysreg_save_host_state_vhe(host_ctxt);
 
 	/*
@@ -511,6 +562,9 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
 
 	__debug_switch_to_host(vcpu);
 
+	if (pmu_switch_needed)
+		__pmu_switch_to_host(host_ctxt);
+
 	return exit_code;
 }
 
@@ -519,6 +573,7 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpu_context *host_ctxt;
 	struct kvm_cpu_context *guest_ctxt;
+	bool pmu_switch_needed;
 	u64 exit_code;
 
 	vcpu = kern_hyp_va(vcpu);
@@ -527,6 +582,8 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
 	host_ctxt->__hyp_running_vcpu = vcpu;
 	guest_ctxt = &vcpu->arch.ctxt;
 
+	pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
+
 	__sysreg_save_state_nvhe(host_ctxt);
 
 	__activate_vm(kern_hyp_va(vcpu->kvm));
@@ -573,6 +630,9 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
 	 */
 	__debug_switch_to_host(vcpu);
 
+	if (pmu_switch_needed)
+		__pmu_switch_to_host(host_ctxt);
+
 	return exit_code;
 }
 
-- 
2.7.4



* Re: [PATCH v10 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2019-01-14 16:11 ` [PATCH v10 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray
@ 2019-02-11 11:26   ` Will Deacon
  2019-02-18 21:53   ` Christoffer Dall
  1 sibling, 0 replies; 15+ messages in thread
From: Will Deacon @ 2019-02-11 11:26 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Mark Rutland, Suzuki K Poulose, Marc Zyngier, Catalin Marinas,
	Julien Thierry, Christoffer Dall, kvmarm, linux-arm-kernel

Hi Andrew,

On Mon, Jan 14, 2019 at 04:11:47PM +0000, Andrew Murray wrote:
> Add support for the :G and :H attributes in perf by handling the
> exclude_host/exclude_guest event attributes.
> 
> We notify KVM of counters that we wish to be enabled or disabled on
> guest entry/exit and thus defer from starting or stopping :G events
> as per the events exclude_host attribute.
> 
> With both VHE and non-VHE we switch the counters between host/guest
> at EL2. We are able to eliminate counters counting host events on
> the boundaries of guest entry/exit when using :G by filtering out
> EL2 for exclude_host. However when using :H unless exclude_hv is set
> on non-VHE then there is a small blackout window at the guest
> entry/exit where host events are not captured.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  arch/arm64/kernel/perf_event.c | 53 ++++++++++++++++++++++++++++++++++++------
>  1 file changed, 46 insertions(+), 7 deletions(-)

[...]

>  static inline int armv8pmu_enable_intens(int idx)
> @@ -824,16 +851,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
>  	 * Therefore we ignore exclude_hv in this configuration, since
>  	 * there's no hypervisor to sample anyway. This is consistent
>  	 * with other architectures (x86 and Power).
> +	 *
> +	 * To eliminate counting host events on the boundaries of
> +	 * guest entry/exit we ensure EL2 is not included in hyp mode
> +	 * with !exclude_host.
>  	 */
>  	if (is_kernel_in_hyp_mode()) {
> -		if (!attr->exclude_kernel)
> +		if (!attr->exclude_kernel && !attr->exclude_host)
>  			config_base |= ARMV8_PMU_INCLUDE_EL2;
>  	} else {
> -		if (attr->exclude_kernel)
> -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
>  		if (!attr->exclude_hv)
>  			config_base |= ARMV8_PMU_INCLUDE_EL2;
>  	}
> +
> +	/*
> +	 * Filter out !VHE kernels and guest kernels
> +	 */
> +	if (attr->exclude_kernel)
> +		config_base |= ARMV8_PMU_EXCLUDE_EL1;
> +

This series needs acks from the KVM folks, but I'd also like Christoffer to
confirm that he's happy with the way these checks ended up because I lost
track of where we got to on that.

Will


* Re: [PATCH v10 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2019-01-14 16:11 ` [PATCH v10 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray
  2019-02-11 11:26   ` Will Deacon
@ 2019-02-18 21:53   ` Christoffer Dall
  2019-02-20 16:15     ` Andrew Murray
  1 sibling, 1 reply; 15+ messages in thread
From: Christoffer Dall @ 2019-02-18 21:53 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Mark Rutland, Suzuki K Poulose, Marc Zyngier, Catalin Marinas,
	Julien Thierry, Will Deacon, kvmarm, linux-arm-kernel

On Mon, Jan 14, 2019 at 04:11:47PM +0000, Andrew Murray wrote:
> Add support for the :G and :H attributes in perf by handling the
> exclude_host/exclude_guest event attributes.
> 
> We notify KVM of counters that we wish to be enabled or disabled on
> guest entry/exit and thus defer from starting or stopping :G events
> as per the events exclude_host attribute.
> 
> With both VHE and non-VHE we switch the counters between host/guest
> at EL2. We are able to eliminate counters counting host events on
> the boundaries of guest entry/exit when using :G by filtering out
> EL2 for exclude_host. However when using :H unless exclude_hv is set
> on non-VHE then there is a small blackout window at the guest
> entry/exit where host events are not captured.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  arch/arm64/kernel/perf_event.c | 53 ++++++++++++++++++++++++++++++++++++------
>  1 file changed, 46 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index 1c71796..21c6831 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -26,6 +26,7 @@
>  
>  #include <linux/acpi.h>
>  #include <linux/clocksource.h>
> +#include <linux/kvm_host.h>
>  #include <linux/of.h>
>  #include <linux/perf/arm_pmu.h>
>  #include <linux/platform_device.h>
> @@ -528,11 +529,27 @@ static inline int armv8pmu_enable_counter(int idx)
>  
>  static inline void armv8pmu_enable_event_counter(struct perf_event *event)
>  {
> +	struct perf_event_attr *attr = &event->attr;
>  	int idx = event->hw.idx;
> +	int flags = 0;
> +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
>  
> -	armv8pmu_enable_counter(idx);
>  	if (armv8pmu_event_is_chained(event))
> -		armv8pmu_enable_counter(idx - 1);
> +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> +
> +	if (!attr->exclude_host)
> +		flags |= KVM_PMU_EVENTS_HOST;
> +	if (!attr->exclude_guest)
> +		flags |= KVM_PMU_EVENTS_GUEST;
> +
> +	kvm_set_pmu_events(counter_bits, flags);
> +
> +	/* We rely on the hypervisor switch code to enable guest counters */
> +	if (!attr->exclude_host) {
> +		armv8pmu_enable_counter(idx);
> +		if (armv8pmu_event_is_chained(event))
> +			armv8pmu_enable_counter(idx - 1);
> +	}
>  }
>  
>  static inline int armv8pmu_disable_counter(int idx)
> @@ -545,11 +562,21 @@ static inline int armv8pmu_disable_counter(int idx)
>  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
>  {
>  	struct hw_perf_event *hwc = &event->hw;
> +	struct perf_event_attr *attr = &event->attr;
>  	int idx = hwc->idx;
> +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
>  
>  	if (armv8pmu_event_is_chained(event))
> -		armv8pmu_disable_counter(idx - 1);
> -	armv8pmu_disable_counter(idx);
> +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> +
> +	kvm_clr_pmu_events(counter_bits);
> +
> +	/* We rely on the hypervisor switch code to disable guest counters */
> +	if (!attr->exclude_host) {
> +		if (armv8pmu_event_is_chained(event))
> +			armv8pmu_disable_counter(idx - 1);
> +		armv8pmu_disable_counter(idx);
> +	}
>  }
>  
>  static inline int armv8pmu_enable_intens(int idx)
> @@ -824,16 +851,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
>  	 * Therefore we ignore exclude_hv in this configuration, since
>  	 * there's no hypervisor to sample anyway. This is consistent
>  	 * with other architectures (x86 and Power).
> +	 *
> +	 * To eliminate counting host events on the boundaries of
					   ^comma

> +	 * guest entry/exit we ensure EL2 is not included in hyp mode
			   ^comma (or rework sentence)

What do you mean by "EL2 is not included in hyp mode" ??

> +	 * with !exclude_host.
>  	 */
>  	if (is_kernel_in_hyp_mode()) {
> -		if (!attr->exclude_kernel)
> +		if (!attr->exclude_kernel && !attr->exclude_host)
>  			config_base |= ARMV8_PMU_INCLUDE_EL2;
>  	} else {
> -		if (attr->exclude_kernel)
> -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
>  		if (!attr->exclude_hv)
>  			config_base |= ARMV8_PMU_INCLUDE_EL2;
>  	}
> +
> +	/*
> +	 * Filter out !VHE kernels and guest kernels
> +	 */
> +	if (attr->exclude_kernel)
> +		config_base |= ARMV8_PMU_EXCLUDE_EL1;
> +
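
For reference, the filter selection above is self-contained enough to model
outside the kernel. A user-space sketch (the EL0/exclude_user handling is not
part of this hunk and the bit positions are taken from perf_event.c at the
time, so treat both as assumptions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Filter bits in PMEVTYPER<n>_EL0, as defined in perf_event.c. */
#define ARMV8_PMU_EXCLUDE_EL1	(1U << 31)
#define ARMV8_PMU_EXCLUDE_EL0	(1U << 30)
#define ARMV8_PMU_INCLUDE_EL2	(1U << 27)

struct attr {
	bool exclude_user, exclude_kernel, exclude_hv, exclude_host;
};

/* User-space model of armv8pmu_set_event_filter() after this patch. */
static uint32_t set_event_filter(const struct attr *attr, bool vhe)
{
	uint32_t config_base = 0;

	if (vhe) {
		/* VHE: the host kernel runs at EL2; only count it when
		 * neither the kernel nor the host is excluded. */
		if (!attr->exclude_kernel && !attr->exclude_host)
			config_base |= ARMV8_PMU_INCLUDE_EL2;
	} else {
		if (!attr->exclude_hv)
			config_base |= ARMV8_PMU_INCLUDE_EL2;
	}

	/* Filter out EL1: !VHE host kernels and guest kernels. */
	if (attr->exclude_kernel)
		config_base |= ARMV8_PMU_EXCLUDE_EL1;
	if (attr->exclude_user)
		config_base |= ARMV8_PMU_EXCLUDE_EL0;

	return config_base;
}
```

With exclude_kernel set this now yields EXCLUDE_EL1 on both VHE and !VHE,
which is the point of moving the hunk.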

Let me see if I get this right:

	exclude_user:	VHE: Don't count EL0
			Non-VHE: Don't count EL0

	exclude_kernel: VHE: Don't count EL2 and don't count EL1
			Non-VHE: Don't count EL1

	exclude_hv:	VHE: No effect
			Non-VHE: Don't count EL2

	exclude_host:	VHE: Don't count EL2 + enable/disable on guest entry/exit
			Non-VHE: enable/disable on guest entry/exit
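
Encoding that table as a predicate makes the reading testable; a user-space
sketch of the semantics as I extract them (this models the table above, not
necessarily the final kernel behaviour):

```c
#include <assert.h>
#include <stdbool.h>

enum el { EL0, EL1, EL2 };

struct excl {
	bool user, kernel, hv, host;
};

/* Does an event with these exclude_* flags count while the *host* is
 * executing at 'el'?  Encodes the table above for VHE and non-VHE. */
static bool host_counts_at(enum el el, bool vhe, struct excl x)
{
	if (x.host)		/* exclude_host gates all host-side counting */
		return false;

	switch (el) {
	case EL0:
		return !x.user;
	case EL1:
		return !x.kernel;
	case EL2:
		if (vhe)		/* VHE kernel runs at EL2 */
			return !x.kernel;
		return !x.hv;		/* non-VHE: EL2 is hypervisor code */
	}
	return false;
}
```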

And the logic I extract is that _user applies across both guest and
host, as does _kernel (regardless of the mode the kernel on the current
system runs in, might be only EL1, might be EL1 and EL2), and _hv is
specific to non-VHE systems to measure events in a specific piece of KVM
code that runs at EL2.

As I expressed before, that doesn't seem to be the intent behind the
exclude_hv flag, but I'm not sure how other architectures actually
implement things today. If it's a curiosity of the Arm architecture
that has value to non-VHE hypervisor hackers, and we don't really have
to care about uniformity with the other architectures, then fine.

It has taken me a while to make sense of this code change, so I really
wish we could find a suitable place to document the semantics clearly
for perf users on arm64.

Now, another thing comes to mind:  Do we really need to enable and
disable anything on a VHE system on entry/exit to/from a guest?  Can we
instead do the following:

	exclude_host:	Disable EL2 counting
		     	Disable EL0 counting
		     	Enable EL0 counting on vcpu_load
		     	  (unless exclude_user is also set)
		     	Disable EL0 counting on vcpu_put

	exclude_guest:	Disable EL1 counting
		      	Disable EL0 counting on vcpu_load
		      	Enable EL0 counting on vcpu_put
			  (unless exclude_user is also set)

If that works, we can avoid the overhead in the critical path on VHE
systems and actually have slightly more accurate counting, leaving the
entry/exit operations to be specific to non-VHE.
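
A rough state-machine sketch of that proposal (hypothetical helper names;
this only models which exception levels a VHE host event would count, not
the sysreg programming):

```c
#include <assert.h>
#include <stdbool.h>

/* Which exception levels a VHE event counts under the proposed
 * vcpu_load/vcpu_put scheme.  Purely illustrative. */
struct vhe_event {
	bool exclude_user, exclude_host, exclude_guest;
	bool el0, el1, el2;	/* currently counting at this level */
};

static void program_event(struct vhe_event *e)
{
	e->el2 = !e->exclude_host;	/* host kernel runs at EL2 */
	e->el1 = !e->exclude_guest;	/* guest kernel runs at EL1 */
	e->el0 = !e->exclude_user && !e->exclude_host; /* host EL0 by default */
}

static void vcpu_load(struct vhe_event *e)
{
	/* Entering the guest context: EL0 now means guest userspace. */
	e->el0 = !e->exclude_user && !e->exclude_guest;
}

static void vcpu_put(struct vhe_event *e)
{
	/* Back in the host: EL0 means host userspace again. */
	e->el0 = !e->exclude_user && !e->exclude_host;
}
```

The attraction is that only vcpu_load/vcpu_put touch the event, so the
world-switch path itself stays untouched.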


Thanks,

    Christoffer

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v10 5/5] arm64: KVM: Enable support for :G/:H perf event modifiers
  2019-01-14 16:11 ` [PATCH v10 5/5] arm64: KVM: Enable support for :G/:H perf event modifiers Andrew Murray
@ 2019-02-18 22:00   ` Christoffer Dall
  2019-03-04  9:40     ` Andrew Murray
  0 siblings, 1 reply; 15+ messages in thread
From: Christoffer Dall @ 2019-02-18 22:00 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Mark Rutland, Suzuki K Poulose, Marc Zyngier, Catalin Marinas,
	Julien Thierry, Will Deacon, kvmarm, linux-arm-kernel

On Mon, Jan 14, 2019 at 04:11:48PM +0000, Andrew Murray wrote:
> Enable/disable event counters as appropriate when entering and exiting
> the guest to enable support for guest or host only event counting.
> 
> For both VHE and non-VHE we switch the counters between host/guest at
> EL2. EL2 is filtered out by the PMU when we are using the :G modifier.

I don't think the last part is strictly true: as per the previous
patch, on a non-VHE system with the :H modifier it doesn't hold, so
maybe just leave that out of the commit message.

> 
> The PMU may be on when we change which counters are enabled however
> we avoid adding an isb as we instead rely on existing context
> synchronisation events: the isb in kvm_arm_vhe_guest_exit for VHE and
> the eret from the hvc in kvm_call_hyp.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  arch/arm64/kvm/hyp/switch.c | 60 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 60 insertions(+)
> 
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index b0b1478..9018fb3 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -357,6 +357,54 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
>  	return true;
>  }
>  
> +static bool __hyp_text __pmu_switch_to_guest(struct kvm_cpu_context *host_ctxt)
> +{
> +	struct kvm_host_data *host;
> +	struct kvm_pmu_events *pmu;
> +	u32 clr, set;
> +
> +	host = container_of(host_ctxt, struct kvm_host_data, host_ctxt);
> +	pmu = &host->pmu_events;
> +
> +	/* We can potentially avoid a sysreg write by only changing bits that
> +	 * differ between the guest/host. E.g. where events are enabled in
> +	 * both guest and host
> +	 */

super nit: kernel coding style requires 'wings' on both side of a
multi-line comment.  Only if you respin anyhow.

> +	clr = pmu->events_host & ~pmu->events_guest;
> +	set = pmu->events_guest & ~pmu->events_host;
> +
> +	if (clr)
> +		write_sysreg(clr, pmcntenclr_el0);
> +
> +	if (set)
> +		write_sysreg(set, pmcntenset_el0);
> +
> +	return (clr || set);
> +}
> +
> +static void __hyp_text __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt)
> +{
> +	struct kvm_host_data *host;
> +	struct kvm_pmu_events *pmu;
> +	u32 clr, set;
> +
> +	host = container_of(host_ctxt, struct kvm_host_data, host_ctxt);
> +	pmu = &host->pmu_events;
> +
> +	/* We can potentially avoid a sysreg write by only changing bits that
> +	 * differ between the guest/host. E.g. where events are enabled in
> +	 * both guest and host
> +	 */

ditto

> +	clr = pmu->events_guest & ~pmu->events_host;
> +	set = pmu->events_host & ~pmu->events_guest;
> +
> +	if (clr)
> +		write_sysreg(clr, pmcntenclr_el0);
> +
> +	if (set)
> +		write_sysreg(set, pmcntenset_el0);
> +}
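
The mask arithmetic in the two helpers above is symmetric and easy to check
on its own; a user-space sketch with made-up mask values:

```c
#include <assert.h>
#include <stdint.h>

struct pmu_masks { uint32_t clr, set; };

/* Bits to clear/set in PMCNTEN* when moving from the 'from' mask to
 * the 'to' mask -- mirrors __pmu_switch_to_guest/__pmu_switch_to_host. */
static struct pmu_masks pmu_switch(uint32_t from, uint32_t to)
{
	struct pmu_masks m;

	m.clr = from & ~to;	/* enabled before, not wanted after */
	m.set = to & ~from;	/* wanted after, not enabled before */
	return m;
}
```

clr and set are disjoint by construction, and both are zero when guest and
host enable the same counters, which is what makes the conditional sysreg
writes safe.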
> +
>  /*
>   * Return true when we were able to fixup the guest exit and should return to
>   * the guest, false when we should restore the host state and return to the
> @@ -464,12 +512,15 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
>  {
>  	struct kvm_cpu_context *host_ctxt;
>  	struct kvm_cpu_context *guest_ctxt;
> +	bool pmu_switch_needed;
>  	u64 exit_code;
>  
>  	host_ctxt = vcpu->arch.host_cpu_context;
>  	host_ctxt->__hyp_running_vcpu = vcpu;
>  	guest_ctxt = &vcpu->arch.ctxt;
>  
> +	pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
> +
>  	sysreg_save_host_state_vhe(host_ctxt);
>  
>  	/*
> @@ -511,6 +562,9 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
>  
>  	__debug_switch_to_host(vcpu);
>  
> +	if (pmu_switch_needed)
> +		__pmu_switch_to_host(host_ctxt);
> +
>  	return exit_code;
>  }
>  
> @@ -519,6 +573,7 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
>  {
>  	struct kvm_cpu_context *host_ctxt;
>  	struct kvm_cpu_context *guest_ctxt;
> +	bool pmu_switch_needed;
>  	u64 exit_code;
>  
>  	vcpu = kern_hyp_va(vcpu);
> @@ -527,6 +582,8 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
>  	host_ctxt->__hyp_running_vcpu = vcpu;
>  	guest_ctxt = &vcpu->arch.ctxt;
>  
> +	pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
> +
>  	__sysreg_save_state_nvhe(host_ctxt);
>  
>  	__activate_vm(kern_hyp_va(vcpu->kvm));
> @@ -573,6 +630,9 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
>  	 */
>  	__debug_switch_to_host(vcpu);
>  
> +	if (pmu_switch_needed)
> +		__pmu_switch_to_host(host_ctxt);
> +
>  	return exit_code;
>  }
>  
> -- 
> 2.7.4
> 

Thanks,

    Christoffer


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v10 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2019-02-18 21:53   ` Christoffer Dall
@ 2019-02-20 16:15     ` Andrew Murray
  2019-02-26 12:44       ` Christoffer Dall
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Murray @ 2019-02-20 16:15 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Mark Rutland, Suzuki K Poulose, Marc Zyngier, Catalin Marinas,
	Julien Thierry, Will Deacon, kvmarm, linux-arm-kernel

On Mon, Feb 18, 2019 at 10:53:07PM +0100, Christoffer Dall wrote:
> On Mon, Jan 14, 2019 at 04:11:47PM +0000, Andrew Murray wrote:
> > Add support for the :G and :H attributes in perf by handling the
> > exclude_host/exclude_guest event attributes.
> > 
> > We notify KVM of counters that we wish to be enabled or disabled on
> > guest entry/exit and thus defer starting or stopping :G events,
> > as per the event's exclude_host attribute.
> > 
> > With both VHE and non-VHE we switch the counters between host/guest
> > at EL2. We are able to eliminate counters counting host events on
> > the boundaries of guest entry/exit when using :G by filtering out
> > EL2 for exclude_host. However when using :H unless exclude_hv is set
> > on non-VHE then there is a small blackout window at the guest
> > entry/exit where host events are not captured.
> > 
> > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> > ---
> >  arch/arm64/kernel/perf_event.c | 53 ++++++++++++++++++++++++++++++++++++------
> >  1 file changed, 46 insertions(+), 7 deletions(-)
> > 
> > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > index 1c71796..21c6831 100644
> > --- a/arch/arm64/kernel/perf_event.c
> > +++ b/arch/arm64/kernel/perf_event.c
> > @@ -26,6 +26,7 @@
> >  
> >  #include <linux/acpi.h>
> >  #include <linux/clocksource.h>
> > +#include <linux/kvm_host.h>
> >  #include <linux/of.h>
> >  #include <linux/perf/arm_pmu.h>
> >  #include <linux/platform_device.h>
> > @@ -528,11 +529,27 @@ static inline int armv8pmu_enable_counter(int idx)
> >  
> >  static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> >  {
> > +	struct perf_event_attr *attr = &event->attr;
> >  	int idx = event->hw.idx;
> > +	int flags = 0;
> > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> >  
> > -	armv8pmu_enable_counter(idx);
> >  	if (armv8pmu_event_is_chained(event))
> > -		armv8pmu_enable_counter(idx - 1);
> > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > +
> > +	if (!attr->exclude_host)
> > +		flags |= KVM_PMU_EVENTS_HOST;
> > +	if (!attr->exclude_guest)
> > +		flags |= KVM_PMU_EVENTS_GUEST;
> > +
> > +	kvm_set_pmu_events(counter_bits, flags);
> > +
> > +	/* We rely on the hypervisor switch code to enable guest counters */
> > +	if (!attr->exclude_host) {
> > +		armv8pmu_enable_counter(idx);
> > +		if (armv8pmu_event_is_chained(event))
> > +			armv8pmu_enable_counter(idx - 1);
> > +	}
> >  }
> >  
> >  static inline int armv8pmu_disable_counter(int idx)
> > @@ -545,11 +562,21 @@ static inline int armv8pmu_disable_counter(int idx)
> >  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
> >  {
> >  	struct hw_perf_event *hwc = &event->hw;
> > +	struct perf_event_attr *attr = &event->attr;
> >  	int idx = hwc->idx;
> > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> >  
> >  	if (armv8pmu_event_is_chained(event))
> > -		armv8pmu_disable_counter(idx - 1);
> > -	armv8pmu_disable_counter(idx);
> > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > +
> > +	kvm_clr_pmu_events(counter_bits);
> > +
> > +	/* We rely on the hypervisor switch code to disable guest counters */
> > +	if (!attr->exclude_host) {
> > +		if (armv8pmu_event_is_chained(event))
> > +			armv8pmu_disable_counter(idx - 1);
> > +		armv8pmu_disable_counter(idx);
> > +	}
> >  }
> >  
> >  static inline int armv8pmu_enable_intens(int idx)
> > @@ -824,16 +851,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
> >  	 * Therefore we ignore exclude_hv in this configuration, since
> >  	 * there's no hypervisor to sample anyway. This is consistent
> >  	 * with other architectures (x86 and Power).
> > +	 *
> > +	 * To eliminate counting host events on the boundaries of
> 					   ^comma
> 
> > +	 * guest entry/exit we ensure EL2 is not included in hyp mode
> 			   ^comma (or rework sentence)
> 
> What do you mean by "EL2 is not included in hyp mode" ??

This attempts to explain the addition of '!attr->exclude_host' when
is_kernel_in_hyp_mode is true below.

We have no need to count EL2 when counting any guest events - by adding
this hunk we can eliminate counting host events in the small period of
time in the switch code between enabling the counter and entering the
guest.

Perhaps I should rephrase it to "To eliminate counting host events on
the boundaries of guest entry/exit, we ensure EL2 is not included when
counting guest events in hyp mode." ?

> 
> > +	 * with !exclude_host.
> >  	 */
> >  	if (is_kernel_in_hyp_mode()) {
> > -		if (!attr->exclude_kernel)
> > +		if (!attr->exclude_kernel && !attr->exclude_host)
> >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> >  	} else {
> > -		if (attr->exclude_kernel)
> > -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
> >  		if (!attr->exclude_hv)
> >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> >  	}
> > +
> > +	/*
> > +	 * Filter out !VHE kernels and guest kernels
> > +	 */
> > +	if (attr->exclude_kernel)
> > +		config_base |= ARMV8_PMU_EXCLUDE_EL1;
> > +
> 
> Let me see if I get this right:
> 
> 	exclude_user:	VHE: Don't count EL0
> 			Non-VHE: Don't count EL0
> 
> 	exclude_kernel: VHE: Don't count EL2 and don't count EL1
> 			Non-VHE: Don't count EL1
> 
> 	exclude_hv:	VHE: No effect
> 			Non-VHE: Don't count EL2
> 
> 	exclude_host:	VHE: Don't count EL2 + enable/disable on guest entry/exit
> 			Non-VHE: enable/disable on guest entry/exit
> 
> And the logic I extract is that _user applies across both guest and
> host, as does _kernel (regardless of the mode the kernel on the current
> system runs in, might be only EL1, might be EL1 and EL2), and _hv is
> specific to non-VHE systems to measure events in a specific piece of KVM
> code that runs at EL2.
> 
> As I expressed before, that doesn't seem to be the intent behind the
> exclude_hv flag, but I'm not sure how other architectures actually
> implement things today. If it's a curiosity of the Arm architecture
> that has value to non-VHE hypervisor hackers, and we don't really have
> to care about uniformity with the other architectures, then fine.
> 
> It has taken me a while to make sense of this code change, so I really
> wish we could find a suitable place to document the semantics clearly
> for perf users on arm64.
> 
> Now, another thing comes to mind:  Do we really need to enable and
> disable anything on a VHE system on entry/exit to/from a guest?  Can we
> instead do the following:
> 
> 	exclude_host:	Disable EL2 counting
> 		     	Disable EL0 counting
> 		     	Enable EL0 counting on vcpu_load
> 		     	  (unless exclude_user is also set)
> 		     	Disable EL0 counting on vcpu_put
> 
> 	exclude_guest:	Disable EL1 counting
> 		      	Disable EL0 counting on vcpu_load
> 		      	Enable EL0 counting on vcpu_put
> 			  (unless exclude_user is also set)
> 
> If that works, we can avoid the overhead in the critical path on VHE
> systems and actually have slightly more accurate counting, leaving the
> entry/exit operations to be specific to non-VHE.

This all makes sense.

At present on VHE, for host-only events, there is a small blackout window
at the guest entry/exit - this is where we turn off/on host counting before
entering/exiting the guest. (This blackout window also exists on !VHE unless
exclude_hv is set).

To mitigate this, the PMU switching code was brought as close to the
guest entry/exit as possible - but as you point out at this point in
kvm_arch_vcpu_ioctl_run we're running without interrupts/preemption and so
on the critical path. I believe we add about 11 instructions when there are
no PMU counters enabled and about 23 when they are enabled. I suppose it
would be possible to use static keys to reduce the overhead when counters
are not enabled...

Your suggestion provides an optimal solution, however it adds some
complexity - here is a rough attempt at implementing it...


diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index cbfe3d1..bc548e6 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -482,10 +482,23 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
 {
        return kvm_arch_vcpu_run_map_fp(vcpu);
 }
-static inline void kvm_set_pmu_events(u32 set, int flags)
+static inline bool kvm_pmu_switch_events(struct perf_event_attr *attr)
+{
+       if (!has_vhe())
+               return true;
+
+       if (attr->exclude_user)
+               return false;
+
+       return (attr->exclude_host ^ attr->exclude_guest);
+}
+static inline void kvm_set_pmu_events(u32 set, int flags, struct perf_event_attr *attr)
 {
        struct kvm_host_data *ctx = this_cpu_ptr(&kvm_host_data);
 
+       if (!kvm_pmu_switch_events(attr))
+               return;
+
        if (flags & KVM_PMU_EVENTS_HOST)
                ctx->pmu_events.events_host |= set;
        if (flags & KVM_PMU_EVENTS_GUEST)
@@ -499,6 +512,7 @@ static inline void kvm_clr_pmu_events(u32 clr)
        ctx->pmu_events.events_guest &= ~clr;
 }
 #else
+static inline bool kvm_pmu_switch_events(struct perf_event_attr *attr) { return false; }
 static inline void kvm_set_pmu_events(u32 set, int flags) {}
 static inline void kvm_clr_pmu_events(u32 clr) {}
 #endif
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 21c6831..dae6691 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -542,10 +542,10 @@ static inline void armv8pmu_enable_event_counter(struct perf_event *event)
        if (!attr->exclude_guest)
                flags |= KVM_PMU_EVENTS_GUEST;
 
-       kvm_set_pmu_events(counter_bits, flags);
+       kvm_set_pmu_events(counter_bits, flags, attr);
 
        /* We rely on the hypervisor switch code to enable guest counters */
-       if (!attr->exclude_host) {
+       if (has_vhe() || !attr->exclude_host) {
                armv8pmu_enable_counter(idx);
                if (armv8pmu_event_is_chained(event))
                        armv8pmu_enable_counter(idx - 1);
@@ -572,7 +572,7 @@ static inline void armv8pmu_disable_event_counter(struct perf_event *event)
        kvm_clr_pmu_events(counter_bits);
 
        /* We rely on the hypervisor switch code to disable guest counters */
-       if (!attr->exclude_host) {
+       if (has_vhe() || !attr->exclude_host) {
                if (armv8pmu_event_is_chained(event))
                        armv8pmu_disable_counter(idx - 1);
                armv8pmu_disable_counter(idx);
@@ -859,6 +859,10 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
        if (is_kernel_in_hyp_mode()) {
                if (!attr->exclude_kernel && !attr->exclude_host)
                        config_base |= ARMV8_PMU_INCLUDE_EL2;
+               if (attr->exclude_guest)
+                       config_base |= ARMV8_PMU_EXCLUDE_EL1;
+               if (attr->exclude_host)
+                       config_base |= ARMV8_PMU_EXCLUDE_EL0;
        } else {
                if (!attr->exclude_hv)
                        config_base |= ARMV8_PMU_INCLUDE_EL2;
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 9018fb3..722cd7a 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -512,15 +512,12 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
 {
        struct kvm_cpu_context *host_ctxt;
        struct kvm_cpu_context *guest_ctxt;
-       bool pmu_switch_needed;
        u64 exit_code;
 
        host_ctxt = vcpu->arch.host_cpu_context;
        host_ctxt->__hyp_running_vcpu = vcpu;
        guest_ctxt = &vcpu->arch.ctxt;
 
-       pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
-
        sysreg_save_host_state_vhe(host_ctxt);
 
        /*
@@ -562,9 +559,6 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
 
        __debug_switch_to_host(vcpu);
 
-       if (pmu_switch_needed)
-               __pmu_switch_to_host(host_ctxt);
-
        return exit_code;
 }
 
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 89acb7f..34d94ba 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -360,6 +360,42 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
        return kvm_vgic_vcpu_init(vcpu);
 }
 
+static void kvm_vcpu_pmu_switch(struct kvm_vcpu *vcpu, bool guest)
+{
+       u32 typer, counter;
+       struct kvm_cpu_context *host_ctxt;
+       struct kvm_host_data *host;
+       unsigned long events_guest, events_host;
+
+       host_ctxt = vcpu->arch.host_cpu_context;
+       host = container_of(host_ctxt, struct kvm_host_data, host_ctxt);
+       events_guest = host->pmu_events.events_guest;
+       events_host = host->pmu_events.events_host;
+
+       if (!has_vhe())
+               return;
+
+       for_each_set_bit(counter, &events_guest, 32) {
+               write_sysreg(counter, pmselr_el0);
+               isb();
+               if (guest)
+                       typer = read_sysreg(pmxevtyper_el0) & ~BIT(30);
+               else
+                       typer = read_sysreg(pmxevtyper_el0) | BIT(30);
+               write_sysreg(typer, pmxevtyper_el0);
+       }
+
+       for_each_set_bit(counter, &events_host, 32) {
+               write_sysreg(counter, pmselr_el0);
+               isb();
+               if (guest)
+                       typer = read_sysreg(pmxevtyper_el0) | BIT(30);
+               else
+                       typer = read_sysreg(pmxevtyper_el0) & ~BIT(30);
+               write_sysreg(typer, pmxevtyper_el0);
+       }
+}
+
 void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
        int *last_ran;
@@ -385,6 +421,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
        kvm_timer_vcpu_load(vcpu);
        kvm_vcpu_load_sysregs(vcpu);
        kvm_arch_vcpu_load_fp(vcpu);
+       kvm_vcpu_pmu_switch(vcpu, true);
 
        if (single_task_running())
                vcpu_clear_wfe_traps(vcpu);
@@ -398,6 +435,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
        kvm_vcpu_put_sysregs(vcpu);
        kvm_timer_vcpu_put(vcpu);
        kvm_vgic_put(vcpu);
+       kvm_vcpu_pmu_switch(vcpu, false);
 
        vcpu->cpu = -1;
 

Do you think this is worth developing further?
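
For what it's worth, the per-counter typer rewrite in the rough
kvm_vcpu_pmu_switch() above reduces to a single toggle of PMEVTYPER bit 30
(assumed here to be the U/EL0-exclude bit); the selection logic can be
checked in isolation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PMEVTYPER_U	(1U << 30)	/* exclude-EL0 bit (assumption) */

/* Mirrors the two loops in the rough patch: guest-only counters have
 * EL0 counting enabled on vcpu_load, host-only counters have it
 * disabled, and vcpu_put reverses both. */
static uint32_t adjust_typer(uint32_t typer, bool guest_event,
			     bool entering_guest)
{
	bool count_el0 = (guest_event == entering_guest);

	return count_el0 ? (typer & ~PMEVTYPER_U) : (typer | PMEVTYPER_U);
}
```

This is only the bit selection; the pmselr_el0/isb sequencing in the patch
is untouched by the model.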

Thanks,

Andrew Murray

> 
> 
> Thanks,
> 
>     Christoffer


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v10 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2019-02-20 16:15     ` Andrew Murray
@ 2019-02-26 12:44       ` Christoffer Dall
  2019-03-04 11:14         ` Andrew Murray
  0 siblings, 1 reply; 15+ messages in thread
From: Christoffer Dall @ 2019-02-26 12:44 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Mark Rutland, Suzuki K Poulose, Marc Zyngier, Catalin Marinas,
	Julien Thierry, Will Deacon, kvmarm, linux-arm-kernel

On Wed, Feb 20, 2019 at 04:15:40PM +0000, Andrew Murray wrote:
> On Mon, Feb 18, 2019 at 10:53:07PM +0100, Christoffer Dall wrote:
> > On Mon, Jan 14, 2019 at 04:11:47PM +0000, Andrew Murray wrote:
> > > Add support for the :G and :H attributes in perf by handling the
> > > exclude_host/exclude_guest event attributes.
> > > 
> > > We notify KVM of counters that we wish to be enabled or disabled on
> > > guest entry/exit and thus defer starting or stopping :G events,
> > > as per the event's exclude_host attribute.
> > > 
> > > With both VHE and non-VHE we switch the counters between host/guest
> > > at EL2. We are able to eliminate counters counting host events on
> > > the boundaries of guest entry/exit when using :G by filtering out
> > > EL2 for exclude_host. However when using :H unless exclude_hv is set
> > > on non-VHE then there is a small blackout window at the guest
> > > entry/exit where host events are not captured.
> > > 
> > > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > > Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> > > ---
> > >  arch/arm64/kernel/perf_event.c | 53 ++++++++++++++++++++++++++++++++++++------
> > >  1 file changed, 46 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > > index 1c71796..21c6831 100644
> > > --- a/arch/arm64/kernel/perf_event.c
> > > +++ b/arch/arm64/kernel/perf_event.c
> > > @@ -26,6 +26,7 @@
> > >  
> > >  #include <linux/acpi.h>
> > >  #include <linux/clocksource.h>
> > > +#include <linux/kvm_host.h>
> > >  #include <linux/of.h>
> > >  #include <linux/perf/arm_pmu.h>
> > >  #include <linux/platform_device.h>
> > > @@ -528,11 +529,27 @@ static inline int armv8pmu_enable_counter(int idx)
> > >  
> > >  static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> > >  {
> > > +	struct perf_event_attr *attr = &event->attr;
> > >  	int idx = event->hw.idx;
> > > +	int flags = 0;
> > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > >  
> > > -	armv8pmu_enable_counter(idx);
> > >  	if (armv8pmu_event_is_chained(event))
> > > -		armv8pmu_enable_counter(idx - 1);
> > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > +
> > > +	if (!attr->exclude_host)
> > > +		flags |= KVM_PMU_EVENTS_HOST;
> > > +	if (!attr->exclude_guest)
> > > +		flags |= KVM_PMU_EVENTS_GUEST;
> > > +
> > > +	kvm_set_pmu_events(counter_bits, flags);
> > > +
> > > +	/* We rely on the hypervisor switch code to enable guest counters */
> > > +	if (!attr->exclude_host) {
> > > +		armv8pmu_enable_counter(idx);
> > > +		if (armv8pmu_event_is_chained(event))
> > > +			armv8pmu_enable_counter(idx - 1);
> > > +	}
> > >  }
> > >  
> > >  static inline int armv8pmu_disable_counter(int idx)
> > > @@ -545,11 +562,21 @@ static inline int armv8pmu_disable_counter(int idx)
> > >  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
> > >  {
> > >  	struct hw_perf_event *hwc = &event->hw;
> > > +	struct perf_event_attr *attr = &event->attr;
> > >  	int idx = hwc->idx;
> > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > >  
> > >  	if (armv8pmu_event_is_chained(event))
> > > -		armv8pmu_disable_counter(idx - 1);
> > > -	armv8pmu_disable_counter(idx);
> > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > +
> > > +	kvm_clr_pmu_events(counter_bits);
> > > +
> > > +	/* We rely on the hypervisor switch code to disable guest counters */
> > > +	if (!attr->exclude_host) {
> > > +		if (armv8pmu_event_is_chained(event))
> > > +			armv8pmu_disable_counter(idx - 1);
> > > +		armv8pmu_disable_counter(idx);
> > > +	}
> > >  }
> > >  
> > >  static inline int armv8pmu_enable_intens(int idx)
> > > @@ -824,16 +851,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
> > >  	 * Therefore we ignore exclude_hv in this configuration, since
> > >  	 * there's no hypervisor to sample anyway. This is consistent
> > >  	 * with other architectures (x86 and Power).
> > > +	 *
> > > +	 * To eliminate counting host events on the boundaries of
> > 					   ^comma
> > 
> > > +	 * guest entry/exit we ensure EL2 is not included in hyp mode
> > 			   ^comma (or rework sentence)
> > 
> > What do you mean by "EL2 is not included in hyp mode" ??
> 
> This attempts to explain the addition of '!attr->exclude_host' when
> is_kernel_in_hyp_mode is true below.

Ahh, so you meant to say "EL2 is not included when the kernel runs
entirely in hyp mode", however, ...

> 
> We have no need to count EL2 when counting any guest events - by adding
> this hunk we can eliminate counting host events in the small period of
> time in the switch code between enabling the counter and entering the
> guest.
> 
> Perhaps I should rephrase it to "To eliminate counting host events on
> the boundaries of guest entry/exit, we ensure EL2 is not included when
> counting guest events in hyp mode." ?
> 

... I don't think that really helps.  This comment is trying to
translate code flow into English, which is generally a bad idea.  As I
read the code, what this does is disabling counting at EL2 when
is_kernel_in_hyp_mode() is true, which means you won't count in general
in the host kernel, nor in the world-switch in KVM, which makes sense.

I recommend you drop the comment, and instead focus on explaining the
overall semantics of the flags somewhere (perhaps my write-up below can
be useful in that regard).

> > 
> > > +	 * with !exclude_host.
> > >  	 */
> > >  	if (is_kernel_in_hyp_mode()) {
> > > -		if (!attr->exclude_kernel)
> > > +		if (!attr->exclude_kernel && !attr->exclude_host)
> > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > >  	} else {
> > > -		if (attr->exclude_kernel)
> > > -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
> > >  		if (!attr->exclude_hv)
> > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > >  	}
> > > +
> > > +	/*
> > > +	 * Filter out !VHE kernels and guest kernels
> > > +	 */
> > > +	if (attr->exclude_kernel)
> > > +		config_base |= ARMV8_PMU_EXCLUDE_EL1;
> > > +
> > 
> > Let me see if I get this right:
> > 
> > 	exclude_user:	VHE: Don't count EL0
> > 			Non-VHE: Don't count EL0
> > 
> > 	exclude_kernel: VHE: Don't count EL2 and don't count EL1
> > 			Non-VHE: Don't count EL1
> > 
> > 	exclude_hv:	VHE: No effect
> > 			Non-VHE: Don't count EL2
> > 
> > 	exclude_host:	VHE: Don't count EL2 + enable/disable on guest entry/exit
> > 			Non-VHE: enable/disable on guest entry/exit
> > 
> > And the logic I extract is that _user applies across both guest and
> > host, as does _kernel (regardless of the mode the kernel on the current
> > system runs in, might be only EL1, might be EL1 and EL2), and _hv is
> > specific to non-VHE systems to measure events in a specific piece of KVM
> > code that runs at EL2.
> > 
> > As I expressed before, that doesn't seem to be the intent behind the
> > exclude_hv flag, but I'm not sure how other architectures actually
> > implement things today. If it's a curiosity of the Arm architecture
> > that has value to non-VHE hypervisor hackers, and we don't really have
> > to care about uniformity with the other architectures, then fine.
> > 
> > It has taken me a while to make sense of this code change, so I really
> > wish we could find a suitable place to document the semantics clearly
> > for perf users on arm64.
> > 
> > Now, another thing comes to mind:  Do we really need to enable and
> > disable anything on a VHE system on entry/exit to/from a guest?  Can we
> > instead do the following:
> > 
> > 	exclude_host:	Disable EL2 counting
> > 		     	Disable EL0 counting
> > 		     	Enable EL0 counting on vcpu_load
> > 		     	  (unless exclude_user is also set)
> > 		     	Disable EL0 counting on vcpu_put
> > 
> > 	exclude_guest:	Disable EL1 counting
> > 		      	Disable EL0 counting on vcpu_load
> > 		      	Enable EL0 counting on vcpu_put
> > 			  (unless exclude_user is also set)
> > 
> > If that works, we can avoid the overhead in the critical path on VHE
> > systems and actually have slightly more accurate counting, leaving the
> > entry/exit operations to be specific to non-VHE.
> 
> This all makes sense.
> 
> At present on VHE, for host only events, there is a small blackout window
> at the guest entry/exit - this is where we turn off/on host counting before
> entering/exiting the guest. (This blackout window also exists on !VHE unless
> exclude_hv is set).
> 
> To mitigate against this the PMU switching code was brought as close to the
> guest entry/exit as possible - but as you point out at this point in
> kvm_arch_vcpu_ioctl_run we're running without interrupts/preemption and so
> on the critical path. I believe we add about 11 instructions when there are
> no PMU counters enabled and about 23 when they are enabled. I suppose it
> would be possible to use static keys to reduce the overhead when counters
> are not enabled...
> 
> Your suggestion provides an optimal solution, however it adds some
> complexity - here is a rough attempt at implementing it...
> 
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index cbfe3d1..bc548e6 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -482,10 +482,23 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
>  {
>         return kvm_arch_vcpu_run_map_fp(vcpu);
>  }
> -static inline void kvm_set_pmu_events(u32 set, int flags)
> +static inline bool kvm_pmu_switch_events(struct perf_event_attr *attr)
> +{
> +       if (!has_vhe())
> +               return true;
> +
> +       if (attr->exclude_user)
> +               return false;
> +
> +       return (attr->exclude_host ^ attr->exclude_guest);
> +}
> +static inline void kvm_set_pmu_events(u32 set, int flags, struct perf_event_attr *attr)
>  {
>         struct kvm_host_data *ctx = this_cpu_ptr(&kvm_host_data);
>  
> +       if (!kvm_pmu_switch_events(attr))
> +               return;
> +
>         if (flags & KVM_PMU_EVENTS_HOST)
>                 ctx->pmu_events.events_host |= set;
>         if (flags & KVM_PMU_EVENTS_GUEST)
> @@ -499,6 +512,7 @@ static inline void kvm_clr_pmu_events(u32 clr)
>         ctx->pmu_events.events_guest &= ~clr;
>  }
>  #else
> > +static inline bool kvm_pmu_switch_events(struct perf_event_attr *attr) { return false; }
> > -static inline void kvm_set_pmu_events(u32 set, int flags) {}
> > +static inline void kvm_set_pmu_events(u32 set, int flags, struct perf_event_attr *attr) {}
>  static inline void kvm_clr_pmu_events(u32 clr) {}
>  #endif
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index 21c6831..dae6691 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -542,10 +542,10 @@ static inline void armv8pmu_enable_event_counter(struct perf_event *event)
>         if (!attr->exclude_guest)
>                 flags |= KVM_PMU_EVENTS_GUEST;
>  
> -       kvm_set_pmu_events(counter_bits, flags);
> +       kvm_set_pmu_events(counter_bits, flags, attr);
>  
>         /* We rely on the hypervisor switch code to enable guest counters */
> -       if (!attr->exclude_host) {
> +       if (has_vhe() || !attr->exclude_host) {
>                 armv8pmu_enable_counter(idx);
>                 if (armv8pmu_event_is_chained(event))
>                         armv8pmu_enable_counter(idx - 1);
> @@ -572,7 +572,7 @@ static inline void armv8pmu_disable_event_counter(struct perf_event *event)
>         kvm_clr_pmu_events(counter_bits);
>  
>         /* We rely on the hypervisor switch code to disable guest counters */
> -       if (!attr->exclude_host) {
> +       if (has_vhe() || !attr->exclude_host) {
>                 if (armv8pmu_event_is_chained(event))
>                         armv8pmu_disable_counter(idx - 1);
>                 armv8pmu_disable_counter(idx);
> @@ -859,6 +859,10 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
>         if (is_kernel_in_hyp_mode()) {
>                 if (!attr->exclude_kernel && !attr->exclude_host)
>                         config_base |= ARMV8_PMU_INCLUDE_EL2;
> +               if (attr->exclude_guest)
> +                       config_base |= ARMV8_PMU_EXCLUDE_EL1;
> +               if (attr->exclude_host)
> +                       config_base |= ARMV8_PMU_EXCLUDE_EL0;
>         } else {
>                 if (!attr->exclude_hv)
>                         config_base |= ARMV8_PMU_INCLUDE_EL2;
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index 9018fb3..722cd7a 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -512,15 +512,12 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
>  {
>         struct kvm_cpu_context *host_ctxt;
>         struct kvm_cpu_context *guest_ctxt;
> -       bool pmu_switch_needed;
>         u64 exit_code;
>  
>         host_ctxt = vcpu->arch.host_cpu_context;
>         host_ctxt->__hyp_running_vcpu = vcpu;
>         guest_ctxt = &vcpu->arch.ctxt;
>  
> -       pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
> -
>         sysreg_save_host_state_vhe(host_ctxt);
>  
>         /*
> @@ -562,9 +559,6 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
>  
>         __debug_switch_to_host(vcpu);
>  
> -       if (pmu_switch_needed)
> -               __pmu_switch_to_host(host_ctxt);
> -
>         return exit_code;
>  }
>  
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index 89acb7f..34d94ba 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -360,6 +360,42 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
>         return kvm_vgic_vcpu_init(vcpu);
>  }
>  
> +static void kvm_vcpu_pmu_switch(struct kvm_vcpu *vcpu, bool guest)
> +{
> +       u32 typer, counter;
> +       struct kvm_cpu_context *host_ctxt;
> +       struct kvm_host_data *host;
> +       unsigned long events_guest, events_host;
> +
> +       host_ctxt = vcpu->arch.host_cpu_context;
> +       host = container_of(host_ctxt, struct kvm_host_data, host_ctxt);
> +       events_guest = host->pmu_events.events_guest;
> +       events_host = host->pmu_events.events_host;
> +
> +       if (!has_vhe())
> +               return;
> +
> +       for_each_set_bit(counter, &events_guest, 32) {
> +               write_sysreg(counter, pmselr_el0);
> +               isb();
> +               if (guest)
> +                       typer = read_sysreg(pmxevtyper_el0) & ~BIT(30);
> +               else
> +                       typer = read_sysreg(pmxevtyper_el0) | BIT(30);
> +               write_sysreg(typer, pmxevtyper_el0);
> +       }
> +
> +       for_each_set_bit(counter, &events_host, 32) {
> +               write_sysreg(counter, pmselr_el0);
> +               isb();
> +               if (guest)
> +                       typer = read_sysreg(pmxevtyper_el0) | BIT(30);
> +               else
> +                       typer = read_sysreg(pmxevtyper_el0) & ~BIT(30);
> +               write_sysreg(typer, pmxevtyper_el0);
> +       }
> +}
> +
>  void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  {
>         int *last_ran;
> @@ -385,6 +421,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>         kvm_timer_vcpu_load(vcpu);
>         kvm_vcpu_load_sysregs(vcpu);
>         kvm_arch_vcpu_load_fp(vcpu);
> +       kvm_vcpu_pmu_switch(vcpu, true);
>  
>         if (single_task_running())
>                 vcpu_clear_wfe_traps(vcpu);
> @@ -398,6 +435,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>         kvm_vcpu_put_sysregs(vcpu);
>         kvm_timer_vcpu_put(vcpu);
>         kvm_vgic_put(vcpu);
> +       kvm_vcpu_pmu_switch(vcpu, false);
>  
>         vcpu->cpu = -1;
>  
> 
> Do you think this is worth developing further?
> 

Yes, I think it's worth going to fairly great lengths to keep the
critical path on VHE systems lean, and I suspect if we take the easy way
to add functionality first, it will only be harder to factor things out
later.

I'd like to see some (tested) version of this patch if possible.


Thanks,

    Christoffer

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v10 5/5] arm64: KVM: Enable support for :G/:H perf event modifiers
  2019-02-18 22:00   ` Christoffer Dall
@ 2019-03-04  9:40     ` Andrew Murray
  0 siblings, 0 replies; 15+ messages in thread
From: Andrew Murray @ 2019-03-04  9:40 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Mark Rutland, Suzuki K Poulose, Marc Zyngier, Catalin Marinas,
	Julien Thierry, Will Deacon, kvmarm, linux-arm-kernel

On Mon, Feb 18, 2019 at 11:00:19PM +0100, Christoffer Dall wrote:
> On Mon, Jan 14, 2019 at 04:11:48PM +0000, Andrew Murray wrote:
> > Enable/disable event counters as appropriate when entering and exiting
> > the guest to enable support for guest or host only event counting.
> > 
> > For both VHE and non-VHE we switch the counters between host/guest at
> > EL2. EL2 is filtered out by the PMU when we are using the :G modifier.
> 
> I don't think the last part is strictly true as per the former patch on
> a non-vhe system if you have the :h modifier, so maybe just leave that
> out of the commit message.

OK I'll remove that.

> 
> > 
> > The PMU may be on when we change which counters are enabled however
> > we avoid adding an isb as we instead rely on existing context
> > synchronisation events: the isb in kvm_arm_vhe_guest_exit for VHE and
> > the eret from the hvc in kvm_call_hyp.
> > 
> > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> > ---
> >  arch/arm64/kvm/hyp/switch.c | 60 +++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 60 insertions(+)
> > 
> > diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> > index b0b1478..9018fb3 100644
> > --- a/arch/arm64/kvm/hyp/switch.c
> > +++ b/arch/arm64/kvm/hyp/switch.c
> > @@ -357,6 +357,54 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
> >  	return true;
> >  }
> >  
> > +static bool __hyp_text __pmu_switch_to_guest(struct kvm_cpu_context *host_ctxt)
> > +{
> > +	struct kvm_host_data *host;
> > +	struct kvm_pmu_events *pmu;
> > +	u32 clr, set;
> > +
> > +	host = container_of(host_ctxt, struct kvm_host_data, host_ctxt);
> > +	pmu = &host->pmu_events;
> > +
> > +	/* We can potentially avoid a sysreg write by only changing bits that
> > +	 * differ between the guest/host. E.g. where events are enabled in
> > +	 * both guest and host
> > +	 */
> 
> super nit: kernel coding style requires 'wings' on both side of a
> multi-line comment.  Only if you respin anyhow.

Ah I didn't notice that, I'll fix this up.
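
As an aside for readers following the review: the bit-difference idea in the __pmu_switch_to_guest()/__pmu_switch_to_host() hunks quoted below can be sketched as a standalone, illustrative C helper (names here are mine, not the kernel's):

```c
#include <stdint.h>

/*
 * Illustrative sketch (not kernel code) of the minimal-write idea:
 * only counters whose enable bit differs between the host and guest
 * event masks need a pmcntenclr_el0/pmcntenset_el0 write at the
 * world switch.
 */
struct pmu_events_model {
	uint32_t events_host;
	uint32_t events_guest;
};

/* Returns nonzero when a sysreg write (and a later undo) is needed. */
static int pmu_switch_bits(const struct pmu_events_model *pmu, int to_guest,
			   uint32_t *clr, uint32_t *set)
{
	uint32_t from = to_guest ? pmu->events_host : pmu->events_guest;
	uint32_t to = to_guest ? pmu->events_guest : pmu->events_host;

	*clr = from & ~to;	/* enabled before the switch, not after */
	*set = to & ~from;	/* enabled after the switch, not before */

	return *clr || *set;
}
```

When the host and guest masks are identical no write is issued at all, which is what lets the switch-to-guest path report that no restore is needed on exit.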

Thanks for the review.

Andrew Murray

> 
> > +	clr = pmu->events_host & ~pmu->events_guest;
> > +	set = pmu->events_guest & ~pmu->events_host;
> > +
> > +	if (clr)
> > +		write_sysreg(clr, pmcntenclr_el0);
> > +
> > +	if (set)
> > +		write_sysreg(set, pmcntenset_el0);
> > +
> > +	return (clr || set);
> > +}
> > +
> > +static void __hyp_text __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt)
> > +{
> > +	struct kvm_host_data *host;
> > +	struct kvm_pmu_events *pmu;
> > +	u32 clr, set;
> > +
> > +	host = container_of(host_ctxt, struct kvm_host_data, host_ctxt);
> > +	pmu = &host->pmu_events;
> > +
> > +	/* We can potentially avoid a sysreg write by only changing bits that
> > +	 * differ between the guest/host. E.g. where events are enabled in
> > +	 * both guest and host
> > +	 */
> 
> ditto
> 
> > +	clr = pmu->events_guest & ~pmu->events_host;
> > +	set = pmu->events_host & ~pmu->events_guest;
> > +
> > +	if (clr)
> > +		write_sysreg(clr, pmcntenclr_el0);
> > +
> > +	if (set)
> > +		write_sysreg(set, pmcntenset_el0);
> > +}
> > +
> >  /*
> >   * Return true when we were able to fixup the guest exit and should return to
> >   * the guest, false when we should restore the host state and return to the
> > @@ -464,12 +512,15 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
> >  {
> >  	struct kvm_cpu_context *host_ctxt;
> >  	struct kvm_cpu_context *guest_ctxt;
> > +	bool pmu_switch_needed;
> >  	u64 exit_code;
> >  
> >  	host_ctxt = vcpu->arch.host_cpu_context;
> >  	host_ctxt->__hyp_running_vcpu = vcpu;
> >  	guest_ctxt = &vcpu->arch.ctxt;
> >  
> > +	pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
> > +
> >  	sysreg_save_host_state_vhe(host_ctxt);
> >  
> >  	/*
> > @@ -511,6 +562,9 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
> >  
> >  	__debug_switch_to_host(vcpu);
> >  
> > +	if (pmu_switch_needed)
> > +		__pmu_switch_to_host(host_ctxt);
> > +
> >  	return exit_code;
> >  }
> >  
> > @@ -519,6 +573,7 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
> >  {
> >  	struct kvm_cpu_context *host_ctxt;
> >  	struct kvm_cpu_context *guest_ctxt;
> > +	bool pmu_switch_needed;
> >  	u64 exit_code;
> >  
> >  	vcpu = kern_hyp_va(vcpu);
> > @@ -527,6 +582,8 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
> >  	host_ctxt->__hyp_running_vcpu = vcpu;
> >  	guest_ctxt = &vcpu->arch.ctxt;
> >  
> > +	pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
> > +
> >  	__sysreg_save_state_nvhe(host_ctxt);
> >  
> >  	__activate_vm(kern_hyp_va(vcpu->kvm));
> > @@ -573,6 +630,9 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
> >  	 */
> >  	__debug_switch_to_host(vcpu);
> >  
> > +	if (pmu_switch_needed)
> > +		__pmu_switch_to_host(host_ctxt);
> > +
> >  	return exit_code;
> >  }
> >  
> > -- 
> > 2.7.4
> > 
> 
> Thanks,
> 
>     Christoffer


* Re: [PATCH v10 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2019-02-26 12:44       ` Christoffer Dall
@ 2019-03-04 11:14         ` Andrew Murray
  2019-03-05 11:45           ` Andrew Murray
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Murray @ 2019-03-04 11:14 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Mark Rutland, Suzuki K Poulose, Marc Zyngier, Catalin Marinas,
	Julien Thierry, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Feb 26, 2019 at 01:44:59PM +0100, Christoffer Dall wrote:
> On Wed, Feb 20, 2019 at 04:15:40PM +0000, Andrew Murray wrote:
> > On Mon, Feb 18, 2019 at 10:53:07PM +0100, Christoffer Dall wrote:
> > > On Mon, Jan 14, 2019 at 04:11:47PM +0000, Andrew Murray wrote:
> > > > Add support for the :G and :H attributes in perf by handling the
> > > > exclude_host/exclude_guest event attributes.
> > > > 
> > > > We notify KVM of counters that we wish to be enabled or disabled on
> > > > guest entry/exit and thus defer from starting or stopping :G events
> > > > as per the events exclude_host attribute.
> > > > 
> > > > With both VHE and non-VHE we switch the counters between host/guest
> > > > at EL2. We are able to eliminate counters counting host events on
> > > > the boundaries of guest entry/exit when using :G by filtering out
> > > > EL2 for exclude_host. However when using :H unless exclude_hv is set
> > > > on non-VHE then there is a small blackout window at the guest
> > > > entry/exit where host events are not captured.
> > > > 
> > > > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > > > Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> > > > ---
> > > >  arch/arm64/kernel/perf_event.c | 53 ++++++++++++++++++++++++++++++++++++------
> > > >  1 file changed, 46 insertions(+), 7 deletions(-)
> > > > 
> > > > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > > > index 1c71796..21c6831 100644
> > > > --- a/arch/arm64/kernel/perf_event.c
> > > > +++ b/arch/arm64/kernel/perf_event.c
> > > > @@ -26,6 +26,7 @@
> > > >  
> > > >  #include <linux/acpi.h>
> > > >  #include <linux/clocksource.h>
> > > > +#include <linux/kvm_host.h>
> > > >  #include <linux/of.h>
> > > >  #include <linux/perf/arm_pmu.h>
> > > >  #include <linux/platform_device.h>
> > > > @@ -528,11 +529,27 @@ static inline int armv8pmu_enable_counter(int idx)
> > > >  
> > > >  static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> > > >  {
> > > > +	struct perf_event_attr *attr = &event->attr;
> > > >  	int idx = event->hw.idx;
> > > > +	int flags = 0;
> > > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > > >  
> > > > -	armv8pmu_enable_counter(idx);
> > > >  	if (armv8pmu_event_is_chained(event))
> > > > -		armv8pmu_enable_counter(idx - 1);
> > > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > > +
> > > > +	if (!attr->exclude_host)
> > > > +		flags |= KVM_PMU_EVENTS_HOST;
> > > > +	if (!attr->exclude_guest)
> > > > +		flags |= KVM_PMU_EVENTS_GUEST;
> > > > +
> > > > +	kvm_set_pmu_events(counter_bits, flags);
> > > > +
> > > > +	/* We rely on the hypervisor switch code to enable guest counters */
> > > > +	if (!attr->exclude_host) {
> > > > +		armv8pmu_enable_counter(idx);
> > > > +		if (armv8pmu_event_is_chained(event))
> > > > +			armv8pmu_enable_counter(idx - 1);
> > > > +	}
> > > >  }
> > > >  
> > > >  static inline int armv8pmu_disable_counter(int idx)
> > > > @@ -545,11 +562,21 @@ static inline int armv8pmu_disable_counter(int idx)
> > > >  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
> > > >  {
> > > >  	struct hw_perf_event *hwc = &event->hw;
> > > > +	struct perf_event_attr *attr = &event->attr;
> > > >  	int idx = hwc->idx;
> > > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > > >  
> > > >  	if (armv8pmu_event_is_chained(event))
> > > > -		armv8pmu_disable_counter(idx - 1);
> > > > -	armv8pmu_disable_counter(idx);
> > > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > > +
> > > > +	kvm_clr_pmu_events(counter_bits);
> > > > +
> > > > +	/* We rely on the hypervisor switch code to disable guest counters */
> > > > +	if (!attr->exclude_host) {
> > > > +		if (armv8pmu_event_is_chained(event))
> > > > +			armv8pmu_disable_counter(idx - 1);
> > > > +		armv8pmu_disable_counter(idx);
> > > > +	}
> > > >  }
> > > >  
> > > >  static inline int armv8pmu_enable_intens(int idx)
> > > > @@ -824,16 +851,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
> > > >  	 * Therefore we ignore exclude_hv in this configuration, since
> > > >  	 * there's no hypervisor to sample anyway. This is consistent
> > > >  	 * with other architectures (x86 and Power).
> > > > +	 *
> > > > +	 * To eliminate counting host events on the boundaries of
> > > 					   ^comma
> > > 
> > > > +	 * guest entry/exit we ensure EL2 is not included in hyp mode
> > > 			   ^comma (or rework sentence)
> > > 
> > > What do you mean by "EL2 is not included in hyp mode" ??
> > 
> > This attempts to explain the addition of '!attr->exclude_host' when
> > is_kernel_in_hyp_mode is true below.
> 
> Ahh, so you meant to say "EL2 is not included when the kernel runs
> entirely in hyp mode", however, ...
> 
> > 
> > We have no need to count EL2 when counting any guest events - by adding
> > this hunk we can eliminate counting host events in the small period of
> > time in the switch code between enabling the counter and entering the
> > guest.
> > 
> > Perhaps I should rephrase it to "To eliminate counting host events on
> > the boundaries of guest entry/exit, we ensure EL2 is not included when
> > counting guest events in hyp mode." ?
> > 
> 
> ... I don't think that really helps.  This comment is trying to
> translate code flow into English, which is generally a bad idea.  As I
> read the code, what this does is disabling counting at EL2 when
> is_kernel_in_hyp_mode() is true, which means you won't count in general
> in the host kernel, nor in the world-switch in KVM, which makes sense.

That's no problem, I'll drop this comment.

> 
> I recommend you drop the comment, and instead focus on explaining the
> overall semantics of the flags somewhere (perhaps my write-up below can
> be useful in that regard).

I'll find somewhere appropriate to add a description of what we do here.
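
As a hedged sketch of what such a description would boil down to (modelled on the rough patch later in this thread; the bit values mirror Linux's ARMV8_PMU_* definitions, but the function name and flag plumbing here are illustrative only):

```c
#include <stdint.h>

/* Illustrative recreations of Linux's ARMV8_PMU_* filter bits. */
#define ARMV8_PMU_EXCLUDE_EL1	(1U << 31)
#define ARMV8_PMU_EXCLUDE_EL0	(1U << 30)
#define ARMV8_PMU_INCLUDE_EL2	(1U << 27)

/*
 * Sketch of the event-filter mapping under discussion: on VHE the
 * host kernel runs at EL2, so exclude_kernel/exclude_host govern EL2
 * counting and exclude_guest masks guest EL1; on !VHE, EL2 only runs
 * the hypervisor switch code and is governed by exclude_hv.
 */
static uint32_t model_event_filter(int vhe, int excl_user, int excl_kernel,
				   int excl_hv, int excl_host, int excl_guest)
{
	uint32_t config_base = 0;

	if (vhe) {
		if (!excl_kernel && !excl_host)
			config_base |= ARMV8_PMU_INCLUDE_EL2;
		if (excl_guest)
			config_base |= ARMV8_PMU_EXCLUDE_EL1;
		if (excl_host)	/* EL0 unmasked again at vcpu_load */
			config_base |= ARMV8_PMU_EXCLUDE_EL0;
	} else {
		if (!excl_hv)
			config_base |= ARMV8_PMU_INCLUDE_EL2;
	}

	/* Filter out !VHE kernels and guest kernels */
	if (excl_kernel)
		config_base |= ARMV8_PMU_EXCLUDE_EL1;
	if (excl_user)
		config_base |= ARMV8_PMU_EXCLUDE_EL0;

	return config_base;
}
```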

> 
> > > 
> > > > +	 * with !exclude_host.
> > > >  	 */
> > > >  	if (is_kernel_in_hyp_mode()) {
> > > > -		if (!attr->exclude_kernel)
> > > > +		if (!attr->exclude_kernel && !attr->exclude_host)
> > > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > > >  	} else {
> > > > -		if (attr->exclude_kernel)
> > > > -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
> > > >  		if (!attr->exclude_hv)
> > > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > > >  	}
> > > > +
> > > > +	/*
> > > > +	 * Filter out !VHE kernels and guest kernels
> > > > +	 */
> > > > +	if (attr->exclude_kernel)
> > > > +		config_base |= ARMV8_PMU_EXCLUDE_EL1;
> > > > +
> > > 
> > > Let me see if I get this right:
> > > 
> > > 	exclude_user:	VHE: Don't count EL0
> > > 			Non-VHE: Don't count EL0
> > > 
> > > 	exclude_kernel: VHE: Don't count EL2 and don't count EL1
> > > 			Non-VHE: Don't count EL1
> > > 
> > > 	exclude_hv:	VHE: No effect
> > > 			Non-VHE: Don't count EL2
> > > 
> > > 	exclude_host:	VHE: Don't count EL2 + enable/disable on guest entry/exit
> > > 			Non-VHE: enable/disable on guest entry/exit
> > > 
> > > And the logic I extract is that _user applies across both guest and
> > > host, as does _kernel (regardless of the mode the kernel on the current
> > > system runs in, might be only EL1, might be EL1 and EL2), and _hv is
> > > specific to non-VHE systems to measure events in a specific piece of KVM
> > > code that runs at EL2.
> > > 
> > > As I expressed before, that doesn't seem to be the intent behind the
> > > exclude_hv flag, but I'm not sure how other architectures actually
> > > implement things today, and even if it's a curiosity of the Arm
> > > architecture and has value to non-VHE hypervisor hackers, and we don't
> > > really have to care about uniformity with the other architectures, then
> > > fine.
> > > 
> > > It has taken me a while to make sense of this code change, so I really
> > > wish we can find a suitable place to document the semantics clearly for
> > > perf users on arm64.
> > > 
> > > Now, another thing comes to mind:  Do we really need to enable and
> > > disable anything on a VHE system on entry/exit to/from a guest?  Can we
> > > instead do the following:
> > > 
> > > 	exclude_host:	Disable EL2 counting
> > > 		     	Disable EL0 counting
> > > 		     	Enable EL0 counting on vcpu_load
> > > 		     	  (unless exclude_user is also set)
> > > 		     	Disable EL0 counting on vcpu_put
> > > 
> > > 	exclude_guest:	Disable EL1 counting
> > > 		      	Disable EL0 counting on vcpu_load
> > > 		      	Enable EL0 counting on vcpu_put
> > > 			  (unless exclude_user is also set)
> > > 
> > > If that works, we can avoid the overhead in the critical path on VHE
> > > systems and actually have slightly more accurate counting, leaving the
> > > entry/exit operations to be specific to non-VHE.
> > 
> > This all makes sense.
> > 
> > At present on VHE, for host only events, there is a small blackout window
> > at the guest entry/exit - this is where we turn off/on host counting before
> > entering/exiting the guest. (This blackout window also exists on !VHE unless
> > exclude_hv is set).
> > 
> > To mitigate against this the PMU switching code was brought as close to the
> > guest entry/exit as possible - but as you point out at this point in
> > kvm_arch_vcpu_ioctl_run we're running without interrupts/preemption and so
> > on the critical path. I believe we add about 11 instructions when there are
> > no PMU counters enabled and about 23 when they are enabled. I suppose it
> > would be possible to use static keys to reduce the overhead when counters
> > are not enabled...
> > 
> > Your suggestion provides an optimal solution, however it adds some
> > complexity - here is a rough attempt at implementing it...
> > 
> > 
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index cbfe3d1..bc548e6 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -482,10 +482,23 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
> >  {
> >         return kvm_arch_vcpu_run_map_fp(vcpu);
> >  }
> > -static inline void kvm_set_pmu_events(u32 set, int flags)
> > +static inline bool kvm_pmu_switch_events(struct perf_event_attr *attr)
> > +{
> > +       if (!has_vhe())
> > +               return true;
> > +
> > +       if (attr->exclude_user)
> > +               return false;
> > +
> > +       return (attr->exclude_host ^ attr->exclude_guest);
> > +}
> > +static inline void kvm_set_pmu_events(u32 set, int flags, struct perf_event_attr *attr)
> >  {
> >         struct kvm_host_data *ctx = this_cpu_ptr(&kvm_host_data);
> >  
> > +       if (!kvm_pmu_switch_events(attr))
> > +               return;
> > +
> >         if (flags & KVM_PMU_EVENTS_HOST)
> >                 ctx->pmu_events.events_host |= set;
> >         if (flags & KVM_PMU_EVENTS_GUEST)
> > @@ -499,6 +512,7 @@ static inline void kvm_clr_pmu_events(u32 clr)
> >         ctx->pmu_events.events_guest &= ~clr;
> >  }
> >  #else
> > +static inline bool kvm_pmu_switch_events(struct perf_event_attr *attr) { return false; }
> > -static inline void kvm_set_pmu_events(u32 set, int flags) {}
> > +static inline void kvm_set_pmu_events(u32 set, int flags, struct perf_event_attr *attr) {}
> >  static inline void kvm_clr_pmu_events(u32 clr) {}
> >  #endif
> > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > index 21c6831..dae6691 100644
> > --- a/arch/arm64/kernel/perf_event.c
> > +++ b/arch/arm64/kernel/perf_event.c
> > @@ -542,10 +542,10 @@ static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> >         if (!attr->exclude_guest)
> >                 flags |= KVM_PMU_EVENTS_GUEST;
> >  
> > -       kvm_set_pmu_events(counter_bits, flags);
> > +       kvm_set_pmu_events(counter_bits, flags, attr);
> >  
> >         /* We rely on the hypervisor switch code to enable guest counters */
> > -       if (!attr->exclude_host) {
> > +       if (has_vhe() || !attr->exclude_host) {
> >                 armv8pmu_enable_counter(idx);
> >                 if (armv8pmu_event_is_chained(event))
> >                         armv8pmu_enable_counter(idx - 1);
> > @@ -572,7 +572,7 @@ static inline void armv8pmu_disable_event_counter(struct perf_event *event)
> >         kvm_clr_pmu_events(counter_bits);
> >  
> >         /* We rely on the hypervisor switch code to disable guest counters */
> > -       if (!attr->exclude_host) {
> > +       if (has_vhe() || !attr->exclude_host) {
> >                 if (armv8pmu_event_is_chained(event))
> >                         armv8pmu_disable_counter(idx - 1);
> >                 armv8pmu_disable_counter(idx);
> > @@ -859,6 +859,10 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
> >         if (is_kernel_in_hyp_mode()) {
> >                 if (!attr->exclude_kernel && !attr->exclude_host)
> >                         config_base |= ARMV8_PMU_INCLUDE_EL2;
> > +               if (attr->exclude_guest)
> > +                       config_base |= ARMV8_PMU_EXCLUDE_EL1;
> > +               if (attr->exclude_host)
> > +                       config_base |= ARMV8_PMU_EXCLUDE_EL0;
> >         } else {
> >                 if (!attr->exclude_hv)
> >                         config_base |= ARMV8_PMU_INCLUDE_EL2;
> > diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> > index 9018fb3..722cd7a 100644
> > --- a/arch/arm64/kvm/hyp/switch.c
> > +++ b/arch/arm64/kvm/hyp/switch.c
> > @@ -512,15 +512,12 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
> >  {
> >         struct kvm_cpu_context *host_ctxt;
> >         struct kvm_cpu_context *guest_ctxt;
> > -       bool pmu_switch_needed;
> >         u64 exit_code;
> >  
> >         host_ctxt = vcpu->arch.host_cpu_context;
> >         host_ctxt->__hyp_running_vcpu = vcpu;
> >         guest_ctxt = &vcpu->arch.ctxt;
> >  
> > -       pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
> > -
> >         sysreg_save_host_state_vhe(host_ctxt);
> >  
> >         /*
> > @@ -562,9 +559,6 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
> >  
> >         __debug_switch_to_host(vcpu);
> >  
> > -       if (pmu_switch_needed)
> > -               __pmu_switch_to_host(host_ctxt);
> > -
> >         return exit_code;
> >  }
> >  
> > diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> > index 89acb7f..34d94ba 100644
> > --- a/virt/kvm/arm/arm.c
> > +++ b/virt/kvm/arm/arm.c
> > @@ -360,6 +360,42 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
> >         return kvm_vgic_vcpu_init(vcpu);
> >  }
> >  
> > +static void kvm_vcpu_pmu_switch(struct kvm_vcpu *vcpu, bool guest)
> > +{
> > +       u32 typer, counter;
> > +       struct kvm_cpu_context *host_ctxt;
> > +       struct kvm_host_data *host;
> > +       unsigned long events_guest, events_host;
> > +
> > +       host_ctxt = vcpu->arch.host_cpu_context;
> > +       host = container_of(host_ctxt, struct kvm_host_data, host_ctxt);
> > +       events_guest = host->pmu_events.events_guest;
> > +       events_host = host->pmu_events.events_host;
> > +
> > +       if (!has_vhe())
> > +               return;
> > +
> > +       for_each_set_bit(counter, &events_guest, 32) {
> > +               write_sysreg(counter, pmselr_el0);
> > +               isb();
> > +               if (guest)
> > +                       typer = read_sysreg(pmxevtyper_el0) & ~BIT(30);
> > +               else
> > +                       typer = read_sysreg(pmxevtyper_el0) | BIT(30);
> > +               write_sysreg(typer, pmxevtyper_el0);
> > +       }
> > +
> > +       for_each_set_bit(counter, &events_host, 32) {
> > +               write_sysreg(counter, pmselr_el0);
> > +               isb();
> > +               if (guest)
> > +                       typer = read_sysreg(pmxevtyper_el0) | BIT(30);
> > +               else
> > +                       typer = read_sysreg(pmxevtyper_el0) & ~BIT(30);
> > +               write_sysreg(typer, pmxevtyper_el0);
> > +       }
> > +}
> > +
> >  void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> >  {
> >         int *last_ran;
> > @@ -385,6 +421,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> >         kvm_timer_vcpu_load(vcpu);
> >         kvm_vcpu_load_sysregs(vcpu);
> >         kvm_arch_vcpu_load_fp(vcpu);
> > +       kvm_vcpu_pmu_switch(vcpu, true);
> >  
> >         if (single_task_running())
> >                 vcpu_clear_wfe_traps(vcpu);
> > @@ -398,6 +435,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> >         kvm_vcpu_put_sysregs(vcpu);
> >         kvm_timer_vcpu_put(vcpu);
> >         kvm_vgic_put(vcpu);
> > +       kvm_vcpu_pmu_switch(vcpu, false);
> >  
> >         vcpu->cpu = -1;
> >  
> > 
> > Do you think this is worth developing further?
> > 
> 
> Yes, I think it's worth going to fairly great lengths to keep the
> critical path on VHE systems lean, and I suspect if we take the easy way
> to add functionality first, it will only be harder to factor things out
> later.
> 
> I'd like to see some (tested) version of this patch if possible.

OK I'll do this.
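
As a rough model of what such a tested version would toggle at vcpu_load/vcpu_put (assuming, as in the draft above, that BIT(30) of PMEVTYPERn_EL0 is the EL0-exclude filter bit; the function below is an illustration, not the patch itself):

```c
#include <stdint.h>

/* The BIT(30) used in the rough kvm_vcpu_pmu_switch() sketch above. */
#define PMEVTYPER_EXCLUDE_EL0	(1U << 30)

/*
 * Model of the proposed VHE behaviour: on vcpu_load (to_guest = 1),
 * unmask EL0 counting for guest-only events and mask it for host-only
 * events; vcpu_put (to_guest = 0) reverses this.  `typer` stands in
 * for a PMEVTYPERn_EL0 value.
 */
static uint32_t pmu_toggle_el0(uint32_t typer, int guest_event, int to_guest)
{
	if (guest_event == to_guest)
		return typer & ~PMEVTYPER_EXCLUDE_EL0;	/* count EL0 */

	return typer | PMEVTYPER_EXCLUDE_EL0;		/* mask EL0 */
}
```

The attraction is that only the per-event filter bit changes, outside the world-switch critical path, leaving guest entry/exit itself free of PMU work on VHE.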

Thanks,

Andrew Murray

> 
> 
> Thanks,
> 
>     Christoffer


* Re: [PATCH v10 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2019-03-04 11:14         ` Andrew Murray
@ 2019-03-05 11:45           ` Andrew Murray
  2019-03-06  8:42             ` Christoffer Dall
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Murray @ 2019-03-05 11:45 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Mon, Mar 04, 2019 at 11:14:24AM +0000, Andrew Murray wrote:
> On Tue, Feb 26, 2019 at 01:44:59PM +0100, Christoffer Dall wrote:
> > On Wed, Feb 20, 2019 at 04:15:40PM +0000, Andrew Murray wrote:
> > > On Mon, Feb 18, 2019 at 10:53:07PM +0100, Christoffer Dall wrote:
> > > > On Mon, Jan 14, 2019 at 04:11:47PM +0000, Andrew Murray wrote:
> > > > > Add support for the :G and :H attributes in perf by handling the
> > > > > exclude_host/exclude_guest event attributes.
> > > > > 
> > > > > We notify KVM of counters that we wish to be enabled or disabled on
> > > > > guest entry/exit and thus defer from starting or stopping :G events
> > > > > as per the event's exclude_host attribute.
> > > > > 
> > > > > With both VHE and non-VHE we switch the counters between host/guest
> > > > > at EL2. We are able to eliminate counters counting host events on
> > > > > the boundaries of guest entry/exit when using :G by filtering out
> > > > > EL2 for exclude_host. However, when using :H, unless exclude_hv is
> > > > > set on non-VHE there is a small blackout window at the guest
> > > > > entry/exit where host events are not captured.
> > > > > 
> > > > > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > > > > Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> > > > > ---

> > > > 
> > > > Let me see if I get this right:
> > > > 
> > > > 	exclude_user:	VHE: Don't count EL0
> > > > 			Non-VHE: Don't count EL0
> > > > 
> > > > 	exclude_kernel: VHE: Don't count EL2 and don't count EL1
> > > > 			Non-VHE: Don't count EL1
> > > > 
> > > > 	exclude_hv:	VHE: No effect
> > > > 			Non-VHE: Don't count EL2
> > > > 
> > > > 	exclude_host:	VHE: Don't count EL2 + enable/disable on guest entry/exit
> > > > 			Non-VHE: enable/disable on guest entry/exit
> > > > 
> > > > And the logic I extract is that _user applies across both guest and
> > > > host, as does _kernel (regardless of the mode the kernel on the current
> > > > system runs in, might be only EL1, might be EL1 and EL2), and _hv is
> > > > specific to non-VHE systems to measure events in a specific piece of KVM
> > > > code that runs at EL2.
> > > > 
> > > > As I expressed before, that doesn't seem to be the intent behind the
> > > > exclude_hv flag, but I'm not sure how other architectures actually
> > > > implement things today, and even if it's a curiosity of the Arm
> > > > architecture and has value to non-VHE hypervisor hackers, and we don't
> > > > really have to care about uniformity with the other architectures, then
> > > > fine.
> > > > 
> > > > It has taken me a while to make sense of this code change, so I really
> > > > wish we can find a suitable place to document the semantics clearly for
> > > > perf users on arm64.
> > > > 
> > > > Now, another thing comes to mind:  Do we really need to enable and
> > > > disable anything on a VHE system on entry/exit to/from a guest?  Can we
> > > > instead do the following:
> > > > 
> > > > 	exclude_host:	Disable EL2 counting
> > > > 		     	Disable EL0 counting
> > > > 		     	Enable EL0 counting on vcpu_load
> > > > 		     	  (unless exclude_user is also set)
> > > > 		     	Disable EL0 counting on vcpu_put
> > > > 
> > > > 	exclude_guest:	Disable EL1 counting
> > > > 		      	Disable EL0 counting on vcpu_load
> > > > 		      	Enable EL0 counting on vcpu_put
> > > > 			  (unless exclude_user is also set)
> > > > 
> > > > If that works, we can avoid the overhead in the critical path on VHE
> > > > systems and actually have slightly more accurate counting, leaving the
> > > > entry/exit operations to be specific to non-VHE.
> > > 
> > > This all makes sense.
> > > 
> > > At present on VHE, for host only events, there is a small blackout window
> > > at the guest entry/exit - this is where we turn off/on host counting before
> > > entering/exiting the guest. (This blackout window also exists on !VHE unless
> > > exclude_hv is set).
> > > 
> > > To mitigate this, the PMU switching code was brought as close to the
> > > guest entry/exit as possible - but as you point out at this point in
> > > kvm_arch_vcpu_ioctl_run we're running without interrupts/preemption and so
> > > on the critical path. I believe we add about 11 instructions when there are
> > > no PMU counters enabled and about 23 when they are enabled. I suppose it
> > > would be possible to use static keys to reduce the overhead when counters
> > > are not enabled...
> > > 
> > > Your suggestion provides an optimal solution, however it adds some
> > > complexity - here is a rough attempt at implementing it...
> > > 

> > > 
> > > Do you think this is worth developing further?
> > > 
> > 
> > Yes, I think it's worth going to fairly great lengths to keep the
> > critical path on VHE systems lean, and I suspect if we take the easy way
> > to add functionality first, it will only be harder to factor things out
> > later.

It looks like we can also apply this same approach on !VHE. For the sole
use-case where a user wishes to count only host or guest EL0, we can
switch the counter at EL1 in vcpu_load instead of in the switch code.

Thanks,

Andrew Murray

> > 
> > I'd like to see some (tested) version of this patch if possible.
> 
> OK I'll do this.
> 
> Thanks,
> 
> Andrew Murray
> 
> > 
> > 
> > Thanks,
> > 
> >     Christoffer
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
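
To make the quoted vcpu_load/vcpu_put proposal concrete, the per-event
rules can be modelled as a small state machine. This is only a sketch of
the rules exactly as written above; every name here is invented for
illustration and none of it is kernel API:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one host perf event under the proposed VHE scheme. */
struct model_event {
	bool exclude_host;   /* :G, count guest only */
	bool exclude_guest;  /* :H, count host only */
	bool exclude_user;
	bool el0_counting;   /* modelled hardware EL0 filter state */
};

/* vcpu_load: from this point EL0 means guest userspace. */
static void model_vcpu_load(struct model_event *e)
{
	if (e->exclude_host)
		e->el0_counting = !e->exclude_user; /* start counting guest EL0 */
	if (e->exclude_guest)
		e->el0_counting = false;            /* stop: EL0 is now guest */
}

/* vcpu_put: EL0 reverts to host userspace. */
static void model_vcpu_put(struct model_event *e)
{
	if (e->exclude_host)
		e->el0_counting = false;
	if (e->exclude_guest)
		e->el0_counting = !e->exclude_user; /* resume host EL0 counting */
}
```

The point of the proposal is visible in the model: all toggling happens in
load/put rather than on every guest entry/exit.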


* Re: [PATCH v10 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2019-03-05 11:45           ` Andrew Murray
@ 2019-03-06  8:42             ` Christoffer Dall
  0 siblings, 0 replies; 15+ messages in thread
From: Christoffer Dall @ 2019-03-06  8:42 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Mar 05, 2019 at 11:45:52AM +0000, Andrew Murray wrote:
> On Mon, Mar 04, 2019 at 11:14:24AM +0000, Andrew Murray wrote:
> > On Tue, Feb 26, 2019 at 01:44:59PM +0100, Christoffer Dall wrote:
> > > On Wed, Feb 20, 2019 at 04:15:40PM +0000, Andrew Murray wrote:
> > > > On Mon, Feb 18, 2019 at 10:53:07PM +0100, Christoffer Dall wrote:
> > > > > On Mon, Jan 14, 2019 at 04:11:47PM +0000, Andrew Murray wrote:
> > > > > > Add support for the :G and :H attributes in perf by handling the
> > > > > > exclude_host/exclude_guest event attributes.
> > > > > > 
> > > > > > We notify KVM of counters that we wish to be enabled or disabled on
> > > > > > guest entry/exit and thus defer from starting or stopping :G events
> > > > > > as per the event's exclude_host attribute.
> > > > > > 
> > > > > > With both VHE and non-VHE we switch the counters between host/guest
> > > > > > at EL2. We are able to eliminate counters counting host events on
> > > > > > the boundaries of guest entry/exit when using :G by filtering out
> > > > > > EL2 for exclude_host. However, when using :H, unless exclude_hv is
> > > > > > set on non-VHE there is a small blackout window at the guest
> > > > > > entry/exit where host events are not captured.
> > > > > > 
> > > > > > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > > > > > Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> > > > > > ---
> 
> > > > > 
> > > > > Let me see if I get this right:
> > > > > 
> > > > > 	exclude_user:	VHE: Don't count EL0
> > > > > 			Non-VHE: Don't count EL0
> > > > > 
> > > > > 	exclude_kernel: VHE: Don't count EL2 and don't count EL1
> > > > > 			Non-VHE: Don't count EL1
> > > > > 
> > > > > 	exclude_hv:	VHE: No effect
> > > > > 			Non-VHE: Don't count EL2
> > > > > 
> > > > > 	exclude_host:	VHE: Don't count EL2 + enable/disable on guest entry/exit
> > > > > 			Non-VHE: enable/disable on guest entry/exit
> > > > > 
> > > > > And the logic I extract is that _user applies across both guest and
> > > > > host, as does _kernel (regardless of the mode the kernel on the current
> > > > > system runs in, might be only EL1, might be EL1 and EL2), and _hv is
> > > > > specific to non-VHE systems to measure events in a specific piece of KVM
> > > > > code that runs at EL2.
> > > > > 
> > > > > As I expressed before, that doesn't seem to be the intent behind the
> > > > > exclude_hv flag, but I'm not sure how other architectures actually
> > > > > implement things today, and even if it's a curiosity of the Arm
> > > > > architecture and has value to non-VHE hypervisor hackers, and we don't
> > > > > really have to care about uniformity with the other architectures, then
> > > > > fine.
> > > > > 
> > > > > It has taken me a while to make sense of this code change, so I really
> > > > > wish we can find a suitable place to document the semantics clearly for
> > > > > perf users on arm64.
> > > > > 
> > > > > Now, another thing comes to mind:  Do we really need to enable and
> > > > > disable anything on a VHE system on entry/exit to/from a guest?  Can we
> > > > > instead do the following:
> > > > > 
> > > > > 	exclude_host:	Disable EL2 counting
> > > > > 		     	Disable EL0 counting
> > > > > 		     	Enable EL0 counting on vcpu_load
> > > > > 		     	  (unless exclude_user is also set)
> > > > > 		     	Disable EL0 counting on vcpu_put
> > > > > 
> > > > > 	exclude_guest:	Disable EL1 counting
> > > > > 		      	Disable EL0 counting on vcpu_load
> > > > > 		      	Enable EL0 counting on vcpu_put
> > > > > 			  (unless exclude_user is also set)
> > > > > 
> > > > > If that works, we can avoid the overhead in the critical path on VHE
> > > > > systems and actually have slightly more accurate counting, leaving the
> > > > > entry/exit operations to be specific to non-VHE.
> > > > 
> > > > This all makes sense.
> > > > 
> > > > At present on VHE, for host only events, there is a small blackout window
> > > > at the guest entry/exit - this is where we turn off/on host counting before
> > > > entering/exiting the guest. (This blackout window also exists on !VHE unless
> > > > exclude_hv is set).
> > > > 
> > > > To mitigate this, the PMU switching code was brought as close to the
> > > > guest entry/exit as possible - but as you point out at this point in
> > > > kvm_arch_vcpu_ioctl_run we're running without interrupts/preemption and so
> > > > on the critical path. I believe we add about 11 instructions when there are
> > > > no PMU counters enabled and about 23 when they are enabled. I suppose it
> > > > would be possible to use static keys to reduce the overhead when counters
> > > > are not enabled...
> > > > 
> > > > Your suggestion provides an optimal solution, however it adds some
> > > > complexity - here is a rough attempt at implementing it...
> > > > 
> 
> > > > 
> > > > Do you think this is worth developing further?
> > > > 
> > > 
> > > Yes, I think it's worth going to fairly great lengths to keep the
> > > critical path on VHE systems lean, and I suspect if we take the easy way
> > > to add functionality first, it will only be harder to factor things out
> > > later.
> 
> It looks like we can also apply this same approach on !VHE. For the sole
> use-case where a user wishes to count only host or guest EL0, we can
> switch the counter at EL1 in vcpu_load instead of in the switch code.
> 

ok, even better then.  Looking forward to seeing the patches.

Thanks,

    Christoffer
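
For reference, the exclude_* semantics tabulated in this thread can be
expressed as a filter computation. This is a sketch of the quoted table
only, not the kernel's actual armv8pmu_set_event_filter() implementation;
the bit positions are invented (the real PMEVTYPER<n>_EL0 encoding
differs):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative filter bits, one per exception level. */
#define EXCL_EL0 (1u << 0)
#define EXCL_EL1 (1u << 1)
#define EXCL_EL2 (1u << 2)

struct attr {
	bool exclude_user, exclude_kernel, exclude_hv, exclude_host;
};

/* Compute the per-counter filter per the table quoted above. */
static uint32_t event_filter(const struct attr *a, bool vhe)
{
	uint32_t f = 0;

	if (a->exclude_user)
		f |= EXCL_EL0;
	if (a->exclude_kernel) {
		f |= EXCL_EL1;
		if (vhe)              /* host kernel runs at EL2 under VHE */
			f |= EXCL_EL2;
	}
	if (a->exclude_hv && !vhe)    /* EL2 holds the hyp code on !VHE */
		f |= EXCL_EL2;
	if (a->exclude_host && vhe)   /* guests never execute at EL2 */
		f |= EXCL_EL2;

	return f;
}
```

The enable/disable-on-entry/exit parts of the table are not shown here;
they are the dynamic switching discussed in the rest of the thread.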


end of thread, other threads:[~2019-03-06  8:43 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-14 16:11 [PATCH v10 0/5] arm64: Support perf event modifiers :G and :H Andrew Murray
2019-01-14 16:11 ` [PATCH v10 1/5] arm64: arm_pmu: remove unnecessary isb instruction Andrew Murray
2019-01-14 16:11 ` [PATCH v10 2/5] arm64: KVM: encapsulate kvm_cpu_context in kvm_host_data Andrew Murray
2019-01-14 16:11 ` [PATCH v10 3/5] arm64: KVM: add accessors to track guest/host only counters Andrew Murray
2019-01-14 16:11 ` [PATCH v10 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray
2019-02-11 11:26   ` Will Deacon
2019-02-18 21:53   ` Christoffer Dall
2019-02-20 16:15     ` Andrew Murray
2019-02-26 12:44       ` Christoffer Dall
2019-03-04 11:14         ` Andrew Murray
2019-03-05 11:45           ` Andrew Murray
2019-03-06  8:42             ` Christoffer Dall
2019-01-14 16:11 ` [PATCH v10 5/5] arm64: KVM: Enable support for :G/:H perf event modifiers Andrew Murray
2019-02-18 22:00   ` Christoffer Dall
2019-03-04  9:40     ` Andrew Murray
