linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v8 0/5] arm64: Support perf event modifiers :G and :H
@ 2018-12-12 10:29 Andrew Murray
  2018-12-12 10:29 ` [PATCH v8 1/5] arm64: arm_pmu: remove unnecessary isb instruction Andrew Murray
                   ` (4 more replies)
  0 siblings, 5 replies; 32+ messages in thread
From: Andrew Murray @ 2018-12-12 10:29 UTC (permalink / raw)
  To: Christoffer Dall, Marc Zyngier, Catalin Marinas, Will Deacon,
	Mark Rutland
  Cc: Suzuki K Poulose, kvmarm, linux-arm-kernel, Julien Thierry

This patchset provides support for perf event modifiers :G and :H which
allows for filtering of PMU events between host and guests when used
with KVM.

As the underlying hardware cannot distinguish between guest and host
context, the performance counters must be stopped and started upon
entry/exit to the guest. This is performed at EL2 in a way that
minimizes overhead and improves accuracy of recording events that only
occur in the requested context.

This has been tested with VHE and non-VHE kernels with a KVM guest.

Changes from v7:

 - Added additional patch to encapsulate kvm_cpu_context in kvm_host_data

Changes from v6:

 - Move events_host/events_guest out of kvm_cpu_context

Changes from v5:

 - Tweak logic in use of kvm_set_pmu_events

Changes from v4:

 - Prevent unnecessary write_sysreg calls by improving
   __pmu_switch_to_xxx logic.

Changes from v3:

 - Remove confusing _only suffix from bitfields in kvm_cpu_context
 - Remove unnecessary condition when clearing event bits in disable
 - Simplify API of KVM accessors
 - Prevent unnecessary setting of pmcnten when guest/host events are
   the same.

Changes from v2:

 - Ensured that exclude_kernel works for guest
 - Removed unnecessary exclusion of EL2 with exclude_host on !VHE
 - Renamed kvm_clr_set_host_pmu_events to reflect args order
 - Added additional information to isb patch

Changes from v1:

 - Removed unnecessary exclusion of EL1 with exclude_guest on VHE
 - Removed unnecessary isb from existing perf_event.c driver
 - Folded perf_event.c patches together
 - Added additional information to last patch commit message

Andrew Murray (5):
  arm64: arm_pmu: remove unnecessary isb instruction
  arm64: KVM: encapsulate kvm_cpu_context in kvm_host_data
  arm64: KVM: add accessors to track guest/host only counters
  arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  arm64: KVM: Enable support for :G/:H perf event modifiers

 arch/arm/include/asm/kvm_host.h   |  8 ++++--
 arch/arm64/include/asm/kvm_asm.h  |  4 +--
 arch/arm64/include/asm/kvm_host.h | 42 +++++++++++++++++++++++++++----
 arch/arm64/kernel/asm-offsets.c   |  2 +-
 arch/arm64/kernel/perf_event.c    | 52 +++++++++++++++++++++++++++++++++------
 arch/arm64/kvm/hyp/switch.c       | 52 +++++++++++++++++++++++++++++++++++++++
 virt/kvm/arm/arm.c                | 10 +++++---
 7 files changed, 148 insertions(+), 22 deletions(-)

-- 
2.7.4


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v8 1/5] arm64: arm_pmu: remove unnecessary isb instruction
  2018-12-12 10:29 [PATCH v8 0/5] arm64: Support perf event modifiers :G and :H Andrew Murray
@ 2018-12-12 10:29 ` Andrew Murray
  2018-12-12 10:29 ` [PATCH v8 2/5] arm64: KVM: encapsulate kvm_cpu_context in kvm_host_data Andrew Murray
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 32+ messages in thread
From: Andrew Murray @ 2018-12-12 10:29 UTC (permalink / raw)
  To: Christoffer Dall, Marc Zyngier, Catalin Marinas, Will Deacon,
	Mark Rutland
  Cc: Suzuki K Poulose, kvmarm, linux-arm-kernel, Julien Thierry

The armv8pmu_enable_event_counter function issues an isb instruction
after enabling a pair of counters - this doesn't provide any value
and is inconsistent with the armv8pmu_disable_event_counter.

In any case armv8pmu_enable_event_counter is always called with the
PMU stopped. Starting the PMU with armv8pmu_start results in an isb
instruction being issued prior to writing to PMCR_EL0.

Let's remove the unnecessary isb instruction.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
---
 arch/arm64/kernel/perf_event.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 8e38d52..de564ae 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -652,7 +652,6 @@ static inline void armv8pmu_enable_event_counter(struct perf_event *event)
 	armv8pmu_enable_counter(idx);
 	if (armv8pmu_event_is_chained(event))
 		armv8pmu_enable_counter(idx - 1);
-	isb();
 }
 
 static inline int armv8pmu_disable_counter(int idx)
-- 
2.7.4


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v8 2/5] arm64: KVM: encapsulate kvm_cpu_context in kvm_host_data
  2018-12-12 10:29 [PATCH v8 0/5] arm64: Support perf event modifiers :G and :H Andrew Murray
  2018-12-12 10:29 ` [PATCH v8 1/5] arm64: arm_pmu: remove unnecessary isb instruction Andrew Murray
@ 2018-12-12 10:29 ` Andrew Murray
  2018-12-12 10:37   ` Suzuki K Poulose
  2018-12-18 11:40   ` Christoffer Dall
  2018-12-12 10:29 ` [PATCH v8 3/5] arm64: KVM: add accessors to track guest/host only counters Andrew Murray
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 32+ messages in thread
From: Andrew Murray @ 2018-12-12 10:29 UTC (permalink / raw)
  To: Christoffer Dall, Marc Zyngier, Catalin Marinas, Will Deacon,
	Mark Rutland
  Cc: Suzuki K Poulose, kvmarm, linux-arm-kernel, Julien Thierry

The virt/arm core allocates a kvm_cpu_context_t percpu, at present this is
a typedef to kvm_cpu_context and is used to store host cpu context. The
kvm_cpu_context structure is also used elsewhere to hold vcpu context.
In order to use the percpu to hold additional future host information we
encapsulate kvm_cpu_context in a new structure and rename the typedef and
percpu to match.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
---
 arch/arm/include/asm/kvm_host.h   |  8 ++++++--
 arch/arm64/include/asm/kvm_asm.h  |  4 ++--
 arch/arm64/include/asm/kvm_host.h | 15 ++++++++++-----
 arch/arm64/kernel/asm-offsets.c   |  2 +-
 virt/kvm/arm/arm.c                | 10 ++++++----
 5 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 79906ce..71645ba 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -145,7 +145,11 @@ struct kvm_cpu_context {
 	u32 cp15[NR_CP15_REGS];
 };
 
-typedef struct kvm_cpu_context kvm_cpu_context_t;
+struct kvm_host_data {
+	struct kvm_cpu_context host_ctxt;
+};
+
+typedef struct kvm_host_data kvm_host_data_t;
 
 struct kvm_vcpu_arch {
 	struct kvm_cpu_context ctxt;
@@ -163,7 +167,7 @@ struct kvm_vcpu_arch {
 	struct kvm_vcpu_fault_info fault;
 
 	/* Host FP context */
-	kvm_cpu_context_t *host_cpu_context;
+	struct kvm_cpu_context *host_cpu_context;
 
 	/* VGIC state */
 	struct vgic_cpu vgic_cpu;
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 102b5a5..6a9bfd4 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -102,12 +102,12 @@ extern u32 __init_stage2_translation(void);
 .endm
 
 .macro get_host_ctxt reg, tmp
-	hyp_adr_this_cpu \reg, kvm_host_cpu_state, \tmp
+	hyp_adr_this_cpu \reg, kvm_host_data, \tmp
 .endm
 
 .macro get_vcpu_ptr vcpu, ctxt
 	get_host_ctxt \ctxt, \vcpu
-	ldr	\vcpu, [\ctxt, #HOST_CONTEXT_VCPU]
+	ldr	\vcpu, [\ctxt, #HOST_DATA_VCPU]
 	kern_hyp_va	\vcpu
 .endm
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 1550192..1d3ca91 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -205,7 +205,12 @@ struct kvm_cpu_context {
 	struct kvm_vcpu *__hyp_running_vcpu;
 };
 
-typedef struct kvm_cpu_context kvm_cpu_context_t;
+struct kvm_host_data {
+	struct kvm_cpu_context host_ctxt;
+	struct kvm_pmu_events pmu_events;
+};
+
+typedef struct kvm_host_data kvm_host_data_t;
 
 struct kvm_vcpu_arch {
 	struct kvm_cpu_context ctxt;
@@ -241,7 +246,7 @@ struct kvm_vcpu_arch {
 	struct kvm_guest_debug_arch external_debug_state;
 
 	/* Pointer to host CPU context */
-	kvm_cpu_context_t *host_cpu_context;
+	struct kvm_cpu_context *host_cpu_context;
 
 	struct thread_info *host_thread_info;	/* hyp VA */
 	struct user_fpsimd_state *host_fpsimd_state;	/* hyp VA */
@@ -387,7 +392,7 @@ void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome);
 
 struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
 
-DECLARE_PER_CPU(kvm_cpu_context_t, kvm_host_cpu_state);
+DECLARE_PER_CPU(kvm_host_data_t, kvm_host_data);
 
 void __kvm_enable_ssbs(void);
 
@@ -400,8 +405,8 @@ static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
 	 * kernel's mapping to the linear mapping, and store it in tpidr_el2
 	 * so that we can use adr_l to access per-cpu variables in EL2.
 	 */
-	u64 tpidr_el2 = ((u64)this_cpu_ptr(&kvm_host_cpu_state) -
-			 (u64)kvm_ksym_ref(kvm_host_cpu_state));
+	u64 tpidr_el2 = ((u64)this_cpu_ptr(&kvm_host_data) -
+			 (u64)kvm_ksym_ref(kvm_host_data));
 
 	/*
 	 * Call initialization code, and switch to the full blown HYP code.
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 323aeb5..cb968ff 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -142,7 +142,7 @@ int main(void)
   DEFINE(CPU_FP_REGS,		offsetof(struct kvm_regs, fp_regs));
   DEFINE(VCPU_FPEXC32_EL2,	offsetof(struct kvm_vcpu, arch.ctxt.sys_regs[FPEXC32_EL2]));
   DEFINE(VCPU_HOST_CONTEXT,	offsetof(struct kvm_vcpu, arch.host_cpu_context));
-  DEFINE(HOST_CONTEXT_VCPU,	offsetof(struct kvm_cpu_context, __hyp_running_vcpu));
+  DEFINE(HOST_DATA_VCPU,	offsetof(struct kvm_host_data, host_ctxt.__hyp_running_vcpu));
 #endif
 #ifdef CONFIG_CPU_PM
   DEFINE(CPU_SUSPEND_SZ,	sizeof(struct cpu_suspend_ctx));
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 150c8a6..c031ddf 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -56,7 +56,7 @@
 __asm__(".arch_extension	virt");
 #endif
 
-DEFINE_PER_CPU(kvm_cpu_context_t, kvm_host_cpu_state);
+DEFINE_PER_CPU(kvm_host_data_t, kvm_host_data);
 static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
 
 /* Per-CPU variable containing the currently running vcpu. */
@@ -361,8 +361,10 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
 	int *last_ran;
+	kvm_host_data_t *cpu_ctxt;
 
 	last_ran = this_cpu_ptr(vcpu->kvm->arch.last_vcpu_ran);
+	cpu_ctxt = this_cpu_ptr(&kvm_host_data);
 
 	/*
 	 * We might get preempted before the vCPU actually runs, but
@@ -374,7 +376,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	}
 
 	vcpu->cpu = cpu;
-	vcpu->arch.host_cpu_context = this_cpu_ptr(&kvm_host_cpu_state);
+	vcpu->arch.host_cpu_context = &cpu_ctxt->host_ctxt;
 
 	kvm_arm_set_running_vcpu(vcpu);
 	kvm_vgic_load(vcpu);
@@ -1547,9 +1549,9 @@ static int init_hyp_mode(void)
 	}
 
 	for_each_possible_cpu(cpu) {
-		kvm_cpu_context_t *cpu_ctxt;
+		kvm_host_data_t *cpu_ctxt;
 
-		cpu_ctxt = per_cpu_ptr(&kvm_host_cpu_state, cpu);
+		cpu_ctxt = per_cpu_ptr(&kvm_host_data, cpu);
 		err = create_hyp_mappings(cpu_ctxt, cpu_ctxt + 1, PAGE_HYP);
 
 		if (err) {
-- 
2.7.4


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v8 3/5] arm64: KVM: add accessors to track guest/host only counters
  2018-12-12 10:29 [PATCH v8 0/5] arm64: Support perf event modifiers :G and :H Andrew Murray
  2018-12-12 10:29 ` [PATCH v8 1/5] arm64: arm_pmu: remove unnecessary isb instruction Andrew Murray
  2018-12-12 10:29 ` [PATCH v8 2/5] arm64: KVM: encapsulate kvm_cpu_context in kvm_host_data Andrew Murray
@ 2018-12-12 10:29 ` Andrew Murray
  2018-12-12 10:56   ` Suzuki K Poulose
  2018-12-12 10:29 ` [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray
  2018-12-12 10:29 ` [PATCH v8 5/5] arm64: KVM: Enable support for :G/:H perf event modifiers Andrew Murray
  4 siblings, 1 reply; 32+ messages in thread
From: Andrew Murray @ 2018-12-12 10:29 UTC (permalink / raw)
  To: Christoffer Dall, Marc Zyngier, Catalin Marinas, Will Deacon,
	Mark Rutland
  Cc: Suzuki K Poulose, kvmarm, linux-arm-kernel, Julien Thierry

In order to effeciently enable/disable guest/host only perf counters
at guest entry/exit we add bitfields to kvm_cpu_context for guest and
host events as well as accessors for updating them.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
---
 arch/arm64/include/asm/kvm_host.h | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 1d3ca91..50f9d4c 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -205,6 +205,11 @@ struct kvm_cpu_context {
 	struct kvm_vcpu *__hyp_running_vcpu;
 };
 
+struct kvm_pmu_events {
+	u32 events_host;
+	u32 events_guest;
+};
+
 struct kvm_host_data {
 	struct kvm_cpu_context host_ctxt;
 	struct kvm_pmu_events pmu_events;
@@ -472,11 +477,33 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
 
+#define KVM_PMU_EVENTS_HOST	1
+#define KVM_PMU_EVENTS_GUEST	2
+
 #ifdef CONFIG_KVM /* Avoid conflicts with core headers if CONFIG_KVM=n */
 static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
 {
 	return kvm_arch_vcpu_run_map_fp(vcpu);
 }
+static inline void kvm_set_pmu_events(u32 set, int flags)
+{
+	struct kvm_host_data *ctx = this_cpu_ptr(&kvm_host_data);
+
+	if (flags & KVM_PMU_EVENTS_HOST)
+		ctx->pmu_events.events_host |= set;
+	if (flags & KVM_PMU_EVENTS_GUEST)
+		ctx->pmu_events.events_guest |= set;
+}
+static inline void kvm_clr_pmu_events(u32 clr)
+{
+	struct kvm_host_data *ctx = this_cpu_ptr(&kvm_host_data);
+
+	ctx->pmu_events.events_host &= ~clr;
+	ctx->pmu_events.events_guest &= ~clr;
+}
+#else
+static inline void kvm_set_pmu_events(u32 set, int flags) {}
+static inline void kvm_clr_pmu_events(u32 clr) {}
 #endif
 
 static inline void kvm_arm_vhe_guest_enter(void)
-- 
2.7.4


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2018-12-12 10:29 [PATCH v8 0/5] arm64: Support perf event modifiers :G and :H Andrew Murray
                   ` (2 preceding siblings ...)
  2018-12-12 10:29 ` [PATCH v8 3/5] arm64: KVM: add accessors to track guest/host only counters Andrew Murray
@ 2018-12-12 10:29 ` Andrew Murray
  2018-12-12 10:42   ` Suzuki K Poulose
                     ` (2 more replies)
  2018-12-12 10:29 ` [PATCH v8 5/5] arm64: KVM: Enable support for :G/:H perf event modifiers Andrew Murray
  4 siblings, 3 replies; 32+ messages in thread
From: Andrew Murray @ 2018-12-12 10:29 UTC (permalink / raw)
  To: Christoffer Dall, Marc Zyngier, Catalin Marinas, Will Deacon,
	Mark Rutland
  Cc: Suzuki K Poulose, kvmarm, linux-arm-kernel, Julien Thierry

Add support for the :G and :H attributes in perf by handling the
exclude_host/exclude_guest event attributes.

We notify KVM of counters that we wish to be enabled or disabled on
guest entry/exit and thus defer from starting or stopping :G events
as per the events exclude_host attribute.

With both VHE and non-VHE we switch the counters between host/guest
at EL2. We are able to eliminate counters counting host events on
the boundaries of guest entry/exit when using :G by filtering out
EL2 for exclude_host. However when using :H unless exclude_hv is set
on non-VHE then there is a small blackout window at the guest
entry/exit where host events are not captured.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
---
 arch/arm64/kernel/perf_event.c | 51 ++++++++++++++++++++++++++++++++++++------
 1 file changed, 44 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index de564ae..4a3c73d 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -26,6 +26,7 @@
 
 #include <linux/acpi.h>
 #include <linux/clocksource.h>
+#include <linux/kvm_host.h>
 #include <linux/of.h>
 #include <linux/perf/arm_pmu.h>
 #include <linux/platform_device.h>
@@ -647,11 +648,26 @@ static inline int armv8pmu_enable_counter(int idx)
 
 static inline void armv8pmu_enable_event_counter(struct perf_event *event)
 {
+	struct perf_event_attr *attr = &event->attr;
 	int idx = event->hw.idx;
+	int flags = 0;
+	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
 
-	armv8pmu_enable_counter(idx);
 	if (armv8pmu_event_is_chained(event))
-		armv8pmu_enable_counter(idx - 1);
+		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
+
+	if (!attr->exclude_host)
+		flags |= KVM_PMU_EVENTS_HOST;
+	if (!attr->exclude_guest)
+		flags |= KVM_PMU_EVENTS_GUEST;
+
+	kvm_set_pmu_events(counter_bits, flags);
+
+	if (!attr->exclude_host) {
+		armv8pmu_enable_counter(idx);
+		if (armv8pmu_event_is_chained(event))
+			armv8pmu_enable_counter(idx - 1);
+	}
 }
 
 static inline int armv8pmu_disable_counter(int idx)
@@ -664,11 +680,20 @@ static inline int armv8pmu_disable_counter(int idx)
 static inline void armv8pmu_disable_event_counter(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
+	struct perf_event_attr *attr = &event->attr;
 	int idx = hwc->idx;
+	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
 
 	if (armv8pmu_event_is_chained(event))
-		armv8pmu_disable_counter(idx - 1);
-	armv8pmu_disable_counter(idx);
+		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
+
+	kvm_clr_pmu_events(counter_bits);
+
+	if (!attr->exclude_host) {
+		if (armv8pmu_event_is_chained(event))
+			armv8pmu_disable_counter(idx - 1);
+		armv8pmu_disable_counter(idx);
+	}
 }
 
 static inline int armv8pmu_enable_intens(int idx)
@@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
 	 * Therefore we ignore exclude_hv in this configuration, since
 	 * there's no hypervisor to sample anyway. This is consistent
 	 * with other architectures (x86 and Power).
+	 *
+	 * To eliminate counting host events on the boundaries of
+	 * guest entry/exit we ensure EL2 is not included in hyp mode
+	 * with !exclude_host.
 	 */
 	if (is_kernel_in_hyp_mode()) {
-		if (!attr->exclude_kernel)
+		if (!attr->exclude_kernel && !attr->exclude_host)
 			config_base |= ARMV8_PMU_INCLUDE_EL2;
 	} else {
-		if (attr->exclude_kernel)
-			config_base |= ARMV8_PMU_EXCLUDE_EL1;
 		if (!attr->exclude_hv)
 			config_base |= ARMV8_PMU_INCLUDE_EL2;
 	}
+
+	/*
+	 * Filter out !VHE kernels and guest kernels
+	 */
+	if (attr->exclude_kernel)
+		config_base |= ARMV8_PMU_EXCLUDE_EL1;
+
 	if (attr->exclude_user)
 		config_base |= ARMV8_PMU_EXCLUDE_EL0;
 
@@ -976,6 +1010,9 @@ static void armv8pmu_reset(void *info)
 		armv8pmu_disable_intens(idx);
 	}
 
+	/* Clear the counters we flip at guest entry/exit */
+	kvm_clr_pmu_events(U32_MAX);
+
 	/*
 	 * Initialize & Reset PMNC. Request overflow interrupt for
 	 * 64 bit cycle counter but cheat in armv8pmu_write_counter().
-- 
2.7.4


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v8 5/5] arm64: KVM: Enable support for :G/:H perf event modifiers
  2018-12-12 10:29 [PATCH v8 0/5] arm64: Support perf event modifiers :G and :H Andrew Murray
                   ` (3 preceding siblings ...)
  2018-12-12 10:29 ` [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray
@ 2018-12-12 10:29 ` Andrew Murray
  2018-12-12 10:53   ` Suzuki K Poulose
  4 siblings, 1 reply; 32+ messages in thread
From: Andrew Murray @ 2018-12-12 10:29 UTC (permalink / raw)
  To: Christoffer Dall, Marc Zyngier, Catalin Marinas, Will Deacon,
	Mark Rutland
  Cc: Suzuki K Poulose, kvmarm, linux-arm-kernel, Julien Thierry

Enable/disable event counters as appropriate when entering and exiting
the guest to enable support for guest or host only event counting.

For both VHE and non-VHE we switch the counters between host/guest at
EL2. EL2 is filtered out by the PMU when we are using the :G modifier.

The PMU may be on when we change which counters are enabled however
we avoid adding an isb as we instead rely on existing context
synchronisation events: the isb in kvm_arm_vhe_guest_exit for VHE and
the eret from the hvc in kvm_call_hyp.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
---
 arch/arm64/kvm/hyp/switch.c | 52 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index d496ef5..9732ef7 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -373,6 +373,46 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
 	return true;
 }
 
+static bool __hyp_text __pmu_switch_to_guest(struct kvm_cpu_context *host_ctxt)
+{
+	struct kvm_host_data *host;
+	struct kvm_pmu_events *pmu;
+	u32 clr, set;
+
+	host = container_of(host_ctxt, struct kvm_host_data, host_ctxt);
+	pmu = &host->pmu_events;
+
+	clr = pmu->events_host & ~pmu->events_guest;
+	set = pmu->events_guest & ~pmu->events_host;
+
+	if (clr)
+		write_sysreg(clr, pmcntenclr_el0);
+
+	if (set)
+		write_sysreg(set, pmcntenset_el0);
+
+	return (clr || set);
+}
+
+static void __hyp_text __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt)
+{
+	struct kvm_host_data *host;
+	struct kvm_pmu_events *pmu;
+	u32 clr, set;
+
+	host = container_of(host_ctxt, struct kvm_host_data, host_ctxt);
+	pmu = &host->pmu_events;
+
+	clr = pmu->events_guest & ~pmu->events_host;
+	set = pmu->events_host & ~pmu->events_guest;
+
+	if (clr)
+		write_sysreg(clr, pmcntenclr_el0);
+
+	if (set)
+		write_sysreg(set, pmcntenset_el0);
+}
+
 /*
  * Return true when we were able to fixup the guest exit and should return to
  * the guest, false when we should restore the host state and return to the
@@ -488,12 +528,15 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpu_context *host_ctxt;
 	struct kvm_cpu_context *guest_ctxt;
+	bool pmu_switch_needed;
 	u64 exit_code;
 
 	host_ctxt = vcpu->arch.host_cpu_context;
 	host_ctxt->__hyp_running_vcpu = vcpu;
 	guest_ctxt = &vcpu->arch.ctxt;
 
+	pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
+
 	sysreg_save_host_state_vhe(host_ctxt);
 
 	__activate_traps(vcpu);
@@ -524,6 +567,9 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
 
 	__debug_switch_to_host(vcpu);
 
+	if (pmu_switch_needed)
+		__pmu_switch_to_host(host_ctxt);
+
 	return exit_code;
 }
 
@@ -532,6 +578,7 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpu_context *host_ctxt;
 	struct kvm_cpu_context *guest_ctxt;
+	bool pmu_switch_needed;
 	u64 exit_code;
 
 	vcpu = kern_hyp_va(vcpu);
@@ -540,6 +587,8 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
 	host_ctxt->__hyp_running_vcpu = vcpu;
 	guest_ctxt = &vcpu->arch.ctxt;
 
+	pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
+
 	__sysreg_save_state_nvhe(host_ctxt);
 
 	__activate_traps(vcpu);
@@ -586,6 +635,9 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
 	 */
 	__debug_switch_to_host(vcpu);
 
+	if (pmu_switch_needed)
+		__pmu_switch_to_host(host_ctxt);
+
 	return exit_code;
 }
 
-- 
2.7.4


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 2/5] arm64: KVM: encapsulate kvm_cpu_context in kvm_host_data
  2018-12-12 10:29 ` [PATCH v8 2/5] arm64: KVM: encapsulate kvm_cpu_context in kvm_host_data Andrew Murray
@ 2018-12-12 10:37   ` Suzuki K Poulose
  2018-12-12 12:31     ` Andrew Murray
  2018-12-18 11:40   ` Christoffer Dall
  1 sibling, 1 reply; 32+ messages in thread
From: Suzuki K Poulose @ 2018-12-12 10:37 UTC (permalink / raw)
  To: Andrew Murray, Christoffer Dall, Marc Zyngier, Catalin Marinas,
	Will Deacon, Mark Rutland
  Cc: kvmarm, linux-arm-kernel, Julien Thierry

Hi Andrew,

On 12/12/2018 10:29, Andrew Murray wrote:
> The virt/arm core allocates a kvm_cpu_context_t percpu, at present this is
> a typedef to kvm_cpu_context and is used to store host cpu context. The
> kvm_cpu_context structure is also used elsewhere to hold vcpu context.
> In order to use the percpu to hold additional future host information we
> encapsulate kvm_cpu_context in a new structure and rename the typedef and
> percpu to match.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> ---
>   arch/arm/include/asm/kvm_host.h   |  8 ++++++--
>   arch/arm64/include/asm/kvm_asm.h  |  4 ++--
>   arch/arm64/include/asm/kvm_host.h | 15 ++++++++++-----
>   arch/arm64/kernel/asm-offsets.c   |  2 +-
>   virt/kvm/arm/arm.c                | 10 ++++++----
>   5 files changed, 25 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 79906ce..71645ba 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -145,7 +145,11 @@ struct kvm_cpu_context {
>   	u32 cp15[NR_CP15_REGS];
>   };
>   
> -typedef struct kvm_cpu_context kvm_cpu_context_t;
> +struct kvm_host_data {
> +	struct kvm_cpu_context host_ctxt;
> +};
> +
> +typedef struct kvm_host_data kvm_host_data_t;
>   
>   struct kvm_vcpu_arch {
>   	struct kvm_cpu_context ctxt;
> @@ -163,7 +167,7 @@ struct kvm_vcpu_arch {
>   	struct kvm_vcpu_fault_info fault;
>   
>   	/* Host FP context */
> -	kvm_cpu_context_t *host_cpu_context;
> +	struct kvm_cpu_context *host_cpu_context;
>   
>   	/* VGIC state */
>   	struct vgic_cpu vgic_cpu;
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index 102b5a5..6a9bfd4 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -102,12 +102,12 @@ extern u32 __init_stage2_translation(void);
>   .endm
>   
>   .macro get_host_ctxt reg, tmp
> -	hyp_adr_this_cpu \reg, kvm_host_cpu_state, \tmp
> +	hyp_adr_this_cpu \reg, kvm_host_data, \tmp
>   .endm
>   
>   .macro get_vcpu_ptr vcpu, ctxt
>   	get_host_ctxt \ctxt, \vcpu
> -	ldr	\vcpu, [\ctxt, #HOST_CONTEXT_VCPU]
> +	ldr	\vcpu, [\ctxt, #HOST_DATA_VCPU]
>   	kern_hyp_va	\vcpu
>   .endm
>   
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 1550192..1d3ca91 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -205,7 +205,12 @@ struct kvm_cpu_context {
>   	struct kvm_vcpu *__hyp_running_vcpu;
>   };
>   
> -typedef struct kvm_cpu_context kvm_cpu_context_t;
> +struct kvm_host_data {
> +	struct kvm_cpu_context host_ctxt;
> +	struct kvm_pmu_events pmu_events;
> +};
> +

Unfortunately this would fail to build as we don't have the kvm_pmu_events
defined in this patch and will break the git bisects.

Otherwise looks good to me.


Cheers
Suzuki

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2018-12-12 10:29 ` [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray
@ 2018-12-12 10:42   ` Suzuki K Poulose
  2018-12-12 10:47     ` Julien Thierry
  2018-12-12 10:55   ` Suzuki K Poulose
  2018-12-18 12:02   ` Christoffer Dall
  2 siblings, 1 reply; 32+ messages in thread
From: Suzuki K Poulose @ 2018-12-12 10:42 UTC (permalink / raw)
  To: Andrew Murray, Christoffer Dall, Marc Zyngier, Catalin Marinas,
	Will Deacon, Mark Rutland
  Cc: kvmarm, linux-arm-kernel, Julien Thierry



On 12/12/2018 10:29, Andrew Murray wrote:
> Add support for the :G and :H attributes in perf by handling the
> exclude_host/exclude_guest event attributes.
> 
> We notify KVM of counters that we wish to be enabled or disabled on
> guest entry/exit and thus defer from starting or stopping :G events
> as per the events exclude_host attribute.
> 
> With both VHE and non-VHE we switch the counters between host/guest
> at EL2. We are able to eliminate counters counting host events on
> the boundaries of guest entry/exit when using :G by filtering out
> EL2 for exclude_host. However when using :H unless exclude_hv is set
> on non-VHE then there is a small blackout window at the guest
> entry/exit where host events are not captured.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> ---
>   arch/arm64/kernel/perf_event.c | 51 ++++++++++++++++++++++++++++++++++++------
>   1 file changed, 44 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index de564ae..4a3c73d 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -26,6 +26,7 @@
>   
>   #include <linux/acpi.h>
>   #include <linux/clocksource.h>
> +#include <linux/kvm_host.h>
>   #include <linux/of.h>
>   #include <linux/perf/arm_pmu.h>
>   #include <linux/platform_device.h>
> @@ -647,11 +648,26 @@ static inline int armv8pmu_enable_counter(int idx)
>   
>   static inline void armv8pmu_enable_event_counter(struct perf_event *event)
>   {
> +	struct perf_event_attr *attr = &event->attr;
>   	int idx = event->hw.idx;
> +	int flags = 0;
> +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
>   
> -	armv8pmu_enable_counter(idx);
>   	if (armv8pmu_event_is_chained(event))
> -		armv8pmu_enable_counter(idx - 1);
> +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> +
> +	if (!attr->exclude_host)
> +		flags |= KVM_PMU_EVENTS_HOST;
> +	if (!attr->exclude_guest)
> +		flags |= KVM_PMU_EVENTS_GUEST;
> +
> +	kvm_set_pmu_events(counter_bits, flags);
> +
> +	if (!attr->exclude_host) {
> +		armv8pmu_enable_counter(idx);
> +		if (armv8pmu_event_is_chained(event))
> +			armv8pmu_enable_counter(idx - 1);
> +	}
>   }
>   
>   static inline int armv8pmu_disable_counter(int idx)
> @@ -664,11 +680,20 @@ static inline int armv8pmu_disable_counter(int idx)
>   static inline void armv8pmu_disable_event_counter(struct perf_event *event)
>   {
>   	struct hw_perf_event *hwc = &event->hw;
> +	struct perf_event_attr *attr = &event->attr;
>   	int idx = hwc->idx;
> +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
>   
>   	if (armv8pmu_event_is_chained(event))
> -		armv8pmu_disable_counter(idx - 1);
> -	armv8pmu_disable_counter(idx);
> +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> +
> +	kvm_clr_pmu_events(counter_bits);
> +
> +	if (!attr->exclude_host) {
> +		if (armv8pmu_event_is_chained(event))
> +			armv8pmu_disable_counter(idx - 1);
> +		armv8pmu_disable_counter(idx);
> +	}

Shouldn't we disable the events, irrespective of whether it is for host or/and 
guest ? Otherwise looks good to me.

Cheers
Suzuki

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2018-12-12 10:42   ` Suzuki K Poulose
@ 2018-12-12 10:47     ` Julien Thierry
  2018-12-12 10:51       ` Suzuki K Poulose
  0 siblings, 1 reply; 32+ messages in thread
From: Julien Thierry @ 2018-12-12 10:47 UTC (permalink / raw)
  To: Suzuki K Poulose, Andrew Murray, Christoffer Dall, Marc Zyngier,
	Catalin Marinas, Will Deacon, Mark Rutland
  Cc: kvmarm, linux-arm-kernel



On 12/12/2018 10:42, Suzuki K Poulose wrote:
> 
> 
> On 12/12/2018 10:29, Andrew Murray wrote:
>> Add support for the :G and :H attributes in perf by handling the
>> exclude_host/exclude_guest event attributes.
>>
>> We notify KVM of counters that we wish to be enabled or disabled on
>> guest entry/exit and thus defer from starting or stopping :G events
>> as per the events exclude_host attribute.
>>
>> With both VHE and non-VHE we switch the counters between host/guest
>> at EL2. We are able to eliminate counters counting host events on
>> the boundaries of guest entry/exit when using :G by filtering out
>> EL2 for exclude_host. However when using :H unless exclude_hv is set
>> on non-VHE then there is a small blackout window at the guest
>> entry/exit where host events are not captured.
>>
>> Signed-off-by: Andrew Murray <andrew.murray@arm.com>
>> ---
>>   arch/arm64/kernel/perf_event.c | 51
>> ++++++++++++++++++++++++++++++++++++------
>>   1 file changed, 44 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/perf_event.c
>> b/arch/arm64/kernel/perf_event.c
>> index de564ae..4a3c73d 100644
>> --- a/arch/arm64/kernel/perf_event.c
>> +++ b/arch/arm64/kernel/perf_event.c
>> @@ -26,6 +26,7 @@
>>     #include <linux/acpi.h>
>>   #include <linux/clocksource.h>
>> +#include <linux/kvm_host.h>
>>   #include <linux/of.h>
>>   #include <linux/perf/arm_pmu.h>
>>   #include <linux/platform_device.h>
>> @@ -647,11 +648,26 @@ static inline int armv8pmu_enable_counter(int idx)
>>     static inline void armv8pmu_enable_event_counter(struct perf_event
>> *event)
>>   {
>> +    struct perf_event_attr *attr = &event->attr;
>>       int idx = event->hw.idx;
>> +    int flags = 0;
>> +    u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
>>   -    armv8pmu_enable_counter(idx);
>>       if (armv8pmu_event_is_chained(event))
>> -        armv8pmu_enable_counter(idx - 1);
>> +        counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
>> +
>> +    if (!attr->exclude_host)
>> +        flags |= KVM_PMU_EVENTS_HOST;
>> +    if (!attr->exclude_guest)
>> +        flags |= KVM_PMU_EVENTS_GUEST;
>> +
>> +    kvm_set_pmu_events(counter_bits, flags);
>> +
>> +    if (!attr->exclude_host) {
>> +        armv8pmu_enable_counter(idx);
>> +        if (armv8pmu_event_is_chained(event))
>> +            armv8pmu_enable_counter(idx - 1);
>> +    }
>>   }
>>     static inline int armv8pmu_disable_counter(int idx)
>> @@ -664,11 +680,20 @@ static inline int armv8pmu_disable_counter(int idx)
>>   static inline void armv8pmu_disable_event_counter(struct perf_event
>> *event)
>>   {
>>       struct hw_perf_event *hwc = &event->hw;
>> +    struct perf_event_attr *attr = &event->attr;
>>       int idx = hwc->idx;
>> +    u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
>>         if (armv8pmu_event_is_chained(event))
>> -        armv8pmu_disable_counter(idx - 1);
>> -    armv8pmu_disable_counter(idx);
>> +        counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
>> +
>> +    kvm_clr_pmu_events(counter_bits);
>> +
>> +    if (!attr->exclude_host) {
>> +        if (armv8pmu_event_is_chained(event))
>> +            armv8pmu_disable_counter(idx - 1);
>> +        armv8pmu_disable_counter(idx);
>> +    }
> 
> Shouldn't we disable the events, irrespective of whether it is for host
> or/and guest ? Otherwise looks good to me.
> 

I made the opposite remark on one of the early version. My reasoning is
that, if we rely on the hypervisor to enable event that exclude the
host, we should also rely on the hypervisor disabling these events.

At least in my mind it makes more sense this way.

Cheers,

-- 
Julien Thierry

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2018-12-12 10:47     ` Julien Thierry
@ 2018-12-12 10:51       ` Suzuki K Poulose
  0 siblings, 0 replies; 32+ messages in thread
From: Suzuki K Poulose @ 2018-12-12 10:51 UTC (permalink / raw)
  To: Julien Thierry, Andrew Murray, Christoffer Dall, Marc Zyngier,
	Catalin Marinas, Will Deacon, Mark Rutland
  Cc: kvmarm, linux-arm-kernel



On 12/12/2018 10:47, Julien Thierry wrote:
> 
> 
> On 12/12/2018 10:42, Suzuki K Poulose wrote:
>>
>>
>> On 12/12/2018 10:29, Andrew Murray wrote:
>>> Add support for the :G and :H attributes in perf by handling the
>>> exclude_host/exclude_guest event attributes.
>>>
>>> We notify KVM of counters that we wish to be enabled or disabled on
>>> guest entry/exit and thus defer from starting or stopping :G events
>>> as per the events exclude_host attribute.
>>>
>>> With both VHE and non-VHE we switch the counters between host/guest
>>> at EL2. We are able to eliminate counters counting host events on
>>> the boundaries of guest entry/exit when using :G by filtering out
>>> EL2 for exclude_host. However when using :H unless exclude_hv is set
>>> on non-VHE then there is a small blackout window at the guest
>>> entry/exit where host events are not captured.
>>>
>>> Signed-off-by: Andrew Murray <andrew.murray@arm.com>
>>> ---
>>>    arch/arm64/kernel/perf_event.c | 51
>>> ++++++++++++++++++++++++++++++++++++------
>>>    1 file changed, 44 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/arch/arm64/kernel/perf_event.c
>>> b/arch/arm64/kernel/perf_event.c
>>> index de564ae..4a3c73d 100644
>>> --- a/arch/arm64/kernel/perf_event.c
>>> +++ b/arch/arm64/kernel/perf_event.c
>>> @@ -26,6 +26,7 @@
>>>      #include <linux/acpi.h>
>>>    #include <linux/clocksource.h>
>>> +#include <linux/kvm_host.h>
>>>    #include <linux/of.h>
>>>    #include <linux/perf/arm_pmu.h>
>>>    #include <linux/platform_device.h>
>>> @@ -647,11 +648,26 @@ static inline int armv8pmu_enable_counter(int idx)
>>>      static inline void armv8pmu_enable_event_counter(struct perf_event
>>> *event)
>>>    {
>>> +    struct perf_event_attr *attr = &event->attr;
>>>        int idx = event->hw.idx;
>>> +    int flags = 0;
>>> +    u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
>>>    -    armv8pmu_enable_counter(idx);
>>>        if (armv8pmu_event_is_chained(event))
>>> -        armv8pmu_enable_counter(idx - 1);
>>> +        counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
>>> +
>>> +    if (!attr->exclude_host)
>>> +        flags |= KVM_PMU_EVENTS_HOST;
>>> +    if (!attr->exclude_guest)
>>> +        flags |= KVM_PMU_EVENTS_GUEST;
>>> +
>>> +    kvm_set_pmu_events(counter_bits, flags);
>>> +
>>> +    if (!attr->exclude_host) {
>>> +        armv8pmu_enable_counter(idx);
>>> +        if (armv8pmu_event_is_chained(event))
>>> +            armv8pmu_enable_counter(idx - 1);
>>> +    }
>>>    }
>>>      static inline int armv8pmu_disable_counter(int idx)
>>> @@ -664,11 +680,20 @@ static inline int armv8pmu_disable_counter(int idx)
>>>    static inline void armv8pmu_disable_event_counter(struct perf_event
>>> *event)
>>>    {
>>>        struct hw_perf_event *hwc = &event->hw;
>>> +    struct perf_event_attr *attr = &event->attr;
>>>        int idx = hwc->idx;
>>> +    u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
>>>          if (armv8pmu_event_is_chained(event))
>>> -        armv8pmu_disable_counter(idx - 1);
>>> -    armv8pmu_disable_counter(idx);
>>> +        counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
>>> +
>>> +    kvm_clr_pmu_events(counter_bits);
>>> +
>>> +    if (!attr->exclude_host) {
>>> +        if (armv8pmu_event_is_chained(event))
>>> +            armv8pmu_disable_counter(idx - 1);
>>> +        armv8pmu_disable_counter(idx);
>>> +    }
>>
>> Shouldn't we disable the events, irrespective of whether it is for host
>> or/and guest ? Otherwise looks good to me.
>>
> 
> I made the opposite remark on one of the early version. My reasoning is
> that, if we rely on the hypervisor to enable event that exclude the
> host, we should also rely on the hypervisor disabling these events.
> 
> At least in my mind it makes more sense this way.

Ah! ok. Makes sense. I was thinking about something similar in the way
we apply the filters (set/clr) in the KVM hook. It may make sense to add
in a comment.

Sorry for the noise.

Suzuki

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 5/5] arm64: KVM: Enable support for :G/:H perf event modifiers
  2018-12-12 10:29 ` [PATCH v8 5/5] arm64: KVM: Enable support for :G/:H perf event modifiers Andrew Murray
@ 2018-12-12 10:53   ` Suzuki K Poulose
  2018-12-12 14:46     ` Andrew Murray
  0 siblings, 1 reply; 32+ messages in thread
From: Suzuki K Poulose @ 2018-12-12 10:53 UTC (permalink / raw)
  To: Andrew Murray, Christoffer Dall, Marc Zyngier, Catalin Marinas,
	Will Deacon, Mark Rutland
  Cc: kvmarm, linux-arm-kernel, Julien Thierry



On 12/12/2018 10:29, Andrew Murray wrote:
> Enable/disable event counters as appropriate when entering and exiting
> the guest to enable support for guest or host only event counting.
> 
> For both VHE and non-VHE we switch the counters between host/guest at
> EL2. EL2 is filtered out by the PMU when we are using the :G modifier.
> 
> The PMU may be on when we change which counters are enabled however
> we avoid adding an isb as we instead rely on existing context
> synchronisation events: the isb in kvm_arm_vhe_guest_exit for VHE and
> the eret from the hvc in kvm_call_hyp.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> ---
>   arch/arm64/kvm/hyp/switch.c | 52 +++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 52 insertions(+)
> 
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index d496ef5..9732ef7 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -373,6 +373,46 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
>   	return true;
>   }
>   
> +static bool __hyp_text __pmu_switch_to_guest(struct kvm_cpu_context *host_ctxt)
> +{
> +	struct kvm_host_data *host;
> +	struct kvm_pmu_events *pmu;
> +	u32 clr, set;
> +
> +	host = container_of(host_ctxt, struct kvm_host_data, host_ctxt);
> +	pmu = &host->pmu_events;
> +
> +	clr = pmu->events_host & ~pmu->events_guest;
> +	set = pmu->events_guest & ~pmu->events_host;

It may make sense to add in a comment explaining why we only set:

	events_guest & ~events_host

Either way:

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2018-12-12 10:29 ` [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray
  2018-12-12 10:42   ` Suzuki K Poulose
@ 2018-12-12 10:55   ` Suzuki K Poulose
  2018-12-12 14:38     ` Andrew Murray
  2018-12-18 12:02   ` Christoffer Dall
  2 siblings, 1 reply; 32+ messages in thread
From: Suzuki K Poulose @ 2018-12-12 10:55 UTC (permalink / raw)
  To: Andrew Murray, Christoffer Dall, Marc Zyngier, Catalin Marinas,
	Will Deacon, Mark Rutland
  Cc: kvmarm, linux-arm-kernel, Julien Thierry



On 12/12/2018 10:29, Andrew Murray wrote:
> Add support for the :G and :H attributes in perf by handling the
> exclude_host/exclude_guest event attributes.
> 
> We notify KVM of counters that we wish to be enabled or disabled on
> guest entry/exit and thus defer from starting or stopping :G events
> as per the events exclude_host attribute.
> 
> With both VHE and non-VHE we switch the counters between host/guest
> at EL2. We are able to eliminate counters counting host events on
> the boundaries of guest entry/exit when using :G by filtering out
> EL2 for exclude_host. However when using :H unless exclude_hv is set
> on non-VHE then there is a small blackout window at the guest
> entry/exit where host events are not captured.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> ---
>   arch/arm64/kernel/perf_event.c | 51 ++++++++++++++++++++++++++++++++++++------
>   1 file changed, 44 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index de564ae..4a3c73d 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -26,6 +26,7 @@
>   
>   #include <linux/acpi.h>
>   #include <linux/clocksource.h>
> +#include <linux/kvm_host.h>
>   #include <linux/of.h>
>   #include <linux/perf/arm_pmu.h>
>   #include <linux/platform_device.h>
> @@ -647,11 +648,26 @@ static inline int armv8pmu_enable_counter(int idx)
>   
>   static inline void armv8pmu_enable_event_counter(struct perf_event *event)
>   {
> +	struct perf_event_attr *attr = &event->attr;
>   	int idx = event->hw.idx;
> +	int flags = 0;
> +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
>   
> -	armv8pmu_enable_counter(idx);
>   	if (armv8pmu_event_is_chained(event))
> -		armv8pmu_enable_counter(idx - 1);
> +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> +
> +	if (!attr->exclude_host)
> +		flags |= KVM_PMU_EVENTS_HOST;
> +	if (!attr->exclude_guest)
> +		flags |= KVM_PMU_EVENTS_GUEST;
> +
> +	kvm_set_pmu_events(counter_bits, flags);
> +
> +	if (!attr->exclude_host) {
> +		armv8pmu_enable_counter(idx);
> +		if (armv8pmu_event_is_chained(event))
> +			armv8pmu_enable_counter(idx - 1);
> +	}
>   }
>   
>   static inline int armv8pmu_disable_counter(int idx)
> @@ -664,11 +680,20 @@ static inline int armv8pmu_disable_counter(int idx)
>   static inline void armv8pmu_disable_event_counter(struct perf_event *event)
>   {
>   	struct hw_perf_event *hwc = &event->hw;
> +	struct perf_event_attr *attr = &event->attr;
>   	int idx = hwc->idx;
> +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
>   
>   	if (armv8pmu_event_is_chained(event))
> -		armv8pmu_disable_counter(idx - 1);
> -	armv8pmu_disable_counter(idx);
> +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> +
> +	kvm_clr_pmu_events(counter_bits);
> +
> +	if (!attr->exclude_host) {
> +		if (armv8pmu_event_is_chained(event))
> +			armv8pmu_disable_counter(idx - 1);
> +		armv8pmu_disable_counter(idx);
> +	}

It may be helpful to add in a comment why we do this only for !exclude_host.

Either way,

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 3/5] arm64: KVM: add accessors to track guest/host only counters
  2018-12-12 10:29 ` [PATCH v8 3/5] arm64: KVM: add accessors to track guest/host only counters Andrew Murray
@ 2018-12-12 10:56   ` Suzuki K Poulose
  0 siblings, 0 replies; 32+ messages in thread
From: Suzuki K Poulose @ 2018-12-12 10:56 UTC (permalink / raw)
  To: Andrew Murray, Christoffer Dall, Marc Zyngier, Catalin Marinas,
	Will Deacon, Mark Rutland
  Cc: kvmarm, linux-arm-kernel, Julien Thierry



On 12/12/2018 10:29, Andrew Murray wrote:
> In order to effeciently enable/disable guest/host only perf counters
> at guest entry/exit we add bitfields to kvm_cpu_context for guest and
> host events as well as accessors for updating them.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 2/5] arm64: KVM: encapsulate kvm_cpu_context in kvm_host_data
  2018-12-12 10:37   ` Suzuki K Poulose
@ 2018-12-12 12:31     ` Andrew Murray
  0 siblings, 0 replies; 32+ messages in thread
From: Andrew Murray @ 2018-12-12 12:31 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: Mark Rutland, Julien Thierry, Marc Zyngier, Catalin Marinas,
	Will Deacon, Christoffer Dall, kvmarm, linux-arm-kernel

On Wed, Dec 12, 2018 at 10:37:54AM +0000, Suzuki K Poulose wrote:
> Hi Andrew,
> 
> On 12/12/2018 10:29, Andrew Murray wrote:
> > The virt/arm core allocates a kvm_cpu_context_t percpu, at present this is
> > a typedef to kvm_cpu_context and is used to store host cpu context. The
> > kvm_cpu_context structure is also used elsewhere to hold vcpu context.
> > In order to use the percpu to hold additional future host information we
> > encapsulate kvm_cpu_context in a new structure and rename the typedef and
> > percpu to match.
> > 
> > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > ---
> >   arch/arm/include/asm/kvm_host.h   |  8 ++++++--
> >   arch/arm64/include/asm/kvm_asm.h  |  4 ++--
> >   arch/arm64/include/asm/kvm_host.h | 15 ++++++++++-----
> >   arch/arm64/kernel/asm-offsets.c   |  2 +-
> >   virt/kvm/arm/arm.c                | 10 ++++++----
> >   5 files changed, 25 insertions(+), 14 deletions(-)
> > 
> > diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> > index 79906ce..71645ba 100644
> > --- a/arch/arm/include/asm/kvm_host.h
> > +++ b/arch/arm/include/asm/kvm_host.h
> > @@ -145,7 +145,11 @@ struct kvm_cpu_context {
> >   	u32 cp15[NR_CP15_REGS];
> >   };
> > -typedef struct kvm_cpu_context kvm_cpu_context_t;
> > +struct kvm_host_data {
> > +	struct kvm_cpu_context host_ctxt;
> > +};
> > +
> > +typedef struct kvm_host_data kvm_host_data_t;
> >   struct kvm_vcpu_arch {
> >   	struct kvm_cpu_context ctxt;
> > @@ -163,7 +167,7 @@ struct kvm_vcpu_arch {
> >   	struct kvm_vcpu_fault_info fault;
> >   	/* Host FP context */
> > -	kvm_cpu_context_t *host_cpu_context;
> > +	struct kvm_cpu_context *host_cpu_context;
> >   	/* VGIC state */
> >   	struct vgic_cpu vgic_cpu;
> > diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> > index 102b5a5..6a9bfd4 100644
> > --- a/arch/arm64/include/asm/kvm_asm.h
> > +++ b/arch/arm64/include/asm/kvm_asm.h
> > @@ -102,12 +102,12 @@ extern u32 __init_stage2_translation(void);
> >   .endm
> >   .macro get_host_ctxt reg, tmp
> > -	hyp_adr_this_cpu \reg, kvm_host_cpu_state, \tmp
> > +	hyp_adr_this_cpu \reg, kvm_host_data, \tmp
> >   .endm
> >   .macro get_vcpu_ptr vcpu, ctxt
> >   	get_host_ctxt \ctxt, \vcpu
> > -	ldr	\vcpu, [\ctxt, #HOST_CONTEXT_VCPU]
> > +	ldr	\vcpu, [\ctxt, #HOST_DATA_VCPU]
> >   	kern_hyp_va	\vcpu
> >   .endm
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 1550192..1d3ca91 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -205,7 +205,12 @@ struct kvm_cpu_context {
> >   	struct kvm_vcpu *__hyp_running_vcpu;
> >   };
> > -typedef struct kvm_cpu_context kvm_cpu_context_t;
> > +struct kvm_host_data {
> > +	struct kvm_cpu_context host_ctxt;
> > +	struct kvm_pmu_events pmu_events;
> > +};
> > +
> 
> Unfortunately this would fail to build as we don't have the kvm_pmu_events
> defined in this patch and will break the git bisects.

Doh, this must have crept in when I was rebasing.

Andrew Murray

> 
> Otherwise looks good to me.
> 
> 
> Cheers
> Suzuki

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2018-12-12 10:55   ` Suzuki K Poulose
@ 2018-12-12 14:38     ` Andrew Murray
  0 siblings, 0 replies; 32+ messages in thread
From: Andrew Murray @ 2018-12-12 14:38 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: Mark Rutland, Julien Thierry, Marc Zyngier, Catalin Marinas,
	Will Deacon, Christoffer Dall, kvmarm, linux-arm-kernel

On Wed, Dec 12, 2018 at 10:55:33AM +0000, Suzuki K Poulose wrote:
> 
> 
> On 12/12/2018 10:29, Andrew Murray wrote:
> > Add support for the :G and :H attributes in perf by handling the
> > exclude_host/exclude_guest event attributes.
> > 
> > We notify KVM of counters that we wish to be enabled or disabled on
> > guest entry/exit and thus defer from starting or stopping :G events
> > as per the events exclude_host attribute.
> > 
> > With both VHE and non-VHE we switch the counters between host/guest
> > at EL2. We are able to eliminate counters counting host events on
> > the boundaries of guest entry/exit when using :G by filtering out
> > EL2 for exclude_host. However when using :H unless exclude_hv is set
> > on non-VHE then there is a small blackout window at the guest
> > entry/exit where host events are not captured.
> > 
> > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > ---
> >   arch/arm64/kernel/perf_event.c | 51 ++++++++++++++++++++++++++++++++++++------
> >   1 file changed, 44 insertions(+), 7 deletions(-)
> > 
> > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > index de564ae..4a3c73d 100644
> > --- a/arch/arm64/kernel/perf_event.c
> > +++ b/arch/arm64/kernel/perf_event.c
> > @@ -26,6 +26,7 @@
> >   #include <linux/acpi.h>
> >   #include <linux/clocksource.h>
> > +#include <linux/kvm_host.h>
> >   #include <linux/of.h>
> >   #include <linux/perf/arm_pmu.h>
> >   #include <linux/platform_device.h>
> > @@ -647,11 +648,26 @@ static inline int armv8pmu_enable_counter(int idx)
> >   static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> >   {
> > +	struct perf_event_attr *attr = &event->attr;
> >   	int idx = event->hw.idx;
> > +	int flags = 0;
> > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > -	armv8pmu_enable_counter(idx);
> >   	if (armv8pmu_event_is_chained(event))
> > -		armv8pmu_enable_counter(idx - 1);
> > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > +
> > +	if (!attr->exclude_host)
> > +		flags |= KVM_PMU_EVENTS_HOST;
> > +	if (!attr->exclude_guest)
> > +		flags |= KVM_PMU_EVENTS_GUEST;
> > +
> > +	kvm_set_pmu_events(counter_bits, flags);
> > +
> > +	if (!attr->exclude_host) {
> > +		armv8pmu_enable_counter(idx);
> > +		if (armv8pmu_event_is_chained(event))
> > +			armv8pmu_enable_counter(idx - 1);
> > +	}
> >   }
> >   static inline int armv8pmu_disable_counter(int idx)
> > @@ -664,11 +680,20 @@ static inline int armv8pmu_disable_counter(int idx)
> >   static inline void armv8pmu_disable_event_counter(struct perf_event *event)
> >   {
> >   	struct hw_perf_event *hwc = &event->hw;
> > +	struct perf_event_attr *attr = &event->attr;
> >   	int idx = hwc->idx;
> > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> >   	if (armv8pmu_event_is_chained(event))
> > -		armv8pmu_disable_counter(idx - 1);
> > -	armv8pmu_disable_counter(idx);
> > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > +
> > +	kvm_clr_pmu_events(counter_bits);
> > +
> > +	if (!attr->exclude_host) {
> > +		if (armv8pmu_event_is_chained(event))
> > +			armv8pmu_disable_counter(idx - 1);
> > +		armv8pmu_disable_counter(idx);
> > +	}
> 
> It may be helpful to add in a comment why we do this only for !exclude_host.
> 
> Either way,

No problem, I'll add the comments.

> 
> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 5/5] arm64: KVM: Enable support for :G/:H perf event modifiers
  2018-12-12 10:53   ` Suzuki K Poulose
@ 2018-12-12 14:46     ` Andrew Murray
  0 siblings, 0 replies; 32+ messages in thread
From: Andrew Murray @ 2018-12-12 14:46 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: Mark Rutland, Julien Thierry, Marc Zyngier, Catalin Marinas,
	Will Deacon, Christoffer Dall, kvmarm, linux-arm-kernel

On Wed, Dec 12, 2018 at 10:53:35AM +0000, Suzuki K Poulose wrote:
> 
> 
> On 12/12/2018 10:29, Andrew Murray wrote:
> > Enable/disable event counters as appropriate when entering and exiting
> > the guest to enable support for guest or host only event counting.
> > 
> > For both VHE and non-VHE we switch the counters between host/guest at
> > EL2. EL2 is filtered out by the PMU when we are using the :G modifier.
> > 
> > The PMU may be on when we change which counters are enabled however
> > we avoid adding an isb as we instead rely on existing context
> > synchronisation events: the isb in kvm_arm_vhe_guest_exit for VHE and
> > the eret from the hvc in kvm_call_hyp.
> > 
> > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > ---
> >   arch/arm64/kvm/hyp/switch.c | 52 +++++++++++++++++++++++++++++++++++++++++++++
> >   1 file changed, 52 insertions(+)
> > 
> > diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> > index d496ef5..9732ef7 100644
> > --- a/arch/arm64/kvm/hyp/switch.c
> > +++ b/arch/arm64/kvm/hyp/switch.c
> > @@ -373,6 +373,46 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
> >   	return true;
> >   }
> > +static bool __hyp_text __pmu_switch_to_guest(struct kvm_cpu_context *host_ctxt)
> > +{
> > +	struct kvm_host_data *host;
> > +	struct kvm_pmu_events *pmu;
> > +	u32 clr, set;
> > +
> > +	host = container_of(host_ctxt, struct kvm_host_data, host_ctxt);
> > +	pmu = &host->pmu_events;
> > +
> > +	clr = pmu->events_host & ~pmu->events_guest;
> > +	set = pmu->events_guest & ~pmu->events_host;
> 
> It may make sense to add in a comment explaining why we only set:
> 
> 	events_guest & ~events_host
> 

Yes I'll add a comment, especially as I've just spend 5 minutes trying to
remember why I did this.

Instead of assigning 'pmu->events_guest & ~pmu->events_host' to 'set' we
could have just assigned 'pmu->events_guest'. However consider the scenario
where an event is enabled for both host and guest - this would have resulted
in us always writing the (already set) bit to the system register. Therefore
with my approach we can potentially avoid any system register writes on a 
hypervisor switch when the only events enabled are those for both host and
guest.

Thanks,

Andrew Murray

> Either way:
> 
> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 2/5] arm64: KVM: encapsulate kvm_cpu_context in kvm_host_data
  2018-12-12 10:29 ` [PATCH v8 2/5] arm64: KVM: encapsulate kvm_cpu_context in kvm_host_data Andrew Murray
  2018-12-12 10:37   ` Suzuki K Poulose
@ 2018-12-18 11:40   ` Christoffer Dall
  1 sibling, 0 replies; 32+ messages in thread
From: Christoffer Dall @ 2018-12-18 11:40 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Mark Rutland, Julien Thierry, Marc Zyngier, Catalin Marinas,
	Suzuki K Poulose, Will Deacon, kvmarm, linux-arm-kernel

On Wed, Dec 12, 2018 at 10:29:30AM +0000, Andrew Murray wrote:
> The virt/arm core allocates a kvm_cpu_context_t percpu, at present this is
> a typedef to kvm_cpu_context and is used to store host cpu context. The
> kvm_cpu_context structure is also used elsewhere to hold vcpu context.
> In order to use the percpu to hold additional future host information we
> encapsulate kvm_cpu_context in a new structure and rename the typedef and
> percpu to match.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> ---
>  arch/arm/include/asm/kvm_host.h   |  8 ++++++--
>  arch/arm64/include/asm/kvm_asm.h  |  4 ++--
>  arch/arm64/include/asm/kvm_host.h | 15 ++++++++++-----
>  arch/arm64/kernel/asm-offsets.c   |  2 +-
>  virt/kvm/arm/arm.c                | 10 ++++++----
>  5 files changed, 25 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 79906ce..71645ba 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -145,7 +145,11 @@ struct kvm_cpu_context {
>  	u32 cp15[NR_CP15_REGS];
>  };
>  
> -typedef struct kvm_cpu_context kvm_cpu_context_t;
> +struct kvm_host_data {
> +	struct kvm_cpu_context host_ctxt;
> +};
> +
> +typedef struct kvm_host_data kvm_host_data_t;
>  
>  struct kvm_vcpu_arch {
>  	struct kvm_cpu_context ctxt;
> @@ -163,7 +167,7 @@ struct kvm_vcpu_arch {
>  	struct kvm_vcpu_fault_info fault;
>  
>  	/* Host FP context */
> -	kvm_cpu_context_t *host_cpu_context;
> +	struct kvm_cpu_context *host_cpu_context;
>  
>  	/* VGIC state */
>  	struct vgic_cpu vgic_cpu;
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index 102b5a5..6a9bfd4 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -102,12 +102,12 @@ extern u32 __init_stage2_translation(void);
>  .endm
>  
>  .macro get_host_ctxt reg, tmp
> -	hyp_adr_this_cpu \reg, kvm_host_cpu_state, \tmp
> +	hyp_adr_this_cpu \reg, kvm_host_data, \tmp
>  .endm
>  
>  .macro get_vcpu_ptr vcpu, ctxt
>  	get_host_ctxt \ctxt, \vcpu
> -	ldr	\vcpu, [\ctxt, #HOST_CONTEXT_VCPU]
> +	ldr	\vcpu, [\ctxt, #HOST_DATA_VCPU]
>  	kern_hyp_va	\vcpu
>  .endm
>  
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 1550192..1d3ca91 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -205,7 +205,12 @@ struct kvm_cpu_context {
>  	struct kvm_vcpu *__hyp_running_vcpu;
>  };
>  
> -typedef struct kvm_cpu_context kvm_cpu_context_t;
> +struct kvm_host_data {
> +	struct kvm_cpu_context host_ctxt;
> +	struct kvm_pmu_events pmu_events;
> +};
> +
> +typedef struct kvm_host_data kvm_host_data_t;
>  
>  struct kvm_vcpu_arch {
>  	struct kvm_cpu_context ctxt;
> @@ -241,7 +246,7 @@ struct kvm_vcpu_arch {
>  	struct kvm_guest_debug_arch external_debug_state;
>  
>  	/* Pointer to host CPU context */
> -	kvm_cpu_context_t *host_cpu_context;
> +	struct kvm_cpu_context *host_cpu_context;
>  
>  	struct thread_info *host_thread_info;	/* hyp VA */
>  	struct user_fpsimd_state *host_fpsimd_state;	/* hyp VA */
> @@ -387,7 +392,7 @@ void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome);
>  
>  struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
>  
> -DECLARE_PER_CPU(kvm_cpu_context_t, kvm_host_cpu_state);
> +DECLARE_PER_CPU(kvm_host_data_t, kvm_host_data);
>  
>  void __kvm_enable_ssbs(void);
>  
> @@ -400,8 +405,8 @@ static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
>  	 * kernel's mapping to the linear mapping, and store it in tpidr_el2
>  	 * so that we can use adr_l to access per-cpu variables in EL2.
>  	 */
> -	u64 tpidr_el2 = ((u64)this_cpu_ptr(&kvm_host_cpu_state) -
> -			 (u64)kvm_ksym_ref(kvm_host_cpu_state));
> +	u64 tpidr_el2 = ((u64)this_cpu_ptr(&kvm_host_data) -
> +			 (u64)kvm_ksym_ref(kvm_host_data));
>  
>  	/*
>  	 * Call initialization code, and switch to the full blown HYP code.
> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
> index 323aeb5..cb968ff 100644
> --- a/arch/arm64/kernel/asm-offsets.c
> +++ b/arch/arm64/kernel/asm-offsets.c
> @@ -142,7 +142,7 @@ int main(void)
>    DEFINE(CPU_FP_REGS,		offsetof(struct kvm_regs, fp_regs));
>    DEFINE(VCPU_FPEXC32_EL2,	offsetof(struct kvm_vcpu, arch.ctxt.sys_regs[FPEXC32_EL2]));
>    DEFINE(VCPU_HOST_CONTEXT,	offsetof(struct kvm_vcpu, arch.host_cpu_context));
> -  DEFINE(HOST_CONTEXT_VCPU,	offsetof(struct kvm_cpu_context, __hyp_running_vcpu));
> +  DEFINE(HOST_DATA_VCPU,	offsetof(struct kvm_host_data, host_ctxt.__hyp_running_vcpu));
>  #endif
>  #ifdef CONFIG_CPU_PM
>    DEFINE(CPU_SUSPEND_SZ,	sizeof(struct cpu_suspend_ctx));
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index 150c8a6..c031ddf 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -56,7 +56,7 @@
>  __asm__(".arch_extension	virt");
>  #endif
>  
> -DEFINE_PER_CPU(kvm_cpu_context_t, kvm_host_cpu_state);
> +DEFINE_PER_CPU(kvm_host_data_t, kvm_host_data);
>  static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
>  
>  /* Per-CPU variable containing the currently running vcpu. */
> @@ -361,8 +361,10 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
>  void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  {
>  	int *last_ran;
> +	kvm_host_data_t *cpu_ctxt;
>  
>  	last_ran = this_cpu_ptr(vcpu->kvm->arch.last_vcpu_ran);
> +	cpu_ctxt = this_cpu_ptr(&kvm_host_data);

nit: consider renaming to cpu_data
>  
>  	/*
>  	 * We might get preempted before the vCPU actually runs, but
> @@ -374,7 +376,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  	}
>  
>  	vcpu->cpu = cpu;
> -	vcpu->arch.host_cpu_context = this_cpu_ptr(&kvm_host_cpu_state);
> +	vcpu->arch.host_cpu_context = &cpu_ctxt->host_ctxt;
>  
>  	kvm_arm_set_running_vcpu(vcpu);
>  	kvm_vgic_load(vcpu);
> @@ -1547,9 +1549,9 @@ static int init_hyp_mode(void)
>  	}
>  
>  	for_each_possible_cpu(cpu) {
> -		kvm_cpu_context_t *cpu_ctxt;
> +		kvm_host_data_t *cpu_ctxt;
>  
> -		cpu_ctxt = per_cpu_ptr(&kvm_host_cpu_state, cpu);
> +		cpu_ctxt = per_cpu_ptr(&kvm_host_data, cpu);

same as above

>  		err = create_hyp_mappings(cpu_ctxt, cpu_ctxt + 1, PAGE_HYP);
>  
>  		if (err) {
> -- 
> 2.7.4
> 

Otherwise the approach looks good to me.


Thanks,

    Christoffer

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2018-12-12 10:29 ` [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray
  2018-12-12 10:42   ` Suzuki K Poulose
  2018-12-12 10:55   ` Suzuki K Poulose
@ 2018-12-18 12:02   ` Christoffer Dall
  2018-12-18 13:25     ` Andrew Murray
  2019-01-04 15:32     ` Will Deacon
  2 siblings, 2 replies; 32+ messages in thread
From: Christoffer Dall @ 2018-12-18 12:02 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Mark Rutland, Julien Thierry, Marc Zyngier, Catalin Marinas,
	joro, Suzuki K Poulose, Will Deacon, kvmarm, linux-arm-kernel

On Wed, Dec 12, 2018 at 10:29:32AM +0000, Andrew Murray wrote:
> Add support for the :G and :H attributes in perf by handling the
> exclude_host/exclude_guest event attributes.
> 
> We notify KVM of counters that we wish to be enabled or disabled on
> guest entry/exit and thus defer from starting or stopping :G events
> as per the events exclude_host attribute.
> 
> With both VHE and non-VHE we switch the counters between host/guest
> at EL2. We are able to eliminate counters counting host events on
> the boundaries of guest entry/exit when using :G by filtering out
> EL2 for exclude_host. However when using :H unless exclude_hv is set
> on non-VHE then there is a small blackout window at the guest
> entry/exit where host events are not captured.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> ---
>  arch/arm64/kernel/perf_event.c | 51 ++++++++++++++++++++++++++++++++++++------
>  1 file changed, 44 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index de564ae..4a3c73d 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -26,6 +26,7 @@
>  
>  #include <linux/acpi.h>
>  #include <linux/clocksource.h>
> +#include <linux/kvm_host.h>
>  #include <linux/of.h>
>  #include <linux/perf/arm_pmu.h>
>  #include <linux/platform_device.h>
> @@ -647,11 +648,26 @@ static inline int armv8pmu_enable_counter(int idx)
>  
>  static inline void armv8pmu_enable_event_counter(struct perf_event *event)
>  {
> +	struct perf_event_attr *attr = &event->attr;
>  	int idx = event->hw.idx;
> +	int flags = 0;
> +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
>  
> -	armv8pmu_enable_counter(idx);
>  	if (armv8pmu_event_is_chained(event))
> -		armv8pmu_enable_counter(idx - 1);
> +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> +
> +	if (!attr->exclude_host)
> +		flags |= KVM_PMU_EVENTS_HOST;
> +	if (!attr->exclude_guest)
> +		flags |= KVM_PMU_EVENTS_GUEST;
> +
> +	kvm_set_pmu_events(counter_bits, flags);
> +
> +	if (!attr->exclude_host) {
> +		armv8pmu_enable_counter(idx);
> +		if (armv8pmu_event_is_chained(event))
> +			armv8pmu_enable_counter(idx - 1);
> +	}
>  }
>  
>  static inline int armv8pmu_disable_counter(int idx)
> @@ -664,11 +680,20 @@ static inline int armv8pmu_disable_counter(int idx)
>  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
>  {
>  	struct hw_perf_event *hwc = &event->hw;
> +	struct perf_event_attr *attr = &event->attr;
>  	int idx = hwc->idx;
> +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
>  
>  	if (armv8pmu_event_is_chained(event))
> -		armv8pmu_disable_counter(idx - 1);
> -	armv8pmu_disable_counter(idx);
> +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> +
> +	kvm_clr_pmu_events(counter_bits);
> +
> +	if (!attr->exclude_host) {
> +		if (armv8pmu_event_is_chained(event))
> +			armv8pmu_disable_counter(idx - 1);
> +		armv8pmu_disable_counter(idx);
> +	}
>  }
>  
>  static inline int armv8pmu_enable_intens(int idx)
> @@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
>  	 * Therefore we ignore exclude_hv in this configuration, since
>  	 * there's no hypervisor to sample anyway. This is consistent
>  	 * with other architectures (x86 and Power).
> +	 *
> +	 * To eliminate counting host events on the boundaries of
> +	 * guest entry/exit we ensure EL2 is not included in hyp mode
> +	 * with !exclude_host.
>  	 */
>  	if (is_kernel_in_hyp_mode()) {
> -		if (!attr->exclude_kernel)
> +		if (!attr->exclude_kernel && !attr->exclude_host)
>  			config_base |= ARMV8_PMU_INCLUDE_EL2;
>  	} else {
> -		if (attr->exclude_kernel)
> -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
>  		if (!attr->exclude_hv)
>  			config_base |= ARMV8_PMU_INCLUDE_EL2;

I'm not sure about the current use of exclude_hv here.  The comment says
it's consistent with other architectures, but I can't find an example to
confirm this, and I don't think we have a comparable thing to the split
of the hypervisor between EL1 and EL2 we have on non-VHE.

Joerg told me the semantics were designed to be:

	exclude_hv: When running as a guest, stop counting events when
		    the HV runs.

	exclude_host: When Linux runs as a HV itself, only count events
	              while a guest is running.

	exclude_guest: When Linux runs as a HV, only count events when
	               running in host mode.

(But tools/perf/design.txt does not really confirm this).

On arm64 that would mean:

	exclude_hv: As a host, no effect.
		    As a guest, set the counter to include EL2 for a
		    hypervisor to emulate.

	exclude_host: As a guest, has no effect.
		      Don't count EL1 host or EL2, but count EL1 guest
		      by enabling EL1 counting at EL2 when entering a
		      guest, and disabling EL1 counting when returning
		      from a guest.

	exclude_guest: As a guest, has no effect.  As a host, disable
		       EL1 counting at EL2 when entering a guest.

Not sure if we break anything by changing the behavior on arm64 now, but
I really doubt that being able to exclude an arbitrary part (the one tha
happens to run in EL2 on non-VHE systems) is meaningful, and the fact
that behavior and semantics change depending on the version of the
underlying CPU is not great, if what you care about is understanding the
system's performance.

Thoughts?


Thanks,

    Christoffer

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2018-12-18 12:02   ` Christoffer Dall
@ 2018-12-18 13:25     ` Andrew Murray
  2018-12-18 14:38       ` Christoffer Dall
  2019-01-04 15:32     ` Will Deacon
  1 sibling, 1 reply; 32+ messages in thread
From: Andrew Murray @ 2018-12-18 13:25 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Mark Rutland, Julien Thierry, Marc Zyngier, Catalin Marinas,
	joro, Suzuki K Poulose, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Dec 18, 2018 at 01:02:26PM +0100, Christoffer Dall wrote:
> On Wed, Dec 12, 2018 at 10:29:32AM +0000, Andrew Murray wrote:
> > Add support for the :G and :H attributes in perf by handling the
> > exclude_host/exclude_guest event attributes.
> > 
> > We notify KVM of counters that we wish to be enabled or disabled on
> > guest entry/exit and thus defer from starting or stopping :G events
> > as per the events exclude_host attribute.
> > 
> > With both VHE and non-VHE we switch the counters between host/guest
> > at EL2. We are able to eliminate counters counting host events on
> > the boundaries of guest entry/exit when using :G by filtering out
> > EL2 for exclude_host. However when using :H unless exclude_hv is set
> > on non-VHE then there is a small blackout window at the guest
> > entry/exit where host events are not captured.
> > 
> > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > ---
> >  arch/arm64/kernel/perf_event.c | 51 ++++++++++++++++++++++++++++++++++++------
> >  1 file changed, 44 insertions(+), 7 deletions(-)
> > 
> > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > index de564ae..4a3c73d 100644
> > --- a/arch/arm64/kernel/perf_event.c
> > +++ b/arch/arm64/kernel/perf_event.c
> > @@ -26,6 +26,7 @@
> >  
> >  #include <linux/acpi.h>
> >  #include <linux/clocksource.h>
> > +#include <linux/kvm_host.h>
> >  #include <linux/of.h>
> >  #include <linux/perf/arm_pmu.h>
> >  #include <linux/platform_device.h>
> > @@ -647,11 +648,26 @@ static inline int armv8pmu_enable_counter(int idx)
> >  
> >  static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> >  {
> > +	struct perf_event_attr *attr = &event->attr;
> >  	int idx = event->hw.idx;
> > +	int flags = 0;
> > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> >  
> > -	armv8pmu_enable_counter(idx);
> >  	if (armv8pmu_event_is_chained(event))
> > -		armv8pmu_enable_counter(idx - 1);
> > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > +
> > +	if (!attr->exclude_host)
> > +		flags |= KVM_PMU_EVENTS_HOST;
> > +	if (!attr->exclude_guest)
> > +		flags |= KVM_PMU_EVENTS_GUEST;
> > +
> > +	kvm_set_pmu_events(counter_bits, flags);
> > +
> > +	if (!attr->exclude_host) {
> > +		armv8pmu_enable_counter(idx);
> > +		if (armv8pmu_event_is_chained(event))
> > +			armv8pmu_enable_counter(idx - 1);
> > +	}
> >  }
> >  
> >  static inline int armv8pmu_disable_counter(int idx)
> > @@ -664,11 +680,20 @@ static inline int armv8pmu_disable_counter(int idx)
> >  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
> >  {
> >  	struct hw_perf_event *hwc = &event->hw;
> > +	struct perf_event_attr *attr = &event->attr;
> >  	int idx = hwc->idx;
> > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> >  
> >  	if (armv8pmu_event_is_chained(event))
> > -		armv8pmu_disable_counter(idx - 1);
> > -	armv8pmu_disable_counter(idx);
> > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > +
> > +	kvm_clr_pmu_events(counter_bits);
> > +
> > +	if (!attr->exclude_host) {
> > +		if (armv8pmu_event_is_chained(event))
> > +			armv8pmu_disable_counter(idx - 1);
> > +		armv8pmu_disable_counter(idx);
> > +	}
> >  }
> >  
> >  static inline int armv8pmu_enable_intens(int idx)
> > @@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
> >  	 * Therefore we ignore exclude_hv in this configuration, since
> >  	 * there's no hypervisor to sample anyway. This is consistent
> >  	 * with other architectures (x86 and Power).
> > +	 *
> > +	 * To eliminate counting host events on the boundaries of
> > +	 * guest entry/exit we ensure EL2 is not included in hyp mode
> > +	 * with !exclude_host.
> >  	 */
> >  	if (is_kernel_in_hyp_mode()) {
> > -		if (!attr->exclude_kernel)
> > +		if (!attr->exclude_kernel && !attr->exclude_host)
> >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> >  	} else {
> > -		if (attr->exclude_kernel)
> > -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
> >  		if (!attr->exclude_hv)
> >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> 
> I'm not sure about the current use of exclude_hv here.  The comment says
> it's consistent with other architectures, but I can't find an example to
> confirm this, and I don't think we have a comparable thing to the split
> of the hypervisor between EL1 and EL2 we have on non-VHE.
> 
> Joerg told me the semantics were designed to be:
> 
> 	exclude_hv: When running as a guest, stop counting events when
> 		    the HV runs.

Can the definition of "guest" here refer to both type 1 and type 2
hypervisor guests? Or do we assume type 1 only?

> 
> 	exclude_host: When Linux runs as a HV itself, only count events
> 	              while a guest is running.
> 
> 	exclude_guest: When Linux runs as a HV, only count events when
> 	               running in host mode.
> 
> (But tools/perf/design.txt does not really confirm this).
> 
> On arm64 that would mean:
> 
> 	exclude_hv: As a host, no effect.
> 		    As a guest, set the counter to include EL2 for a
> 		    hypervisor to emulate.

If guest means either type 1 or type 2 guest, then we allow KVM guests
to understand their individual HV overhead. We can do this by counting
EL2 events whilst pinned to the KVM task.

Though more correctly we should count EL2 *and EL1* events whilst pinned
to the KVM task and whilst running outside of the guest. This then
covers both !VHE and VHE and allows for fair comparasion between !VHE
and VHE systems.

This then gives us the unique benefit of the type 2 host being able to
examine the hypervisor overhead of its individual guests.

The only issue here is that the type 2 host wouldn't be able to examine
the HV overhead of all its guests across the system as you wouldn't be
able to rely on the perf task pinning to distinguish between EL1 from
host and EL1 from guests in a !VHE system. I'm not sure the best way
to overcome this limitation.

> 
> 	exclude_host: As a guest, has no effect.
> 		      Don't count EL1 host or EL2, but count EL1 guest
> 		      by enabling EL1 counting at EL2 when entering a
> 		      guest, and disabling EL1 counting when returning
> 		      from a guest.
> 
> 	exclude_guest: As a guest, has no effect.  As a host, disable
> 		       EL1 counting at EL2 when entering a guest.
> 
> Not sure if we break anything by changing the behavior on arm64 now, but
> I really doubt that being able to exclude an arbitrary part (the one tha
> happens to run in EL2 on non-VHE systems) is meaningful, and the fact
> that behavior and semantics change depending on the version of the
> underlying CPU is not great, if what you care about is understanding the
> system's performance.

This is a bit strange. It's arbitary as it only represents a bit of the
HV overhead - this is solved though by counting the whole overhead (EL1
and EL2 instead (but only counting outside the guest and pinned to the
guest tasks).

> 
> Thoughts?
> 

Though if I've understood you correctly, you're suggesting that the only
time we count EL2 is when exclude_hv is not set on the immediate guest
of a type 1 hypervisor?

Thanks,

Andrew Murray

> 
> Thanks,
> 
>     Christoffer

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2018-12-18 13:25     ` Andrew Murray
@ 2018-12-18 14:38       ` Christoffer Dall
  2018-12-18 16:27         ` Andrew Murray
  0 siblings, 1 reply; 32+ messages in thread
From: Christoffer Dall @ 2018-12-18 14:38 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Mark Rutland, Julien Thierry, Marc Zyngier, Catalin Marinas,
	joro, Suzuki K Poulose, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Dec 18, 2018 at 01:25:32PM +0000, Andrew Murray wrote:
> On Tue, Dec 18, 2018 at 01:02:26PM +0100, Christoffer Dall wrote:
> > On Wed, Dec 12, 2018 at 10:29:32AM +0000, Andrew Murray wrote:
> > > Add support for the :G and :H attributes in perf by handling the
> > > exclude_host/exclude_guest event attributes.
> > > 
> > > We notify KVM of counters that we wish to be enabled or disabled on
> > > guest entry/exit and thus defer from starting or stopping :G events
> > > as per the events exclude_host attribute.
> > > 
> > > With both VHE and non-VHE we switch the counters between host/guest
> > > at EL2. We are able to eliminate counters counting host events on
> > > the boundaries of guest entry/exit when using :G by filtering out
> > > EL2 for exclude_host. However when using :H unless exclude_hv is set
> > > on non-VHE then there is a small blackout window at the guest
> > > entry/exit where host events are not captured.
> > > 
> > > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > > ---
> > >  arch/arm64/kernel/perf_event.c | 51 ++++++++++++++++++++++++++++++++++++------
> > >  1 file changed, 44 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > > index de564ae..4a3c73d 100644
> > > --- a/arch/arm64/kernel/perf_event.c
> > > +++ b/arch/arm64/kernel/perf_event.c
> > > @@ -26,6 +26,7 @@
> > >  
> > >  #include <linux/acpi.h>
> > >  #include <linux/clocksource.h>
> > > +#include <linux/kvm_host.h>
> > >  #include <linux/of.h>
> > >  #include <linux/perf/arm_pmu.h>
> > >  #include <linux/platform_device.h>
> > > @@ -647,11 +648,26 @@ static inline int armv8pmu_enable_counter(int idx)
> > >  
> > >  static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> > >  {
> > > +	struct perf_event_attr *attr = &event->attr;
> > >  	int idx = event->hw.idx;
> > > +	int flags = 0;
> > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > >  
> > > -	armv8pmu_enable_counter(idx);
> > >  	if (armv8pmu_event_is_chained(event))
> > > -		armv8pmu_enable_counter(idx - 1);
> > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > +
> > > +	if (!attr->exclude_host)
> > > +		flags |= KVM_PMU_EVENTS_HOST;
> > > +	if (!attr->exclude_guest)
> > > +		flags |= KVM_PMU_EVENTS_GUEST;
> > > +
> > > +	kvm_set_pmu_events(counter_bits, flags);
> > > +
> > > +	if (!attr->exclude_host) {
> > > +		armv8pmu_enable_counter(idx);
> > > +		if (armv8pmu_event_is_chained(event))
> > > +			armv8pmu_enable_counter(idx - 1);
> > > +	}
> > >  }
> > >  
> > >  static inline int armv8pmu_disable_counter(int idx)
> > > @@ -664,11 +680,20 @@ static inline int armv8pmu_disable_counter(int idx)
> > >  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
> > >  {
> > >  	struct hw_perf_event *hwc = &event->hw;
> > > +	struct perf_event_attr *attr = &event->attr;
> > >  	int idx = hwc->idx;
> > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > >  
> > >  	if (armv8pmu_event_is_chained(event))
> > > -		armv8pmu_disable_counter(idx - 1);
> > > -	armv8pmu_disable_counter(idx);
> > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > +
> > > +	kvm_clr_pmu_events(counter_bits);
> > > +
> > > +	if (!attr->exclude_host) {
> > > +		if (armv8pmu_event_is_chained(event))
> > > +			armv8pmu_disable_counter(idx - 1);
> > > +		armv8pmu_disable_counter(idx);
> > > +	}
> > >  }
> > >  
> > >  static inline int armv8pmu_enable_intens(int idx)
> > > @@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
> > >  	 * Therefore we ignore exclude_hv in this configuration, since
> > >  	 * there's no hypervisor to sample anyway. This is consistent
> > >  	 * with other architectures (x86 and Power).
> > > +	 *
> > > +	 * To eliminate counting host events on the boundaries of
> > > +	 * guest entry/exit we ensure EL2 is not included in hyp mode
> > > +	 * with !exclude_host.
> > >  	 */
> > >  	if (is_kernel_in_hyp_mode()) {
> > > -		if (!attr->exclude_kernel)
> > > +		if (!attr->exclude_kernel && !attr->exclude_host)
> > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > >  	} else {
> > > -		if (attr->exclude_kernel)
> > > -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
> > >  		if (!attr->exclude_hv)
> > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > 
> > I'm not sure about the current use of exclude_hv here.  The comment says
> > it's consistent with other architectures, but I can't find an example to
> > confirm this, and I don't think we have a comparable thing to the split
> > of the hypervisor between EL1 and EL2 we have on non-VHE.
> > 
> > Joerg told me the semantics were designed to be:
> > 
> > 	exclude_hv: When running as a guest, stop counting events when
> > 		    the HV runs.
> 
> Can the definition of "guest" here refer to both type 1 and type 2
> hypervisor guests? Or do we assume type 1 only?
> 

A guest is a guest.  Linux can run as a guest under a hypervisor with a
type 1 or type 2 design, doesn't matter for this conversation.

> > 
> > 	exclude_host: When Linux runs as a HV itself, only count events
> > 	              while a guest is running.
> > 
> > 	exclude_guest: When Linux runs as a HV, only count events when
> > 	               running in host mode.
> > 
> > (But tools/perf/design.txt does not really confirm this).
> > 
> > On arm64 that would mean:
> > 
> > 	exclude_hv: As a host, no effect.
> > 		    As a guest, set the counter to include EL2 for a
> > 		    hypervisor to emulate.

[...]

> 
> Though more correctly we should count EL2 *and EL1* events whilst pinned
> to the KVM task and whilst running outside of the guest.  This then
> covers both !VHE and VHE and allows for fair comparasion between !VHE
> and VHE systems.

Yes, if the guest has cleared exclude_hv and if we can properly detect
that from the hypervisor.

> 
> This then gives us the unique benefit of the type 2 host being able to
> examine the hypervisor overhead of its individual guests.

Not sure I understand this part.

> 
> The only issue here is that the type 2 host wouldn't be able to examine
> the HV overhead of all its guests across the system as you wouldn't be
> able to rely on the perf task pinning to distinguish between EL1 from
> host and EL1 from guests in a !VHE system. I'm not sure the best way
> to overcome this limitation.

Why can't you disable EL1 counting whilst running in the host, and
enable EL1 counting whilst running in the guest?

> > 
> > 	exclude_host: As a guest, has no effect.
> > 		      Don't count EL1 host or EL2, but count EL1 guest
> > 		      by enabling EL1 counting at EL2 when entering a
> > 		      guest, and disabling EL1 counting when returning
> > 		      from a guest.
> > 
> > 	exclude_guest: As a guest, has no effect.  As a host, disable
> > 		       EL1 counting at EL2 when entering a guest.
> > 
> > Not sure if we break anything by changing the behavior on arm64 now, but
> > I really doubt that being able to exclude an arbitrary part (the one tha
> > happens to run in EL2 on non-VHE systems) is meaningful, and the fact
> > that behavior and semantics change depending on the version of the
> > underlying CPU is not great, if what you care about is understanding the
> > system's performance.
> 
> This is a bit strange. It's arbitary as it only represents a bit of the
> HV overhead - this is solved though by counting the whole overhead (EL1
> and EL2 instead (but only counting outside the guest and pinned to the
> guest tasks).

Not sure I understand your point here?

> 
> > 
> > Thoughts?
> > 
> 
> Though if I've understood you correctly, you're suggesting that the only
> time we count EL2 is when exclude_hv is not set on the immediate guest
> of a type 1 hypervisor?
> 
No, I didn't say anything about a type 1 or type 2 hypervisor, and I
think that distinction is completely irrelevant to the discussion at
hand.  I also don't know what an immediate guest is -- is there any
other kind?

I don't think exclude_hv, exclude_host, and exclude_guest are directly
tied to a single CPU mode.  The only 'modes' you need to consider for
Linux are 'guest' and 'host' when Linux can run VMs and, 'self' and
'hypervisor' when Linux is a guest.

When Linux can run VMs, you count EL2 events when exclude_host is not
set.

(When Linux is a guest, and you set/clear exclude_hv, for this to work,
you need some way of informing your hypervisor that you want to know
about events happening in the hypervisor.  This could be a PV interface,
or maybe this can work by the guest setting/clearing the NSH bit in its
virtual PMU registers, which then amusingly can get translated into
actually counting in EL1/EL2 (non-VHE) or EL2 (VHE) by KVM's PMU
emulation code.  The method used is specific to the hypervisor used, but
not specific to whether the hypervisor is type-1 or type-2.)


Thanks,

    Christoffer

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2018-12-18 14:38       ` Christoffer Dall
@ 2018-12-18 16:27         ` Andrew Murray
  2018-12-18 18:51           ` Christoffer Dall
  0 siblings, 1 reply; 32+ messages in thread
From: Andrew Murray @ 2018-12-18 16:27 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Mark Rutland, Julien Thierry, Marc Zyngier, Catalin Marinas,
	joro, Suzuki K Poulose, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Dec 18, 2018 at 03:38:33PM +0100, Christoffer Dall wrote:
> On Tue, Dec 18, 2018 at 01:25:32PM +0000, Andrew Murray wrote:
> > On Tue, Dec 18, 2018 at 01:02:26PM +0100, Christoffer Dall wrote:
> > > On Wed, Dec 12, 2018 at 10:29:32AM +0000, Andrew Murray wrote:
> > > > Add support for the :G and :H attributes in perf by handling the
> > > > exclude_host/exclude_guest event attributes.
> > > > 
> > > > We notify KVM of counters that we wish to be enabled or disabled on
> > > > guest entry/exit and thus defer from starting or stopping :G events
> > > > as per the events exclude_host attribute.
> > > > 
> > > > With both VHE and non-VHE we switch the counters between host/guest
> > > > at EL2. We are able to eliminate counters counting host events on
> > > > the boundaries of guest entry/exit when using :G by filtering out
> > > > EL2 for exclude_host. However when using :H unless exclude_hv is set
> > > > on non-VHE then there is a small blackout window at the guest
> > > > entry/exit where host events are not captured.
> > > > 
> > > > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > > > ---
> > > >  arch/arm64/kernel/perf_event.c | 51 ++++++++++++++++++++++++++++++++++++------
> > > >  1 file changed, 44 insertions(+), 7 deletions(-)
> > > > 
> > > > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > > > index de564ae..4a3c73d 100644
> > > > --- a/arch/arm64/kernel/perf_event.c
> > > > +++ b/arch/arm64/kernel/perf_event.c
> > > > @@ -26,6 +26,7 @@
> > > >  
> > > >  #include <linux/acpi.h>
> > > >  #include <linux/clocksource.h>
> > > > +#include <linux/kvm_host.h>
> > > >  #include <linux/of.h>
> > > >  #include <linux/perf/arm_pmu.h>
> > > >  #include <linux/platform_device.h>
> > > > @@ -647,11 +648,26 @@ static inline int armv8pmu_enable_counter(int idx)
> > > >  
> > > >  static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> > > >  {
> > > > +	struct perf_event_attr *attr = &event->attr;
> > > >  	int idx = event->hw.idx;
> > > > +	int flags = 0;
> > > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > > >  
> > > > -	armv8pmu_enable_counter(idx);
> > > >  	if (armv8pmu_event_is_chained(event))
> > > > -		armv8pmu_enable_counter(idx - 1);
> > > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > > +
> > > > +	if (!attr->exclude_host)
> > > > +		flags |= KVM_PMU_EVENTS_HOST;
> > > > +	if (!attr->exclude_guest)
> > > > +		flags |= KVM_PMU_EVENTS_GUEST;
> > > > +
> > > > +	kvm_set_pmu_events(counter_bits, flags);
> > > > +
> > > > +	if (!attr->exclude_host) {
> > > > +		armv8pmu_enable_counter(idx);
> > > > +		if (armv8pmu_event_is_chained(event))
> > > > +			armv8pmu_enable_counter(idx - 1);
> > > > +	}
> > > >  }
> > > >  
> > > >  static inline int armv8pmu_disable_counter(int idx)
> > > > @@ -664,11 +680,20 @@ static inline int armv8pmu_disable_counter(int idx)
> > > >  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
> > > >  {
> > > >  	struct hw_perf_event *hwc = &event->hw;
> > > > +	struct perf_event_attr *attr = &event->attr;
> > > >  	int idx = hwc->idx;
> > > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > > >  
> > > >  	if (armv8pmu_event_is_chained(event))
> > > > -		armv8pmu_disable_counter(idx - 1);
> > > > -	armv8pmu_disable_counter(idx);
> > > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > > +
> > > > +	kvm_clr_pmu_events(counter_bits);
> > > > +
> > > > +	if (!attr->exclude_host) {
> > > > +		if (armv8pmu_event_is_chained(event))
> > > > +			armv8pmu_disable_counter(idx - 1);
> > > > +		armv8pmu_disable_counter(idx);
> > > > +	}
> > > >  }
> > > >  
> > > >  static inline int armv8pmu_enable_intens(int idx)
> > > > @@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
> > > >  	 * Therefore we ignore exclude_hv in this configuration, since
> > > >  	 * there's no hypervisor to sample anyway. This is consistent
> > > >  	 * with other architectures (x86 and Power).
> > > > +	 *
> > > > +	 * To eliminate counting host events on the boundaries of
> > > > +	 * guest entry/exit we ensure EL2 is not included in hyp mode
> > > > +	 * with !exclude_host.
> > > >  	 */
> > > >  	if (is_kernel_in_hyp_mode()) {
> > > > -		if (!attr->exclude_kernel)
> > > > +		if (!attr->exclude_kernel && !attr->exclude_host)
> > > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > > >  	} else {
> > > > -		if (attr->exclude_kernel)
> > > > -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
> > > >  		if (!attr->exclude_hv)
> > > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > > 
> > > I'm not sure about the current use of exclude_hv here.  The comment says
> > > it's consistent with other architectures, but I can't find an example to
> > > confirm this, and I don't think we have a comparable thing to the split
> > > of the hypervisor between EL1 and EL2 we have on non-VHE.
> > > 
> > > Joerg told me the semantics were designed to be:
> > > 
> > > 	exclude_hv: When running as a guest, stop counting events when
> > > 		    the HV runs.
> > 
> > Can the definition of "guest" here refer to both type 1 and type 2
> > hypervisor guests? Or do we assume type 1 only?
> > 
> 
> A guest is a guest.  Linux can run as a guest under a hypervisor with a
> type 1 or type 2 design, doesn't matter for this conversation.
> 
> > > 
> > > 	exclude_host: When Linux runs as a HV itself, only count events
> > > 	              while a guest is running.
> > > 
> > > 	exclude_guest: When Linux runs as a HV, only count events when
> > > 	               running in host mode.
> > > 
> > > (But tools/perf/design.txt does not really confirm this).
> > > 
> > > On arm64 that would mean:
> > > 
> > > 	exclude_hv: As a host, no effect.
> > > 		    As a guest, set the counter to include EL2 for a
> > > 		    hypervisor to emulate.
> 
> [...]
> 
> > 
> > Though more correctly we should count EL2 *and EL1* events whilst pinned
> > to the KVM task and whilst running outside of the guest.  This then
> > covers both !VHE and VHE and allows for fair comparasion between !VHE
> > and VHE systems.
> 
> Yes, if the guest has cleared exclude_hv and if we can properly detect
> that from the hypervisor.
> 
> > 
> > This then gives us the unique benefit of the type 2 host being able to
> > examine the hypervisor overhead of its individual guests.
> 
> Not sure I understand this part.

I got a bit confused here so ignore this. (I thought it would be useful to
measure from the KVM host perspective, the host-only time of the KVM guest,
which I incorrectly thought could also be the exclude_hv flag. Though I
forgot that 'perf stat -e instructions:H kvmtask' from the host should be
equivalent to 'perf stat -e instructions:h' from the guest.)

> 
> > 
> > The only issue here is that the type 2 host wouldn't be able to examine
> > the HV overhead of all its guests across the system as you wouldn't be
> > able to rely on the perf task pinning to distinguish between EL1 from
> > host and EL1 from guests in a !VHE system. I'm not sure the best way
> > to overcome this limitation.
> 
> Why can't you disable EL1 counting whilst running in the host, and
> enable EL1 counting whilst running in the guest?

You can. (I was assuming you could use exclude_hv from KVM host to measure
all the KVM guest overheads - but you would need to know which tasks are
KVM tasks such that you can consider their EL1 time - again ignore this).

> 
> > > 
> > > 	exclude_host: As a guest, has no effect.
> > > 		      Don't count EL1 host or EL2, but count EL1 guest
> > > 		      by enabling EL1 counting at EL2 when entering a
> > > 		      guest, and disabling EL1 counting when returning
> > > 		      from a guest.
> > > 
> > > 	exclude_guest: As a guest, has no effect.  As a host, disable
> > > 		       EL1 counting at EL2 when entering a guest.
> > > 
> > > Not sure if we break anything by changing the behavior on arm64 now, but
> > > I really doubt that being able to exclude an arbitrary part (the one tha
> > > happens to run in EL2 on non-VHE systems) is meaningful, and the fact
> > > that behavior and semantics change depending on the version of the
> > > underlying CPU is not great, if what you care about is understanding the
> > > system's performance.
> > 
> > This is a bit strange. It's arbitary as it only represents a bit of the
> > HV overhead - this is solved though by counting the whole overhead (EL1
> > and EL2 instead (but only counting outside the guest and pinned to the
> > guest tasks).
> 
> Not sure I understand your point here?

I was considering how we support exclude_hv from a KVM guest. If we count
only EL2 for HV then we are not measuring the true overhead, we also need
to include the EL1 time from the KVM host process when on !VHE.


> 
> > 
> > > 
> > > Thoughts?
> > > 
> > 
> > Though if I've understood you correctly, you're suggesting that the only
> > time we count EL2 is when exclude_hv is not set on the immediate guest
> > of a type 1 hypervisor?
> > 
> No, I didn't say anything about a type 1 or type 2 hypervisor, and I
> think that distinction is completely irrelevant to the discussion at
> hand.  I also don't know what an immediate guest is -- is there any
> other kind?

An immediate guest = the host's guest but not a guest-guest or further
(Apologies if I'm making this up as I go along).

> 
> I don't think exclude_hv, exclude_host, and exclude_guest are directly
> tied to a single CPU mode.  The only 'modes' you need to consider for
> Linux are 'guest' and 'host' when Linux can run VMs and, 'self' and
> 'hypervisor' when Linux is a guest.
> 
> When Linux can run VMs, you count EL2 events when exclude_host is not
> set.
> 
> (When Linux is a guest, and you set/clear exclude_hv, for this to work,
> you need some way of informing your hypervisor that you want to know
> about events happening in the hypervisor.  This could be a PV interface,
> or maybe this can work by the guest setting/clearing the NSH bit in its
> virtual PMU registers, which then amusingly can get translated into
> actually counting in EL1/EL2 (non-VHE) or EL2 (VHE) by KVM's PMU
> emulation code.  The method used is specific to the hypervisor used, but
> not specific to whether the hypervisor is type-1 or type-2.)

This is what I had in mind. I think we're aligned despite nomenclature
confusing me.

I think we need to:

 - Add support in KVM guests for exclude_hv.

 - Prevent !exclude_hv from returning a count when is is not a guest
   on a !VHE kernel.

I'm not really clear how we implement the second point. We want to count
EL2 (because that would represent a HV) but only when it is a guest. How
does it know it's a guest? (Is it a guest when it's not a KVM host?).
I'm struggling to see how we update the logic in armv8pmu_set_event_filter
(whilst keeping in mind that we want it to write ARMV8_PMU_INCLUDE_EL2
such that when it's a guest the KVM PMU emulation code can use that to
know to do something different).

Thanks,

Andrew Murray

> 
> 
> Thanks,
> 
>     Christoffer

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2018-12-18 16:27         ` Andrew Murray
@ 2018-12-18 18:51           ` Christoffer Dall
  2018-12-18 19:19             ` Andrew Jones
  0 siblings, 1 reply; 32+ messages in thread
From: Christoffer Dall @ 2018-12-18 18:51 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Mark Rutland, Julien Thierry, Marc Zyngier, Catalin Marinas,
	joro, Suzuki K Poulose, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Dec 18, 2018 at 04:27:05PM +0000, Andrew Murray wrote:
> On Tue, Dec 18, 2018 at 03:38:33PM +0100, Christoffer Dall wrote:
> > On Tue, Dec 18, 2018 at 01:25:32PM +0000, Andrew Murray wrote:
> > > On Tue, Dec 18, 2018 at 01:02:26PM +0100, Christoffer Dall wrote:
> > > > On Wed, Dec 12, 2018 at 10:29:32AM +0000, Andrew Murray wrote:
> > > > > Add support for the :G and :H attributes in perf by handling the
> > > > > exclude_host/exclude_guest event attributes.
> > > > > 
> > > > > We notify KVM of counters that we wish to be enabled or disabled on
> > > > > guest entry/exit and thus defer from starting or stopping :G events
> > > > > as per the events exclude_host attribute.
> > > > > 
> > > > > With both VHE and non-VHE we switch the counters between host/guest
> > > > > at EL2. We are able to eliminate counters counting host events on
> > > > > the boundaries of guest entry/exit when using :G by filtering out
> > > > > EL2 for exclude_host. However when using :H unless exclude_hv is set
> > > > > on non-VHE then there is a small blackout window at the guest
> > > > > entry/exit where host events are not captured.
> > > > > 
> > > > > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > > > > ---
> > > > >  arch/arm64/kernel/perf_event.c | 51 ++++++++++++++++++++++++++++++++++++------
> > > > >  1 file changed, 44 insertions(+), 7 deletions(-)
> > > > > 
> > > > > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > > > > index de564ae..4a3c73d 100644
> > > > > --- a/arch/arm64/kernel/perf_event.c
> > > > > +++ b/arch/arm64/kernel/perf_event.c
> > > > > @@ -26,6 +26,7 @@
> > > > >  
> > > > >  #include <linux/acpi.h>
> > > > >  #include <linux/clocksource.h>
> > > > > +#include <linux/kvm_host.h>
> > > > >  #include <linux/of.h>
> > > > >  #include <linux/perf/arm_pmu.h>
> > > > >  #include <linux/platform_device.h>
> > > > > @@ -647,11 +648,26 @@ static inline int armv8pmu_enable_counter(int idx)
> > > > >  
> > > > >  static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> > > > >  {
> > > > > +	struct perf_event_attr *attr = &event->attr;
> > > > >  	int idx = event->hw.idx;
> > > > > +	int flags = 0;
> > > > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > > > >  
> > > > > -	armv8pmu_enable_counter(idx);
> > > > >  	if (armv8pmu_event_is_chained(event))
> > > > > -		armv8pmu_enable_counter(idx - 1);
> > > > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > > > +
> > > > > +	if (!attr->exclude_host)
> > > > > +		flags |= KVM_PMU_EVENTS_HOST;
> > > > > +	if (!attr->exclude_guest)
> > > > > +		flags |= KVM_PMU_EVENTS_GUEST;
> > > > > +
> > > > > +	kvm_set_pmu_events(counter_bits, flags);
> > > > > +
> > > > > +	if (!attr->exclude_host) {
> > > > > +		armv8pmu_enable_counter(idx);
> > > > > +		if (armv8pmu_event_is_chained(event))
> > > > > +			armv8pmu_enable_counter(idx - 1);
> > > > > +	}
> > > > >  }
> > > > >  
> > > > >  static inline int armv8pmu_disable_counter(int idx)
> > > > > @@ -664,11 +680,20 @@ static inline int armv8pmu_disable_counter(int idx)
> > > > >  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
> > > > >  {
> > > > >  	struct hw_perf_event *hwc = &event->hw;
> > > > > +	struct perf_event_attr *attr = &event->attr;
> > > > >  	int idx = hwc->idx;
> > > > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > > > >  
> > > > >  	if (armv8pmu_event_is_chained(event))
> > > > > -		armv8pmu_disable_counter(idx - 1);
> > > > > -	armv8pmu_disable_counter(idx);
> > > > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > > > +
> > > > > +	kvm_clr_pmu_events(counter_bits);
> > > > > +
> > > > > +	if (!attr->exclude_host) {
> > > > > +		if (armv8pmu_event_is_chained(event))
> > > > > +			armv8pmu_disable_counter(idx - 1);
> > > > > +		armv8pmu_disable_counter(idx);
> > > > > +	}
> > > > >  }
> > > > >  
> > > > >  static inline int armv8pmu_enable_intens(int idx)
> > > > > @@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
> > > > >  	 * Therefore we ignore exclude_hv in this configuration, since
> > > > >  	 * there's no hypervisor to sample anyway. This is consistent
> > > > >  	 * with other architectures (x86 and Power).
> > > > > +	 *
> > > > > +	 * To eliminate counting host events on the boundaries of
> > > > > +	 * guest entry/exit we ensure EL2 is not included in hyp mode
> > > > > +	 * with !exclude_host.
> > > > >  	 */
> > > > >  	if (is_kernel_in_hyp_mode()) {
> > > > > -		if (!attr->exclude_kernel)
> > > > > +		if (!attr->exclude_kernel && !attr->exclude_host)
> > > > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > > > >  	} else {
> > > > > -		if (attr->exclude_kernel)
> > > > > -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
> > > > >  		if (!attr->exclude_hv)
> > > > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > > > 
> > > > I'm not sure about the current use of exclude_hv here.  The comment says
> > > > it's consistent with other architectures, but I can't find an example to
> > > > confirm this, and I don't think we have a comparable thing to the split
> > > > of the hypervisor between EL1 and EL2 we have on non-VHE.
> > > > 
> > > > Joerg told me the semantics were designed to be:
> > > > 
> > > > 	exclude_hv: When running as a guest, stop counting events when
> > > > 		    the HV runs.
> > > 
> > > Can the definition of "guest" here refer to both type 1 and type 2
> > > hypervisor guests? Or do we assume type 1 only?
> > > 
> > 
> > A guest is a guest.  Linux can run as a guest under a hypervisor with a
> > type 1 or type 2 design, doesn't matter for this conversation.
> > 
> > > > 
> > > > 	exclude_host: When Linux runs as a HV itself, only count events
> > > > 	              while a guest is running.
> > > > 
> > > > 	exclude_guest: When Linux runs as a HV, only count events when
> > > > 	               running in host mode.
> > > > 
> > > > (But tools/perf/design.txt does not really confirm this).
> > > > 
> > > > On arm64 that would mean:
> > > > 
> > > > 	exclude_hv: As a host, no effect.
> > > > 		    As a guest, set the counter to include EL2 for a
> > > > 		    hypervisor to emulate.
> > 
> > [...]
> > 
> > > 
> > > Though more correctly we should count EL2 *and EL1* events whilst pinned
> > > to the KVM task and whilst running outside of the guest.  This then
> > > covers both !VHE and VHE and allows for fair comparasion between !VHE
> > > and VHE systems.
> > 
> > Yes, if the guest has cleared exclude_hv and if we can properly detect
> > that from the hypervisor.
> > 
> > > 
> > > This then gives us the unique benefit of the type 2 host being able to
> > > examine the hypervisor overhead of its individual guests.
> > 
> > Not sure I understand this part.
> 
> I got a bit confused here so ignore this. (I thought it would be useful to
> measure from the KVM host perspective, the host-only time of the KVM guest,
> which I incorrectly thought could also be the exclude_hv flag. Though I
> forgot that 'perf stat -e instructions:H kvmtask' from the host should be
> equivalent to 'perf stat -e instructions:h' from the guest.)
> 
> > 
> > > 
> > > The only issue here is that the type 2 host wouldn't be able to examine
> > > the HV overhead of all its guests across the system as you wouldn't be
> > > able to rely on the perf task pinning to distinguish between EL1 from
> > > host and EL1 from guests in a !VHE system. I'm not sure the best way
> > > to overcome this limitation.
> > 
> > Why can't you disable EL1 counting whilst running in the host, and
> > enable EL1 counting whilst running in the guest?
> 
> You can. (I was assuming you could use exclude_hv from KVM host to measure
> all the KVM guest overheads - but you would need to know which tasks are
> KVM tasks such that you can consider their EL1 time - again ignore this).
> 
> > 
> > > > 
> > > > 	exclude_host: As a guest, has no effect.
> > > > 		      Don't count EL1 host or EL2, but count EL1 guest
> > > > 		      by enabling EL1 counting at EL2 when entering a
> > > > 		      guest, and disabling EL1 counting when returning
> > > > 		      from a guest.
> > > > 
> > > > 	exclude_guest: As a guest, has no effect.  As a host, disable
> > > > 		       EL1 counting at EL2 when entering a guest.
> > > > 
> > > > Not sure if we break anything by changing the behavior on arm64 now, but
> > > > I really doubt that being able to exclude an arbitrary part (the one tha
> > > > happens to run in EL2 on non-VHE systems) is meaningful, and the fact
> > > > that behavior and semantics change depending on the version of the
> > > > underlying CPU is not great, if what you care about is understanding the
> > > > system's performance.
> > > 
> > > This is a bit strange. It's arbitary as it only represents a bit of the
> > > HV overhead - this is solved though by counting the whole overhead (EL1
> > > and EL2 instead (but only counting outside the guest and pinned to the
> > > guest tasks).
> > 
> > Not sure I understand your point here?
> 
> I was considering how we support exclude_hv from a KVM guest. If we count
> only EL2 for HV then we are not measuring the true overhead, we also need
> to include the EL1 time from the KVM host process when on !VHE.
> 
> 
> > 
> > > 
> > > > 
> > > > Thoughts?
> > > > 
> > > 
> > > Though if I've understood you correctly, you're suggesting that the only
> > > time we count EL2 is when exclude_hv is not set on the immediate guest
> > > of a type 1 hypervisor?
> > > 
> > No, I didn't say anything about a type 1 or type 2 hypervisor, and I
> > think that distinction is completely irrelevant to the discussion at
> > hand.  I also don't know what an immediate guest is -- is there any
> > other kind?
> 
> An immediate guest = the host's guest but not a guest-guest or further
> (Apologies if I'm making this up as I go along).
> 

I strongly recommend thinking about this in terms of a single
hypervisor, which can run a single layer of guests (no nested
virtualization) first.  Once we've figured out how to do that, we can
try to think about nested virt, and factor in PMU emulation among the
great many things that will break with nested virtualization.

> > 
> > I don't think exclude_hv, exclude_host, and exclude_guest are directly
> > tied to a single CPU mode.  The only 'modes' you need to consider for
> > Linux are 'guest' and 'host' when Linux can run VMs and, 'self' and
> > 'hypervisor' when Linux is a guest.
> > 
> > When Linux can run VMs, you count EL2 events when exclude_host is not
> > set.
> > 
> > (When Linux is a guest, and you set/clear exclude_hv, for this to work,
> > you need some way of informing your hypervisor that you want to know
> > about events happening in the hypervisor.  This could be a PV interface,
> > or maybe this can work by the guest setting/clearing the NSH bit in its
> > virtual PMU registers, which then amusingly can get translated into
> > actually counting in EL1/EL2 (non-VHE) or EL2 (VHE) by KVM's PMU
> > emulation code.  The method used is specific to the hypervisor used, but
> > not specific to whether the hypervisor is type-1 or type-2.)
> 
> This is what I had in mind. I think we're aligned despite nomenclature
> confusing me.
> 
> I think we need to:
> 
>  - Add support in KVM guests for exclude_hv.
> 
>  - Prevent !exclude_hv from returning a count when is is not a guest
>    on a !VHE kernel.
> 
> I'm not really clear how we implement the second point. We want to count
> EL2 (because that would represent a HV) but only when it is a guest. How
> does it know it's a guest? (Is it a guest when it's not a KVM host?).
> I'm struggling to see how we update the logic in armv8pmu_set_event_filter
> (whilst keeping in mind that we want it to write ARMV8_PMU_INCLUDE_EL2
> such that when it's a guest the KVM PMU emulation code can use that to
> know to do something different).
> 

If (!hyp_mode_is_available()) you are either running bare metal or as a
guest.  It should be benign to program the PMU to count EL2 in both
cases, as it should have no effect in the first case.  Alternatively we
could make it so that it only makes a difference if the DT/ACPI
describes the fact that you're a KVM guest.  I don't know the
implementation details of that.


Thanks,

    Christoffer

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2018-12-18 18:51           ` Christoffer Dall
@ 2018-12-18 19:19             ` Andrew Jones
  0 siblings, 0 replies; 32+ messages in thread
From: Andrew Jones @ 2018-12-18 19:19 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Marc Zyngier, Catalin Marinas, joro, Will Deacon, Andrew Murray,
	kvmarm, linux-arm-kernel

On Tue, Dec 18, 2018 at 07:51:21PM +0100, Christoffer Dall wrote:
> On Tue, Dec 18, 2018 at 04:27:05PM +0000, Andrew Murray wrote:
> > On Tue, Dec 18, 2018 at 03:38:33PM +0100, Christoffer Dall wrote:
> > > On Tue, Dec 18, 2018 at 01:25:32PM +0000, Andrew Murray wrote:
> > > > On Tue, Dec 18, 2018 at 01:02:26PM +0100, Christoffer Dall wrote:
> > > > > On Wed, Dec 12, 2018 at 10:29:32AM +0000, Andrew Murray wrote:
> > > > > > Add support for the :G and :H attributes in perf by handling the
> > > > > > exclude_host/exclude_guest event attributes.
> > > > > > 
> > > > > > We notify KVM of counters that we wish to be enabled or disabled on
> > > > > > guest entry/exit and thus defer from starting or stopping :G events
> > > > > > as per the events exclude_host attribute.
> > > > > > 
> > > > > > With both VHE and non-VHE we switch the counters between host/guest
> > > > > > at EL2. We are able to eliminate counters counting host events on
> > > > > > the boundaries of guest entry/exit when using :G by filtering out
> > > > > > EL2 for exclude_host. However when using :H unless exclude_hv is set
> > > > > > on non-VHE then there is a small blackout window at the guest
> > > > > > entry/exit where host events are not captured.
> > > > > > 
> > > > > > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > > > > > ---
> > > > > >  arch/arm64/kernel/perf_event.c | 51 ++++++++++++++++++++++++++++++++++++------
> > > > > >  1 file changed, 44 insertions(+), 7 deletions(-)
> > > > > > 
> > > > > > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > > > > > index de564ae..4a3c73d 100644
> > > > > > --- a/arch/arm64/kernel/perf_event.c
> > > > > > +++ b/arch/arm64/kernel/perf_event.c
> > > > > > @@ -26,6 +26,7 @@
> > > > > >  
> > > > > >  #include <linux/acpi.h>
> > > > > >  #include <linux/clocksource.h>
> > > > > > +#include <linux/kvm_host.h>
> > > > > >  #include <linux/of.h>
> > > > > >  #include <linux/perf/arm_pmu.h>
> > > > > >  #include <linux/platform_device.h>
> > > > > > @@ -647,11 +648,26 @@ static inline int armv8pmu_enable_counter(int idx)
> > > > > >  
> > > > > >  static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> > > > > >  {
> > > > > > +	struct perf_event_attr *attr = &event->attr;
> > > > > >  	int idx = event->hw.idx;
> > > > > > +	int flags = 0;
> > > > > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > > > > >  
> > > > > > -	armv8pmu_enable_counter(idx);
> > > > > >  	if (armv8pmu_event_is_chained(event))
> > > > > > -		armv8pmu_enable_counter(idx - 1);
> > > > > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > > > > +
> > > > > > +	if (!attr->exclude_host)
> > > > > > +		flags |= KVM_PMU_EVENTS_HOST;
> > > > > > +	if (!attr->exclude_guest)
> > > > > > +		flags |= KVM_PMU_EVENTS_GUEST;
> > > > > > +
> > > > > > +	kvm_set_pmu_events(counter_bits, flags);
> > > > > > +
> > > > > > +	if (!attr->exclude_host) {
> > > > > > +		armv8pmu_enable_counter(idx);
> > > > > > +		if (armv8pmu_event_is_chained(event))
> > > > > > +			armv8pmu_enable_counter(idx - 1);
> > > > > > +	}
> > > > > >  }
> > > > > >  
> > > > > >  static inline int armv8pmu_disable_counter(int idx)
> > > > > > @@ -664,11 +680,20 @@ static inline int armv8pmu_disable_counter(int idx)
> > > > > >  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
> > > > > >  {
> > > > > >  	struct hw_perf_event *hwc = &event->hw;
> > > > > > +	struct perf_event_attr *attr = &event->attr;
> > > > > >  	int idx = hwc->idx;
> > > > > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > > > > >  
> > > > > >  	if (armv8pmu_event_is_chained(event))
> > > > > > -		armv8pmu_disable_counter(idx - 1);
> > > > > > -	armv8pmu_disable_counter(idx);
> > > > > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > > > > +
> > > > > > +	kvm_clr_pmu_events(counter_bits);
> > > > > > +
> > > > > > +	if (!attr->exclude_host) {
> > > > > > +		if (armv8pmu_event_is_chained(event))
> > > > > > +			armv8pmu_disable_counter(idx - 1);
> > > > > > +		armv8pmu_disable_counter(idx);
> > > > > > +	}
> > > > > >  }
> > > > > >  
> > > > > >  static inline int armv8pmu_enable_intens(int idx)
> > > > > > @@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
> > > > > >  	 * Therefore we ignore exclude_hv in this configuration, since
> > > > > >  	 * there's no hypervisor to sample anyway. This is consistent
> > > > > >  	 * with other architectures (x86 and Power).
> > > > > > +	 *
> > > > > > +	 * To eliminate counting host events on the boundaries of
> > > > > > +	 * guest entry/exit we ensure EL2 is not included in hyp mode
> > > > > > +	 * with !exclude_host.
> > > > > >  	 */
> > > > > >  	if (is_kernel_in_hyp_mode()) {
> > > > > > -		if (!attr->exclude_kernel)
> > > > > > +		if (!attr->exclude_kernel && !attr->exclude_host)
> > > > > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > > > > >  	} else {
> > > > > > -		if (attr->exclude_kernel)
> > > > > > -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
> > > > > >  		if (!attr->exclude_hv)
> > > > > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > > > > 
> > > > > I'm not sure about the current use of exclude_hv here.  The comment says
> > > > > it's consistent with other architectures, but I can't find an example to
> > > > > confirm this, and I don't think we have a comparable thing to the split
> > > > > of the hypervisor between EL1 and EL2 we have on non-VHE.
> > > > > 
> > > > > Joerg told me the semantics were designed to be:
> > > > > 
> > > > > 	exclude_hv: When running as a guest, stop counting events when
> > > > > 		    the HV runs.
> > > > 
> > > > Can the definition of "guest" here refer to both type 1 and type 2
> > > > hypervisor guests? Or do we assume type 1 only?
> > > > 
> > > 
> > > A guest is a guest.  Linux can run as a guest under a hypervisor with a
> > > type 1 or type 2 design, doesn't matter for this conversation.
> > > 
> > > > > 
> > > > > 	exclude_host: When Linux runs as a HV itself, only count events
> > > > > 	              while a guest is running.
> > > > > 
> > > > > 	exclude_guest: When Linux runs as a HV, only count events when
> > > > > 	               running in host mode.
> > > > > 
> > > > > (But tools/perf/design.txt does not really confirm this).
> > > > > 
> > > > > On arm64 that would mean:
> > > > > 
> > > > > 	exclude_hv: As a host, no effect.
> > > > > 		    As a guest, set the counter to include EL2 for a
> > > > > 		    hypervisor to emulate.
> > > 
> > > [...]
> > > 
> > > > 
> > > > Though more correctly we should count EL2 *and EL1* events whilst pinned
> > > > to the KVM task and whilst running outside of the guest.  This then
> > > > covers both !VHE and VHE and allows for fair comparasion between !VHE
> > > > and VHE systems.
> > > 
> > > Yes, if the guest has cleared exclude_hv and if we can properly detect
> > > that from the hypervisor.
> > > 
> > > > 
> > > > This then gives us the unique benefit of the type 2 host being able to
> > > > examine the hypervisor overhead of its individual guests.
> > > 
> > > Not sure I understand this part.
> > 
> > I got a bit confused here so ignore this. (I thought it would be useful to
> > measure from the KVM host perspective, the host-only time of the KVM guest,
> > which I incorrectly thought could also be the exclude_hv flag. Though I
> > forgot that 'perf stat -e instructions:H kvmtask' from the host should be
> > equivalent to 'perf stat -e instructions:h' from the guest.)
> > 
> > > 
> > > > 
> > > > The only issue here is that the type 2 host wouldn't be able to examine
> > > > the HV overhead of all its guests across the system as you wouldn't be
> > > > able to rely on the perf task pinning to distinguish between EL1 from
> > > > host and EL1 from guests in a !VHE system. I'm not sure the best way
> > > > to overcome this limitation.
> > > 
> > > Why can't you disable EL1 counting whilst running in the host, and
> > > enable EL1 counting whilst running in the guest?
> > 
> > You can. (I was assuming you could use exclude_hv from KVM host to measure
> > all the KVM guest overheads - but you would need to know which tasks are
> > KVM tasks such that you can consider their EL1 time - again ignore this).
> > 
> > > 
> > > > > 
> > > > > 	exclude_host: As a guest, has no effect.
> > > > > 		      Don't count EL1 host or EL2, but count EL1 guest
> > > > > 		      by enabling EL1 counting at EL2 when entering a
> > > > > 		      guest, and disabling EL1 counting when returning
> > > > > 		      from a guest.
> > > > > 
> > > > > 	exclude_guest: As a guest, has no effect.  As a host, disable
> > > > > 		       EL1 counting at EL2 when entering a guest.
> > > > > 
> > > > > Not sure if we break anything by changing the behavior on arm64 now, but
> > > > > I really doubt that being able to exclude an arbitrary part (the one tha
> > > > > happens to run in EL2 on non-VHE systems) is meaningful, and the fact
> > > > > that behavior and semantics change depending on the version of the
> > > > > underlying CPU is not great, if what you care about is understanding the
> > > > > system's performance.
> > > > 
> > > > This is a bit strange. It's arbitary as it only represents a bit of the
> > > > HV overhead - this is solved though by counting the whole overhead (EL1
> > > > and EL2 instead (but only counting outside the guest and pinned to the
> > > > guest tasks).
> > > 
> > > Not sure I understand your point here?
> > 
> > I was considering how we support exclude_hv from a KVM guest. If we count
> > only EL2 for HV then we are not measuring the true overhead, we also need
> > to include the EL1 time from the KVM host process when on !VHE.
> > 
> > 
> > > 
> > > > 
> > > > > 
> > > > > Thoughts?
> > > > > 
> > > > 
> > > > Though if I've understood you correctly, you're suggesting that the only
> > > > time we count EL2 is when exclude_hv is not set on the immediate guest
> > > > of a type 1 hypervisor?
> > > > 
> > > No, I didn't say anything about a type 1 or type 2 hypervisor, and I
> > > think that distinction is completely irrelevant to the discussion at
> > > hand.  I also don't know what an immediate guest is -- is there any
> > > other kind?
> > 
> > An immediate guest = the host's guest but not a guest-guest or further
> > (Apologies if I'm making this up as I go along).
> > 
> 
> I strongly recommend thinking about this in terms of a single
> hypervisor, which can run a single layer of guests (no nested
> virtualization) first.  Once we've figured out how to do that, we can
> try to think about nested virt, and factor in PMU emulation among the
> great many things that will break with nested virtualization.
> 
> > > 
> > > I don't think exclude_hv, exclude_host, and exclude_guest are directly
> > > tied to a single CPU mode.  The only 'modes' you need to consider for
> > > Linux are 'guest' and 'host' when Linux can run VMs and, 'self' and
> > > 'hypervisor' when Linux is a guest.
> > > 
> > > When Linux can run VMs, you count EL2 events when exclude_host is not
> > > set.
> > > 
> > > (When Linux is a guest, and you set/clear exclude_hv, for this to work,
> > > you need some way of informing your hypervisor that you want to know
> > > about events happening in the hypervisor.  This could be a PV interface,
> > > or maybe this can work by the guest setting/clearing the NSH bit in its
> > > virtual PMU registers, which then amusingly can get translated into
> > > actually counting in EL1/EL2 (non-VHE) or EL2 (VHE) by KVM's PMU
> > > emulation code.  The method used is specific to the hypervisor used, but
> > > not specific to whether the hypervisor is type-1 or type-2.)
> > 
> > This is what I had in mind. I think we're aligned despite nomenclature
> > confusing me.
> > 
> > I think we need to:
> > 
> >  - Add support in KVM guests for exclude_hv.
> > 
> >  - Prevent !exclude_hv from returning a count when is is not a guest
> >    on a !VHE kernel.
> > 
> > I'm not really clear how we implement the second point. We want to count
> > EL2 (because that would represent a HV) but only when it is a guest. How
> > does it know it's a guest? (Is it a guest when it's not a KVM host?).
> > I'm struggling to see how we update the logic in armv8pmu_set_event_filter
> > (whilst keeping in mind that we want it to write ARMV8_PMU_INCLUDE_EL2
> > such that when it's a guest the KVM PMU emulation code can use that to
> > know to do something different).
> > 
> 
> If (!hyp_mode_is_available()) you are either running bare metal or as a
> guest.  It should be benign to program the PMU to count EL2 in both
> cases, as it should have no effect in the first case.  Alternatively we
> could make it so that it only makes a difference if the DT/ACPI
> describes the fact that you're a KVM guest.  I don't know the
> implementation details of that.

That would be a userspace specific trigger, as nothing requires userspace
to describe anything like that. QEMU does inform the guest that it is a
KVM guest through SMBIOS for ACPI boots, though. When the guest is boot
with DT the guest can infer that it's at least a QEMU guest by the
existence of the fw-cfg DT node. However that heuristic isn't good enough
to tell the difference between TCG and KVM, so it's unlikely to be helpful
here.

There were some patches on the KVM list once proposing that KVM provide
a PV interface allowing KVM guests to populate the /sys/hypervisor sysfs
node (which was invented for xen). But those patches were for x86, and
also seem to have been abandoned.

Thanks,
drew

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2018-12-18 12:02   ` Christoffer Dall
  2018-12-18 13:25     ` Andrew Murray
@ 2019-01-04 15:32     ` Will Deacon
  2019-01-08 10:18       ` Christoffer Dall
  1 sibling, 1 reply; 32+ messages in thread
From: Will Deacon @ 2019-01-04 15:32 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Mark Rutland, Julien Thierry, Marc Zyngier, Catalin Marinas,
	joro, Suzuki K Poulose, Andrew Murray, kvmarm, linux-arm-kernel

On Tue, Dec 18, 2018 at 01:02:26PM +0100, Christoffer Dall wrote:
> On Wed, Dec 12, 2018 at 10:29:32AM +0000, Andrew Murray wrote:
> > Add support for the :G and :H attributes in perf by handling the
> > exclude_host/exclude_guest event attributes.
> > 
> > We notify KVM of counters that we wish to be enabled or disabled on
> > guest entry/exit and thus defer from starting or stopping :G events
> > as per the events exclude_host attribute.
> > 
> > With both VHE and non-VHE we switch the counters between host/guest
> > at EL2. We are able to eliminate counters counting host events on
> > the boundaries of guest entry/exit when using :G by filtering out
> > EL2 for exclude_host. However when using :H unless exclude_hv is set
> > on non-VHE then there is a small blackout window at the guest
> > entry/exit where host events are not captured.
> > 
> > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > ---
> >  arch/arm64/kernel/perf_event.c | 51 ++++++++++++++++++++++++++++++++++++------
> >  1 file changed, 44 insertions(+), 7 deletions(-)
> > 
> > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > index de564ae..4a3c73d 100644
> > --- a/arch/arm64/kernel/perf_event.c
> > +++ b/arch/arm64/kernel/perf_event.c
> > @@ -26,6 +26,7 @@
> >  
> >  #include <linux/acpi.h>
> >  #include <linux/clocksource.h>
> > +#include <linux/kvm_host.h>
> >  #include <linux/of.h>
> >  #include <linux/perf/arm_pmu.h>
> >  #include <linux/platform_device.h>
> > @@ -647,11 +648,26 @@ static inline int armv8pmu_enable_counter(int idx)
> >  
> >  static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> >  {
> > +	struct perf_event_attr *attr = &event->attr;
> >  	int idx = event->hw.idx;
> > +	int flags = 0;
> > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> >  
> > -	armv8pmu_enable_counter(idx);
> >  	if (armv8pmu_event_is_chained(event))
> > -		armv8pmu_enable_counter(idx - 1);
> > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > +
> > +	if (!attr->exclude_host)
> > +		flags |= KVM_PMU_EVENTS_HOST;
> > +	if (!attr->exclude_guest)
> > +		flags |= KVM_PMU_EVENTS_GUEST;
> > +
> > +	kvm_set_pmu_events(counter_bits, flags);
> > +
> > +	if (!attr->exclude_host) {
> > +		armv8pmu_enable_counter(idx);
> > +		if (armv8pmu_event_is_chained(event))
> > +			armv8pmu_enable_counter(idx - 1);
> > +	}
> >  }
> >  
> >  static inline int armv8pmu_disable_counter(int idx)
> > @@ -664,11 +680,20 @@ static inline int armv8pmu_disable_counter(int idx)
> >  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
> >  {
> >  	struct hw_perf_event *hwc = &event->hw;
> > +	struct perf_event_attr *attr = &event->attr;
> >  	int idx = hwc->idx;
> > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> >  
> >  	if (armv8pmu_event_is_chained(event))
> > -		armv8pmu_disable_counter(idx - 1);
> > -	armv8pmu_disable_counter(idx);
> > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > +
> > +	kvm_clr_pmu_events(counter_bits);
> > +
> > +	if (!attr->exclude_host) {
> > +		if (armv8pmu_event_is_chained(event))
> > +			armv8pmu_disable_counter(idx - 1);
> > +		armv8pmu_disable_counter(idx);
> > +	}
> >  }
> >  
> >  static inline int armv8pmu_enable_intens(int idx)
> > @@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
> >  	 * Therefore we ignore exclude_hv in this configuration, since
> >  	 * there's no hypervisor to sample anyway. This is consistent
> >  	 * with other architectures (x86 and Power).
> > +	 *
> > +	 * To eliminate counting host events on the boundaries of
> > +	 * guest entry/exit we ensure EL2 is not included in hyp mode
> > +	 * with !exclude_host.
> >  	 */
> >  	if (is_kernel_in_hyp_mode()) {
> > -		if (!attr->exclude_kernel)
> > +		if (!attr->exclude_kernel && !attr->exclude_host)
> >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> >  	} else {
> > -		if (attr->exclude_kernel)
> > -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
> >  		if (!attr->exclude_hv)
> >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> 
> I'm not sure about the current use of exclude_hv here.  The comment says
> it's consistent with other architectures, but I can't find an example to
> confirm this, and I don't think we have a comparable thing to the split
> of the hypervisor between EL1 and EL2 we have on non-VHE.

FWIW, that comment came from this thread:

http://lists.infradead.org/pipermail/linux-arm-kernel/2017-April/503908.html

That was painful enough at the time, so I'd /really/ prefer not to change
the semantics of this again if we can avoid it.

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2019-01-04 15:32     ` Will Deacon
@ 2019-01-08 10:18       ` Christoffer Dall
  2019-01-08 11:25         ` Andrew Murray
  0 siblings, 1 reply; 32+ messages in thread
From: Christoffer Dall @ 2019-01-08 10:18 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, Julien Thierry, Marc Zyngier, Catalin Marinas,
	joro, Suzuki K Poulose, Andrew Murray, kvmarm, linux-arm-kernel

On Fri, Jan 04, 2019 at 03:32:06PM +0000, Will Deacon wrote:
> On Tue, Dec 18, 2018 at 01:02:26PM +0100, Christoffer Dall wrote:
> > On Wed, Dec 12, 2018 at 10:29:32AM +0000, Andrew Murray wrote:
> > > Add support for the :G and :H attributes in perf by handling the
> > > exclude_host/exclude_guest event attributes.
> > > 
> > > We notify KVM of counters that we wish to be enabled or disabled on
> > > guest entry/exit and thus defer from starting or stopping :G events
> > > as per the events exclude_host attribute.
> > > 
> > > With both VHE and non-VHE we switch the counters between host/guest
> > > at EL2. We are able to eliminate counters counting host events on
> > > the boundaries of guest entry/exit when using :G by filtering out
> > > EL2 for exclude_host. However when using :H unless exclude_hv is set
> > > on non-VHE then there is a small blackout window at the guest
> > > entry/exit where host events are not captured.
> > > 
> > > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > > ---
> > >  arch/arm64/kernel/perf_event.c | 51 ++++++++++++++++++++++++++++++++++++------
> > >  1 file changed, 44 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > > index de564ae..4a3c73d 100644
> > > --- a/arch/arm64/kernel/perf_event.c
> > > +++ b/arch/arm64/kernel/perf_event.c
> > > @@ -26,6 +26,7 @@
> > >  
> > >  #include <linux/acpi.h>
> > >  #include <linux/clocksource.h>
> > > +#include <linux/kvm_host.h>
> > >  #include <linux/of.h>
> > >  #include <linux/perf/arm_pmu.h>
> > >  #include <linux/platform_device.h>
> > > @@ -647,11 +648,26 @@ static inline int armv8pmu_enable_counter(int idx)
> > >  
> > >  static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> > >  {
> > > +	struct perf_event_attr *attr = &event->attr;
> > >  	int idx = event->hw.idx;
> > > +	int flags = 0;
> > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > >  
> > > -	armv8pmu_enable_counter(idx);
> > >  	if (armv8pmu_event_is_chained(event))
> > > -		armv8pmu_enable_counter(idx - 1);
> > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > +
> > > +	if (!attr->exclude_host)
> > > +		flags |= KVM_PMU_EVENTS_HOST;
> > > +	if (!attr->exclude_guest)
> > > +		flags |= KVM_PMU_EVENTS_GUEST;
> > > +
> > > +	kvm_set_pmu_events(counter_bits, flags);
> > > +
> > > +	if (!attr->exclude_host) {
> > > +		armv8pmu_enable_counter(idx);
> > > +		if (armv8pmu_event_is_chained(event))
> > > +			armv8pmu_enable_counter(idx - 1);
> > > +	}
> > >  }
> > >  
> > >  static inline int armv8pmu_disable_counter(int idx)
> > > @@ -664,11 +680,20 @@ static inline int armv8pmu_disable_counter(int idx)
> > >  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
> > >  {
> > >  	struct hw_perf_event *hwc = &event->hw;
> > > +	struct perf_event_attr *attr = &event->attr;
> > >  	int idx = hwc->idx;
> > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > >  
> > >  	if (armv8pmu_event_is_chained(event))
> > > -		armv8pmu_disable_counter(idx - 1);
> > > -	armv8pmu_disable_counter(idx);
> > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > +
> > > +	kvm_clr_pmu_events(counter_bits);
> > > +
> > > +	if (!attr->exclude_host) {
> > > +		if (armv8pmu_event_is_chained(event))
> > > +			armv8pmu_disable_counter(idx - 1);
> > > +		armv8pmu_disable_counter(idx);
> > > +	}
> > >  }
> > >  
> > >  static inline int armv8pmu_enable_intens(int idx)
> > > @@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
> > >  	 * Therefore we ignore exclude_hv in this configuration, since
> > >  	 * there's no hypervisor to sample anyway. This is consistent
> > >  	 * with other architectures (x86 and Power).
> > > +	 *
> > > +	 * To eliminate counting host events on the boundaries of
> > > +	 * guest entry/exit we ensure EL2 is not included in hyp mode
> > > +	 * with !exclude_host.
> > >  	 */
> > >  	if (is_kernel_in_hyp_mode()) {
> > > -		if (!attr->exclude_kernel)
> > > +		if (!attr->exclude_kernel && !attr->exclude_host)
> > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > >  	} else {
> > > -		if (attr->exclude_kernel)
> > > -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
> > >  		if (!attr->exclude_hv)
> > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > 
> > I'm not sure about the current use of exclude_hv here.  The comment says
> > it's consistent with other architectures, but I can't find an example to
> > confirm this, and I don't think we have a comparable thing to the split
> > of the hypervisor between EL1 and EL2 we have on non-VHE.
> 
> FWIW, that comment came from this thread:
> 
> http://lists.infradead.org/pipermail/linux-arm-kernel/2017-April/503908.html
> 
> That was painful enough at the time, so I'd /really/ prefer not to change
> the semantics of this again if we can avoid it.

The comment makes sense for the is_kernel_in_hyp_mode() case.

However, for the !is_kernel_in_hyp_mode() case I can't see the current
behavior of exclude_hv being similar in other architectures.

I don't think the current semantics of excluding EL2 on a non-VHE host
system makes much sense, and I doubt anyone is using that for something
meaningful.  I think changing behavior for excldue_hv to depend on
is_hyp_mode_available rather than is_kernel_in_hyp_mode is the right
thing to do which would also align the semantics with other
architectures and between VHE and non-VHE.


Thanks,

    Christoffer

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2019-01-08 10:18       ` Christoffer Dall
@ 2019-01-08 11:25         ` Andrew Murray
  2019-01-08 11:50           ` Marc Zyngier
  2019-01-08 12:14           ` Christoffer Dall
  0 siblings, 2 replies; 32+ messages in thread
From: Andrew Murray @ 2019-01-08 11:25 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Mark Rutland, Julien Thierry, Marc Zyngier, Catalin Marinas,
	joro, Suzuki K Poulose, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Jan 08, 2019 at 11:18:43AM +0100, Christoffer Dall wrote:
> On Fri, Jan 04, 2019 at 03:32:06PM +0000, Will Deacon wrote:
> > On Tue, Dec 18, 2018 at 01:02:26PM +0100, Christoffer Dall wrote:
> > > On Wed, Dec 12, 2018 at 10:29:32AM +0000, Andrew Murray wrote:
> > > > Add support for the :G and :H attributes in perf by handling the
> > > > exclude_host/exclude_guest event attributes.
> > > > 
> > > > We notify KVM of counters that we wish to be enabled or disabled on
> > > > guest entry/exit and thus defer from starting or stopping :G events
> > > > as per the events exclude_host attribute.
> > > > 
> > > > With both VHE and non-VHE we switch the counters between host/guest
> > > > at EL2. We are able to eliminate counters counting host events on
> > > > the boundaries of guest entry/exit when using :G by filtering out
> > > > EL2 for exclude_host. However when using :H unless exclude_hv is set
> > > > on non-VHE then there is a small blackout window at the guest
> > > > entry/exit where host events are not captured.
> > > > 
> > > > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > > > ---
> > > >  arch/arm64/kernel/perf_event.c | 51 ++++++++++++++++++++++++++++++++++++------
> > > >  1 file changed, 44 insertions(+), 7 deletions(-)
> > > > 
> > > > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > > > index de564ae..4a3c73d 100644
> > > > --- a/arch/arm64/kernel/perf_event.c
> > > > +++ b/arch/arm64/kernel/perf_event.c
> > > > @@ -26,6 +26,7 @@
> > > >  
> > > >  #include <linux/acpi.h>
> > > >  #include <linux/clocksource.h>
> > > > +#include <linux/kvm_host.h>
> > > >  #include <linux/of.h>
> > > >  #include <linux/perf/arm_pmu.h>
> > > >  #include <linux/platform_device.h>
> > > > @@ -647,11 +648,26 @@ static inline int armv8pmu_enable_counter(int idx)
> > > >  
> > > >  static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> > > >  {
> > > > +	struct perf_event_attr *attr = &event->attr;
> > > >  	int idx = event->hw.idx;
> > > > +	int flags = 0;
> > > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > > >  
> > > > -	armv8pmu_enable_counter(idx);
> > > >  	if (armv8pmu_event_is_chained(event))
> > > > -		armv8pmu_enable_counter(idx - 1);
> > > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > > +
> > > > +	if (!attr->exclude_host)
> > > > +		flags |= KVM_PMU_EVENTS_HOST;
> > > > +	if (!attr->exclude_guest)
> > > > +		flags |= KVM_PMU_EVENTS_GUEST;
> > > > +
> > > > +	kvm_set_pmu_events(counter_bits, flags);
> > > > +
> > > > +	if (!attr->exclude_host) {
> > > > +		armv8pmu_enable_counter(idx);
> > > > +		if (armv8pmu_event_is_chained(event))
> > > > +			armv8pmu_enable_counter(idx - 1);
> > > > +	}
> > > >  }
> > > >  
> > > >  static inline int armv8pmu_disable_counter(int idx)
> > > > @@ -664,11 +680,20 @@ static inline int armv8pmu_disable_counter(int idx)
> > > >  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
> > > >  {
> > > >  	struct hw_perf_event *hwc = &event->hw;
> > > > +	struct perf_event_attr *attr = &event->attr;
> > > >  	int idx = hwc->idx;
> > > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > > >  
> > > >  	if (armv8pmu_event_is_chained(event))
> > > > -		armv8pmu_disable_counter(idx - 1);
> > > > -	armv8pmu_disable_counter(idx);
> > > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > > +
> > > > +	kvm_clr_pmu_events(counter_bits);
> > > > +
> > > > +	if (!attr->exclude_host) {
> > > > +		if (armv8pmu_event_is_chained(event))
> > > > +			armv8pmu_disable_counter(idx - 1);
> > > > +		armv8pmu_disable_counter(idx);
> > > > +	}
> > > >  }
> > > >  
> > > >  static inline int armv8pmu_enable_intens(int idx)
> > > > @@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
> > > >  	 * Therefore we ignore exclude_hv in this configuration, since
> > > >  	 * there's no hypervisor to sample anyway. This is consistent
> > > >  	 * with other architectures (x86 and Power).
> > > > +	 *
> > > > +	 * To eliminate counting host events on the boundaries of
> > > > +	 * guest entry/exit we ensure EL2 is not included in hyp mode
> > > > +	 * with !exclude_host.
> > > >  	 */
> > > >  	if (is_kernel_in_hyp_mode()) {
> > > > -		if (!attr->exclude_kernel)
> > > > +		if (!attr->exclude_kernel && !attr->exclude_host)
> > > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > > >  	} else {
> > > > -		if (attr->exclude_kernel)
> > > > -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
> > > >  		if (!attr->exclude_hv)
> > > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > > 
> > > I'm not sure about the current use of exclude_hv here.  The comment says
> > > it's consistent with other architectures, but I can't find an example to
> > > confirm this, and I don't think we have a comparable thing to the split
> > > of the hypervisor between EL1 and EL2 we have on non-VHE.
> > 
> > FWIW, that comment came from this thread:
> > 
> > http://lists.infradead.org/pipermail/linux-arm-kernel/2017-April/503908.html
> > 
> > That was painful enough at the time, so I'd /really/ prefer not to change
> > the semantics of this again if we can avoid it.
> 
> The comment makes sense for the is_kernel_in_hyp_mode() case.
> 
> However, for the !is_kernel_in_hyp_mode() case I can't see the current
> behavior of exclude_hv being similar in other architectures.
> 
> I don't think the current semantics of excluding EL2 on a non-VHE host
> system makes much sense, and I doubt anyone is using that for something
> meaningful.  I think changing behavior for excldue_hv to depend on
> is_hyp_mode_available rather than is_kernel_in_hyp_mode is the right
> thing to do which would also align the semantics with other
> architectures and between VHE and non-VHE.

Just for clarity, see below for the proposed patch - this disallows EL2
counting for !VHE when we have the capability to be a KVM host.

Subject: [PATCH] arm64: arm_pmu: Disallow EL2 counting on !VHE unless guest

When the kernel runs without VHE support it runs at EL1. However it switches
to EL2 when switching to and from KVM guests. The exclude_hv flag (for a !VHE
kernel) will include EL2 counting. The exclude_hv flag is intended to count
events associated with the hypervisor for the current instance, not the
overhead of the current instance's guests.

Let's disallow EL2 counting for !VHE when we know that we have the capability
to be a KVM host (by virtue that we booted in EL2) and thus probably aren't
a guest ourselves.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
---
 arch/arm64/kernel/perf_event.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 1620a37..bd3f6ca 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -832,7 +832,7 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
        } else {
                if (attr->exclude_kernel)
                        config_base |= ARMV8_PMU_EXCLUDE_EL1;
-               if (!attr->exclude_hv)
+               if (!attr->exclude_hv && !is_hyp_mode_is_available())
                        config_base |= ARMV8_PMU_INCLUDE_EL2;
        }
        if (attr->exclude_user)
-- 
2.7.4

My only doubt about this is as follows. If, on a KVM host you run this:

perf stat -e cycles:H lkvm run ...

then on the VHE host the cycles reported represents the entire non-guest cycles
associated with running the guest.

On a !VHE, the cycles reported exclude EL2 (with or without this patch) and
thus you don't get a representation of all the non-guest cycles associated with
the guest. However without this patch you could at least still run:

perf stat -e cycles:H -e cycles:h lkvm run ...

and then add the two cycle counts together to get something comparative with
the VHE host.

If the above patch represents the desired semantics, then perhaps we must count
both EL1 and *EL2* for !exclude_kernel on !VHE. In fact I think we should do
this anyway and remove a little complexity from armv8pmu_set_event_filter.
Thoughts?

Thanks,

Andrew Murray

> 
> 
> Thanks,
> 
>     Christoffer

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2019-01-08 11:25         ` Andrew Murray
@ 2019-01-08 11:50           ` Marc Zyngier
  2019-01-08 12:03             ` Christoffer Dall
  2019-01-08 12:14           ` Christoffer Dall
  1 sibling, 1 reply; 32+ messages in thread
From: Marc Zyngier @ 2019-01-08 11:50 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Mark Rutland, Julien Thierry, Catalin Marinas, joro,
	Suzuki K Poulose, Will Deacon, Christoffer Dall, kvmarm,
	linux-arm-kernel

On Tue, 08 Jan 2019 11:25:13 +0000,
Andrew Murray <andrew.murray@arm.com> wrote:

Hi Andrew,

> My only doubt about this is as follows. If, on a KVM host you run this:
> 
> perf stat -e cycles:H lkvm run ...
> 
> then on the VHE host the cycles reported represents the entire non-guest cycles
> associated with running the guest.
> 
> On a !VHE, the cycles reported exclude EL2 (with or without this patch) and
> thus you don't get a representation of all the non-guest cycles associated with
> the guest. However without this patch you could at least still run:
> 
> perf stat -e cycles:H -e cycles:h lkvm run ...
> 
> and then add the two cycle counts together to get something comparative with
> the VHE host.
> 
> If the above patch represents the desired semantics, then perhaps we must count
> both EL1 and *EL2* for !exclude_kernel on !VHE. In fact I think we should do
> this anyway and remove a little complexity from armv8pmu_set_event_filter.
> Thoughts?

I'm not sure we should hide the architectural differences between VHE
and !VHE. If you're trying to measure what is happening at in the
hypervisor, you can't reason about it while ignoring the dual nature
of !VHE.

Thanks,

	M.

-- 
Jazz is not dead, it just smell funny.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2019-01-08 11:50           ` Marc Zyngier
@ 2019-01-08 12:03             ` Christoffer Dall
  2019-01-08 12:12               ` Marc Zyngier
  0 siblings, 1 reply; 32+ messages in thread
From: Christoffer Dall @ 2019-01-08 12:03 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Mark Rutland, Julien Thierry, Catalin Marinas, joro,
	Suzuki K Poulose, Will Deacon, Andrew Murray, kvmarm,
	linux-arm-kernel

On Tue, Jan 08, 2019 at 11:50:59AM +0000, Marc Zyngier wrote:
> On Tue, 08 Jan 2019 11:25:13 +0000,
> Andrew Murray <andrew.murray@arm.com> wrote:
> 
> Hi Andrew,
> 
> > My only doubt about this is as follows. If, on a KVM host you run this:
> > 
> > perf stat -e cycles:H lkvm run ...
> > 
> > then on the VHE host the cycles reported represents the entire non-guest cycles
> > associated with running the guest.
> > 
> > On a !VHE, the cycles reported exclude EL2 (with or without this patch) and
> > thus you don't get a representation of all the non-guest cycles associated with
> > the guest. However without this patch you could at least still run:
> > 
> > perf stat -e cycles:H -e cycles:h lkvm run ...
> > 
> > and then add the two cycle counts together to get something comparative with
> > the VHE host.
> > 
> > If the above patch represents the desired semantics, then perhaps we must count
> > both EL1 and *EL2* for !exclude_kernel on !VHE. In fact I think we should do
> > this anyway and remove a little complexity from armv8pmu_set_event_filter.
> > Thoughts?
> 
> I'm not sure we should hide the architectural differences between VHE
> and !VHE. If you're trying to measure what is happening at in the
> hypervisor, you can't reason about it while ignoring the dual nature
> of !VHE.
> 

How do you define hypervisor here?  Is that just the code that runs at
EL2 or also parts of KVM that runs at EL1?

It remains unclear to me why you'd want to measure a subset of KVM,
which happens to run in EL2, in your host (and hypervisor-enabled)
kernel, and you are even precluded from measuring a comparable portion
of your implementation on other Arm systems (VHE).

Admittedly, I'm not at export in using perf, but I find this EL1/EL2
distinction out of place as it relates to exlude_kernel, exlude_user,
and exlude_hv.  Will we have a fourth Arm-specific flag which takes the
place of exclude_hv on PowerPC, which excludes an underlying hypervisor
when running a guest, should we ever support counting that in the
future?


Thanks,

    Christoffer

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2019-01-08 12:03             ` Christoffer Dall
@ 2019-01-08 12:12               ` Marc Zyngier
  2019-01-08 12:20                 ` Christoffer Dall
  0 siblings, 1 reply; 32+ messages in thread
From: Marc Zyngier @ 2019-01-08 12:12 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Mark Rutland, Julien Thierry, Catalin Marinas, joro,
	Suzuki K Poulose, Will Deacon, Andrew Murray, kvmarm,
	linux-arm-kernel

On 08/01/2019 12:03, Christoffer Dall wrote:
> On Tue, Jan 08, 2019 at 11:50:59AM +0000, Marc Zyngier wrote:
>> On Tue, 08 Jan 2019 11:25:13 +0000,
>> Andrew Murray <andrew.murray@arm.com> wrote:
>>
>> Hi Andrew,
>>
>>> My only doubt about this is as follows. If, on a KVM host you run this:
>>>
>>> perf stat -e cycles:H lkvm run ...
>>>
>>> then on the VHE host the cycles reported represents the entire non-guest cycles
>>> associated with running the guest.
>>>
>>> On a !VHE, the cycles reported exclude EL2 (with or without this patch) and
>>> thus you don't get a representation of all the non-guest cycles associated with
>>> the guest. However without this patch you could at least still run:
>>>
>>> perf stat -e cycles:H -e cycles:h lkvm run ...
>>>
>>> and then add the two cycle counts together to get something comparative with
>>> the VHE host.
>>>
>>> If the above patch represents the desired semantics, then perhaps we must count
>>> both EL1 and *EL2* for !exclude_kernel on !VHE. In fact I think we should do
>>> this anyway and remove a little complexity from armv8pmu_set_event_filter.
>>> Thoughts?
>>
>> I'm not sure we should hide the architectural differences between VHE
>> and !VHE. If you're trying to measure what is happening at in the
>> hypervisor, you can't reason about it while ignoring the dual nature
>> of !VHE.
>>
> 
> How do you define hypervisor here?  Is that just the code that runs at
> EL2 or also parts of KVM that runs at EL1?

I define it as "not a guest". Whatever is used to support a guest is the
hypervisor.

> It remains unclear to me why you'd want to measure a subset of KVM,
> which happens to run in EL2, in your host (and hypervisor-enabled)
> kernel, and you are even precluded from measuring a comparable portion
> of your implementation on other Arm systems (VHE).

Because I'm not trying to compare apples (VHE) and oranges (!VHE). My
use-case for perf is to measure the impact of a change on a given
implementation, and the more I can narrow the impact of that change, the
better (specially when !VHE precludes the use of other techniques such
as sampling).

> Admittedly, I'm not at export in using perf, but I find this EL1/EL2
> distinction out of place as it relates to exlude_kernel, exlude_user,
> and exlude_hv.  Will we have a fourth Arm-specific flag which takes the
> place of exclude_hv on PowerPC, which excludes an underlying hypervisor
> when running a guest, should we ever support counting that in the
> future?
In all honestly, exclude_hv doesn't make much sense to me on a VHE
system, unless you define an arbitrary cutting point where things are on
one side or the other. As for a fourth flag, I have no idea.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2019-01-08 11:25         ` Andrew Murray
  2019-01-08 11:50           ` Marc Zyngier
@ 2019-01-08 12:14           ` Christoffer Dall
  2019-01-08 12:39             ` Andrew Murray
  1 sibling, 1 reply; 32+ messages in thread
From: Christoffer Dall @ 2019-01-08 12:14 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Mark Rutland, Julien Thierry, Marc Zyngier, Catalin Marinas,
	joro, Suzuki K Poulose, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Jan 08, 2019 at 11:25:13AM +0000, Andrew Murray wrote:
> On Tue, Jan 08, 2019 at 11:18:43AM +0100, Christoffer Dall wrote:
> > On Fri, Jan 04, 2019 at 03:32:06PM +0000, Will Deacon wrote:
> > > On Tue, Dec 18, 2018 at 01:02:26PM +0100, Christoffer Dall wrote:
> > > > On Wed, Dec 12, 2018 at 10:29:32AM +0000, Andrew Murray wrote:
> > > > > Add support for the :G and :H attributes in perf by handling the
> > > > > exclude_host/exclude_guest event attributes.
> > > > > 
> > > > > We notify KVM of counters that we wish to be enabled or disabled on
> > > > > guest entry/exit and thus defer from starting or stopping :G events
> > > > > as per the events exclude_host attribute.
> > > > > 
> > > > > With both VHE and non-VHE we switch the counters between host/guest
> > > > > at EL2. We are able to eliminate counters counting host events on
> > > > > the boundaries of guest entry/exit when using :G by filtering out
> > > > > EL2 for exclude_host. However when using :H unless exclude_hv is set
> > > > > on non-VHE then there is a small blackout window at the guest
> > > > > entry/exit where host events are not captured.
> > > > > 
> > > > > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > > > > ---
> > > > >  arch/arm64/kernel/perf_event.c | 51 ++++++++++++++++++++++++++++++++++++------
> > > > >  1 file changed, 44 insertions(+), 7 deletions(-)
> > > > > 
> > > > > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > > > > index de564ae..4a3c73d 100644
> > > > > --- a/arch/arm64/kernel/perf_event.c
> > > > > +++ b/arch/arm64/kernel/perf_event.c
> > > > > @@ -26,6 +26,7 @@
> > > > >  
> > > > >  #include <linux/acpi.h>
> > > > >  #include <linux/clocksource.h>
> > > > > +#include <linux/kvm_host.h>
> > > > >  #include <linux/of.h>
> > > > >  #include <linux/perf/arm_pmu.h>
> > > > >  #include <linux/platform_device.h>
> > > > > @@ -647,11 +648,26 @@ static inline int armv8pmu_enable_counter(int idx)
> > > > >  
> > > > >  static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> > > > >  {
> > > > > +	struct perf_event_attr *attr = &event->attr;
> > > > >  	int idx = event->hw.idx;
> > > > > +	int flags = 0;
> > > > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > > > >  
> > > > > -	armv8pmu_enable_counter(idx);
> > > > >  	if (armv8pmu_event_is_chained(event))
> > > > > -		armv8pmu_enable_counter(idx - 1);
> > > > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > > > +
> > > > > +	if (!attr->exclude_host)
> > > > > +		flags |= KVM_PMU_EVENTS_HOST;
> > > > > +	if (!attr->exclude_guest)
> > > > > +		flags |= KVM_PMU_EVENTS_GUEST;
> > > > > +
> > > > > +	kvm_set_pmu_events(counter_bits, flags);
> > > > > +
> > > > > +	if (!attr->exclude_host) {
> > > > > +		armv8pmu_enable_counter(idx);
> > > > > +		if (armv8pmu_event_is_chained(event))
> > > > > +			armv8pmu_enable_counter(idx - 1);
> > > > > +	}
> > > > >  }
> > > > >  
> > > > >  static inline int armv8pmu_disable_counter(int idx)
> > > > > @@ -664,11 +680,20 @@ static inline int armv8pmu_disable_counter(int idx)
> > > > >  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
> > > > >  {
> > > > >  	struct hw_perf_event *hwc = &event->hw;
> > > > > +	struct perf_event_attr *attr = &event->attr;
> > > > >  	int idx = hwc->idx;
> > > > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > > > >  
> > > > >  	if (armv8pmu_event_is_chained(event))
> > > > > -		armv8pmu_disable_counter(idx - 1);
> > > > > -	armv8pmu_disable_counter(idx);
> > > > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > > > +
> > > > > +	kvm_clr_pmu_events(counter_bits);
> > > > > +
> > > > > +	if (!attr->exclude_host) {
> > > > > +		if (armv8pmu_event_is_chained(event))
> > > > > +			armv8pmu_disable_counter(idx - 1);
> > > > > +		armv8pmu_disable_counter(idx);
> > > > > +	}
> > > > >  }
> > > > >  
> > > > >  static inline int armv8pmu_enable_intens(int idx)
> > > > > @@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
> > > > >  	 * Therefore we ignore exclude_hv in this configuration, since
> > > > >  	 * there's no hypervisor to sample anyway. This is consistent
> > > > >  	 * with other architectures (x86 and Power).
> > > > > +	 *
> > > > > +	 * To eliminate counting host events on the boundaries of
> > > > > +	 * guest entry/exit we ensure EL2 is not included in hyp mode
> > > > > +	 * with !exclude_host.
> > > > >  	 */
> > > > >  	if (is_kernel_in_hyp_mode()) {
> > > > > -		if (!attr->exclude_kernel)
> > > > > +		if (!attr->exclude_kernel && !attr->exclude_host)
> > > > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > > > >  	} else {
> > > > > -		if (attr->exclude_kernel)
> > > > > -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
> > > > >  		if (!attr->exclude_hv)
> > > > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > > > 
> > > > I'm not sure about the current use of exclude_hv here.  The comment says
> > > > it's consistent with other architectures, but I can't find an example to
> > > > confirm this, and I don't think we have a comparable thing to the split
> > > > of the hypervisor between EL1 and EL2 we have on non-VHE.
> > > 
> > > FWIW, that comment came from this thread:
> > > 
> > > http://lists.infradead.org/pipermail/linux-arm-kernel/2017-April/503908.html
> > > 
> > > That was painful enough at the time, so I'd /really/ prefer not to change
> > > the semantics of this again if we can avoid it.
> > 
> > The comment makes sense for the is_kernel_in_hyp_mode() case.
> > 
> > However, for the !is_kernel_in_hyp_mode() case I can't see the current
> > behavior of exclude_hv being similar in other architectures.
> > 
> > I don't think the current semantics of excluding EL2 on a non-VHE host
> > system makes much sense, and I doubt anyone is using that for something
> > meaningful.  I think changing behavior for excldue_hv to depend on
> > is_hyp_mode_available rather than is_kernel_in_hyp_mode is the right
> > thing to do which would also align the semantics with other
> > architectures and between VHE and non-VHE.
> 
> Just for clarity, see below for the proposed patch - this disallows EL2
> counting for !VHE when we have the capability to be a KVM host.
> 

That was not what I meant.  I think you want to count EL1 and EL2
together on a non-VHE host system.

What I had in mind was more
something like the following (completely untested, of course):

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index e213f8e867f6..37648bedf8b0 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -948,6 +948,11 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
 	if (is_kernel_in_hyp_mode()) {
 		if (!attr->exclude_kernel)
 			config_base |= ARMV8_PMU_INCLUDE_EL2;
+	} else if (is_hyp_mode_available()) {
+		if (attr->exclude_kernel)
+			config_base |= ARMV8_PMU_EXCLUDE_EL1;
+		else
+			config_base |= ARMV8_PMU_INCLUDE_EL2;
 	} else {
 		if (attr->exclude_kernel)
 			config_base |= ARMV8_PMU_EXCLUDE_EL1;


Thanks,

    Christoffer

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2019-01-08 12:12               ` Marc Zyngier
@ 2019-01-08 12:20                 ` Christoffer Dall
  0 siblings, 0 replies; 32+ messages in thread
From: Christoffer Dall @ 2019-01-08 12:20 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Mark Rutland, Julien Thierry, Catalin Marinas, joro,
	Suzuki K Poulose, Will Deacon, Andrew Murray, kvmarm,
	linux-arm-kernel

On Tue, Jan 08, 2019 at 12:12:13PM +0000, Marc Zyngier wrote:
> On 08/01/2019 12:03, Christoffer Dall wrote:
> > On Tue, Jan 08, 2019 at 11:50:59AM +0000, Marc Zyngier wrote:
> >> On Tue, 08 Jan 2019 11:25:13 +0000,
> >> Andrew Murray <andrew.murray@arm.com> wrote:
> >>
> >> Hi Andrew,
> >>
> >>> My only doubt about this is as follows. If, on a KVM host you run this:
> >>>
> >>> perf stat -e cycles:H lkvm run ...
> >>>
> >>> then on the VHE host the cycles reported represents the entire non-guest cycles
> >>> associated with running the guest.
> >>>
> >>> On a !VHE, the cycles reported exclude EL2 (with or without this patch) and
> >>> thus you don't get a representation of all the non-guest cycles associated with
> >>> the guest. However without this patch you could at least still run:
> >>>
> >>> perf stat -e cycles:H -e cycles:h lkvm run ...
> >>>
> >>> and then add the two cycle counts together to get something comparative with
> >>> the VHE host.
> >>>
> >>> If the above patch represents the desired semantics, then perhaps we must count
> >>> both EL1 and *EL2* for !exclude_kernel on !VHE. In fact I think we should do
> >>> this anyway and remove a little complexity from armv8pmu_set_event_filter.
> >>> Thoughts?
> >>
> >> I'm not sure we should hide the architectural differences between VHE
> >> and !VHE. If you're trying to measure what is happening at in the
> >> hypervisor, you can't reason about it while ignoring the dual nature
> >> of !VHE.
> >>
> > 
> > How do you define hypervisor here?  Is that just the code that runs at
> > EL2 or also parts of KVM that runs at EL1?
> 
> I define it as "not a guest". Whatever is used to support a guest is the
> hypervisor.
> 
> > It remains unclear to me why you'd want to measure a subset of KVM,
> > which happens to run in EL2, in your host (and hypervisor-enabled)
> > kernel, and you are even precluded from measuring a comparable portion
> > of your implementation on other Arm systems (VHE).
> 
> Because I'm not trying to compare apples (VHE) and oranges (!VHE). My
> use-case for perf is to measure the impact of a change on a given
> implementation, and the more I can narrow the impact of that change, the
> better (specially when !VHE precludes the use of other techniques such
> as sampling).
> 

Fair enough.  I don't know if that's the only use case for perf we
should consider though.

> > Admittedly, I'm not at export in using perf, but I find this EL1/EL2
> > distinction out of place as it relates to exlude_kernel, exlude_user,
> > and exlude_hv.  Will we have a fourth Arm-specific flag which takes the
> > place of exclude_hv on PowerPC, which excludes an underlying hypervisor
> > when running a guest, should we ever support counting that in the
> > future?
> In all honestly, exclude_hv doesn't make much sense to me on a VHE
> system, unless you define an arbitrary cutting point where things are on
> one side or the other. As for a fourth flag, I have no idea.
> 

I think this all boils down to how these flags are interpreted and
represented to a user via tooling.  If these flags must be considered
in complete isolation on a particular system and architecture, then
fine, we can define it as whatever we want, giving us a little bit more
insight on where things happen on a !VHE system.

If we care about these flags representing similar semantics to other
architectures, then I contend that we are abusing the exclude_hv flag
today, and exclude_hv should only ever have an effect when set within
a guest, not in a host.


Thanks,

    Christoffer

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2019-01-08 12:14           ` Christoffer Dall
@ 2019-01-08 12:39             ` Andrew Murray
  0 siblings, 0 replies; 32+ messages in thread
From: Andrew Murray @ 2019-01-08 12:39 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Mark Rutland, Julien Thierry, Marc Zyngier, Catalin Marinas,
	joro, Suzuki K Poulose, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Jan 08, 2019 at 01:14:25PM +0100, Christoffer Dall wrote:
> On Tue, Jan 08, 2019 at 11:25:13AM +0000, Andrew Murray wrote:
> > On Tue, Jan 08, 2019 at 11:18:43AM +0100, Christoffer Dall wrote:
> > > On Fri, Jan 04, 2019 at 03:32:06PM +0000, Will Deacon wrote:
> > > > On Tue, Dec 18, 2018 at 01:02:26PM +0100, Christoffer Dall wrote:
> > > > > On Wed, Dec 12, 2018 at 10:29:32AM +0000, Andrew Murray wrote:
> > > > > > Add support for the :G and :H attributes in perf by handling the
> > > > > > exclude_host/exclude_guest event attributes.
> > > > > > 
> > > > > > We notify KVM of counters that we wish to be enabled or disabled on
> > > > > > guest entry/exit and thus defer from starting or stopping :G events
> > > > > > as per the events exclude_host attribute.
> > > > > > 
> > > > > > With both VHE and non-VHE we switch the counters between host/guest
> > > > > > at EL2. We are able to eliminate counters counting host events on
> > > > > > the boundaries of guest entry/exit when using :G by filtering out
> > > > > > EL2 for exclude_host. However when using :H unless exclude_hv is set
> > > > > > on non-VHE then there is a small blackout window at the guest
> > > > > > entry/exit where host events are not captured.
> > > > > > 
> > > > > > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > > > > > ---
> > > > > >  arch/arm64/kernel/perf_event.c | 51 ++++++++++++++++++++++++++++++++++++------
> > > > > >  1 file changed, 44 insertions(+), 7 deletions(-)
> > > > > > 
> > > > > > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > > > > > index de564ae..4a3c73d 100644
> > > > > > --- a/arch/arm64/kernel/perf_event.c
> > > > > > +++ b/arch/arm64/kernel/perf_event.c
> > > > > > @@ -26,6 +26,7 @@
> > > > > >  
> > > > > >  #include <linux/acpi.h>
> > > > > >  #include <linux/clocksource.h>
> > > > > > +#include <linux/kvm_host.h>
> > > > > >  #include <linux/of.h>
> > > > > >  #include <linux/perf/arm_pmu.h>
> > > > > >  #include <linux/platform_device.h>
> > > > > > @@ -647,11 +648,26 @@ static inline int armv8pmu_enable_counter(int idx)
> > > > > >  
> > > > > >  static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> > > > > >  {
> > > > > > +	struct perf_event_attr *attr = &event->attr;
> > > > > >  	int idx = event->hw.idx;
> > > > > > +	int flags = 0;
> > > > > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > > > > >  
> > > > > > -	armv8pmu_enable_counter(idx);
> > > > > >  	if (armv8pmu_event_is_chained(event))
> > > > > > -		armv8pmu_enable_counter(idx - 1);
> > > > > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > > > > +
> > > > > > +	if (!attr->exclude_host)
> > > > > > +		flags |= KVM_PMU_EVENTS_HOST;
> > > > > > +	if (!attr->exclude_guest)
> > > > > > +		flags |= KVM_PMU_EVENTS_GUEST;
> > > > > > +
> > > > > > +	kvm_set_pmu_events(counter_bits, flags);
> > > > > > +
> > > > > > +	if (!attr->exclude_host) {
> > > > > > +		armv8pmu_enable_counter(idx);
> > > > > > +		if (armv8pmu_event_is_chained(event))
> > > > > > +			armv8pmu_enable_counter(idx - 1);
> > > > > > +	}
> > > > > >  }
> > > > > >  
> > > > > >  static inline int armv8pmu_disable_counter(int idx)
> > > > > > @@ -664,11 +680,20 @@ static inline int armv8pmu_disable_counter(int idx)
> > > > > >  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
> > > > > >  {
> > > > > >  	struct hw_perf_event *hwc = &event->hw;
> > > > > > +	struct perf_event_attr *attr = &event->attr;
> > > > > >  	int idx = hwc->idx;
> > > > > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > > > > >  
> > > > > >  	if (armv8pmu_event_is_chained(event))
> > > > > > -		armv8pmu_disable_counter(idx - 1);
> > > > > > -	armv8pmu_disable_counter(idx);
> > > > > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > > > > +
> > > > > > +	kvm_clr_pmu_events(counter_bits);
> > > > > > +
> > > > > > +	if (!attr->exclude_host) {
> > > > > > +		if (armv8pmu_event_is_chained(event))
> > > > > > +			armv8pmu_disable_counter(idx - 1);
> > > > > > +		armv8pmu_disable_counter(idx);
> > > > > > +	}
> > > > > >  }
> > > > > >  
> > > > > >  static inline int armv8pmu_enable_intens(int idx)
> > > > > > @@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
> > > > > >  	 * Therefore we ignore exclude_hv in this configuration, since
> > > > > >  	 * there's no hypervisor to sample anyway. This is consistent
> > > > > >  	 * with other architectures (x86 and Power).
> > > > > > +	 *
> > > > > > +	 * To eliminate counting host events on the boundaries of
> > > > > > +	 * guest entry/exit we ensure EL2 is not included in hyp mode
> > > > > > +	 * with !exclude_host.
> > > > > >  	 */
> > > > > >  	if (is_kernel_in_hyp_mode()) {
> > > > > > -		if (!attr->exclude_kernel)
> > > > > > +		if (!attr->exclude_kernel && !attr->exclude_host)
> > > > > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > > > > >  	} else {
> > > > > > -		if (attr->exclude_kernel)
> > > > > > -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
> > > > > >  		if (!attr->exclude_hv)
> > > > > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > > > > 
> > > > > I'm not sure about the current use of exclude_hv here.  The comment says
> > > > > it's consistent with other architectures, but I can't find an example to
> > > > > confirm this, and I don't think we have a comparable thing to the split
> > > > > of the hypervisor between EL1 and EL2 we have on non-VHE.
> > > > 
> > > > FWIW, that comment came from this thread:
> > > > 
> > > > http://lists.infradead.org/pipermail/linux-arm-kernel/2017-April/503908.html
> > > > 
> > > > That was painful enough at the time, so I'd /really/ prefer not to change
> > > > the semantics of this again if we can avoid it.
> > > 
> > > The comment makes sense for the is_kernel_in_hyp_mode() case.
> > > 
> > > However, for the !is_kernel_in_hyp_mode() case I can't see the current
> > > behavior of exclude_hv being similar in other architectures.
> > > 
> > > I don't think the current semantics of excluding EL2 on a non-VHE host
> > > system makes much sense, and I doubt anyone is using that for something
> > > meaningful.  I think changing behavior for excldue_hv to depend on
> > > is_hyp_mode_available rather than is_kernel_in_hyp_mode is the right
> > > thing to do which would also align the semantics with other
> > > architectures and between VHE and non-VHE.
> > 
> > Just for clarity, see below for the proposed patch - this disallows EL2
> > counting for !VHE when we have the capability to be a KVM host.
> > 
> 
> That was not what I meant.  I think you want to count EL1 and EL2
> together on a non-VHE host system.
> 
> What I had in mind was more
> something like the following (completely untested, of course):
> 
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index e213f8e867f6..37648bedf8b0 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -948,6 +948,11 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
>  	if (is_kernel_in_hyp_mode()) {
>  		if (!attr->exclude_kernel)
>  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> +	} else if (is_hyp_mode_available()) {
> +		if (attr->exclude_kernel)
> +			config_base |= ARMV8_PMU_EXCLUDE_EL1;
> +		else
> +			config_base |= ARMV8_PMU_INCLUDE_EL2;

Right - so if we're on !VHE and we have the capability to be a KVM host (and
thus we're not a guest) then treat EL1/EL2 as the 'kernel' - this includes EL2
so includes the hypervisor overhead for any guests. This looks correct to me.

>  	} else {
>  		if (attr->exclude_kernel)
>  			config_base |= ARMV8_PMU_EXCLUDE_EL1;

And when we don't have the capability to be a KVM host then we don't care
about EL2 for exclude_kernel. This looks correct to me.

This proposed change also allows us to still count EL2 when we are not a KVM
host and when !exclude_hv is set. This makes sense as...

 - If we're a XEN guest or similar, we can use !exclude_hv to count EL2.

 - In the future we could support a KVM guest using exclude_hv to count
   the host kernel time pinned to the KVM process. The proposed change
   would attempt to count EL2 which virt/kvm/arm/pmu.c could treat as
   !exclude_kernel on the host (pinned to the KVM process).

Thanks,

Andrew Murray

> 
> 
> Thanks,
> 
>     Christoffer

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2019-01-08 12:39 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-12 10:29 [PATCH v8 0/5] arm64: Support perf event modifiers :G and :H Andrew Murray
2018-12-12 10:29 ` [PATCH v8 1/5] arm64: arm_pmu: remove unnecessary isb instruction Andrew Murray
2018-12-12 10:29 ` [PATCH v8 2/5] arm64: KVM: encapsulate kvm_cpu_context in kvm_host_data Andrew Murray
2018-12-12 10:37   ` Suzuki K Poulose
2018-12-12 12:31     ` Andrew Murray
2018-12-18 11:40   ` Christoffer Dall
2018-12-12 10:29 ` [PATCH v8 3/5] arm64: KVM: add accessors to track guest/host only counters Andrew Murray
2018-12-12 10:56   ` Suzuki K Poulose
2018-12-12 10:29 ` [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray
2018-12-12 10:42   ` Suzuki K Poulose
2018-12-12 10:47     ` Julien Thierry
2018-12-12 10:51       ` Suzuki K Poulose
2018-12-12 10:55   ` Suzuki K Poulose
2018-12-12 14:38     ` Andrew Murray
2018-12-18 12:02   ` Christoffer Dall
2018-12-18 13:25     ` Andrew Murray
2018-12-18 14:38       ` Christoffer Dall
2018-12-18 16:27         ` Andrew Murray
2018-12-18 18:51           ` Christoffer Dall
2018-12-18 19:19             ` Andrew Jones
2019-01-04 15:32     ` Will Deacon
2019-01-08 10:18       ` Christoffer Dall
2019-01-08 11:25         ` Andrew Murray
2019-01-08 11:50           ` Marc Zyngier
2019-01-08 12:03             ` Christoffer Dall
2019-01-08 12:12               ` Marc Zyngier
2019-01-08 12:20                 ` Christoffer Dall
2019-01-08 12:14           ` Christoffer Dall
2019-01-08 12:39             ` Andrew Murray
2018-12-12 10:29 ` [PATCH v8 5/5] arm64: KVM: Enable support for :G/:H perf event modifiers Andrew Murray
2018-12-12 10:53   ` Suzuki K Poulose
2018-12-12 14:46     ` Andrew Murray

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).