* [RFC PATCH 0/7] Add support for monitoring guest TLB operations
@ 2016-08-16 10:45 Punit Agrawal
  2016-08-16 10:45 ` [RFC PATCH 1/7] perf/trace: Add notification for perf trace events Punit Agrawal
                   ` (7 more replies)
  0 siblings, 8 replies; 22+ messages in thread

From: Punit Agrawal @ 2016-08-16 10:45 UTC (permalink / raw)
  To: linux-kernel, kvm, kvmarm, linux-arm-kernel
  Cc: Punit Agrawal, Christoffer Dall, Marc Zyngier, Steven Rostedt,
      Ingo Molnar, Will Deacon

Hi,

ARMv8 supports trapping guest TLB maintenance operations to the
hypervisor. This trapping mechanism can be used to monitor the use of
guest TLB instructions. As taking a trap for every TLB operation can
have significant overhead, trapping should only be enabled -

* on user request
* for the VM of interest

This patchset adds support to listen to perf trace event state change
notifications. The notifications and associated context are then used
to enable trapping of guest TLB operations when requested by the
user. The trap handling generates trace events (kvm_tlb_invalidate)
which can already be counted using existing perf trace
functionality. Trapping of guest TLB operations is disabled when not
being monitored (reducing profiling overhead).

I would appreciate feedback on the approach to tie the control of TLB
monitoring with perf trace events (Patch 1), especially if there are
any suggestions on avoiding (or reducing) the overhead of "perf trace"
notifications. I looked at using the regfunc/unregfunc tracepoint
hooks, but they don't include the event context. The bigger problem
was that those callbacks are only called for the first of several
simultaneously executing perf stat invocations.

The patchset is based on v4.8-rc2 and adds support for monitoring
guest TLB operations on 64bit hosts. If the approach taken in the
patches is acceptable, I'll add 32bit host support as well.

With this patchset, the 'perf' tool, when attached to a VM process,
can be used to monitor the TLB operations. E.g., to monitor a VM with
process id 4166 -

  # perf stat -e "kvm:kvm_tlb_invalidate" -p 4166

Perform some operations in the VM (running 'make -j 7' on the kernel
sources in this instance). Breaking out of perf shows -

  Performance counter stats for process id '4166':

         7,471,974      kvm:kvm_tlb_invalidate

     374.235405282 seconds time elapsed

All feedback welcome.

Thanks,
Punit

Mark Rutland (2):
  arm64: tlbflush.h: add __tlbi() macro
  arm64/kvm: hyp: tlb: use __tlbi() helper

Punit Agrawal (5):
  perf/trace: Add notification for perf trace events
  KVM: Track the pid of the VM process
  KVM: arm/arm64: Register perf trace event notifier
  arm64: KVM: Handle trappable TLB instructions
  arm64: KVM: Enable selective trapping of TLB instructions

 arch/arm/include/asm/kvm_host.h   |   3 +
 arch/arm/kvm/arm.c                |   2 +
 arch/arm64/include/asm/kvm_asm.h  |   1 +
 arch/arm64/include/asm/kvm_host.h |   8 ++
 arch/arm64/include/asm/tlbflush.h |  31 ++++++--
 arch/arm64/kvm/Kconfig            |   4 +
 arch/arm64/kvm/Makefile           |   1 +
 arch/arm64/kvm/hyp/tlb.c          | 158 ++++++++++++++++++++++++++++++++++++--
 arch/arm64/kvm/perf_trace.c       | 154 +++++++++++++++++++++++++++++++++++++
 arch/arm64/kvm/sys_regs.c         |  81 +++++++++++++++++++
 arch/arm64/kvm/trace.h            |  16 ++++
 include/linux/kvm_host.h          |   1 +
 include/linux/trace_events.h      |   3 +
 kernel/trace/trace_event_perf.c   |  24 ++++++
 virt/kvm/kvm_main.c               |   2 +
 15 files changed, 475 insertions(+), 14 deletions(-)
 create mode 100644 arch/arm64/kvm/perf_trace.c

-- 
2.8.1

^ permalink raw reply	[flat|nested] 22+ messages in thread
* [RFC PATCH 1/7] perf/trace: Add notification for perf trace events
  2016-08-16 10:45 [RFC PATCH 0/7] Add support for monitoring guest TLB operations Punit Agrawal
@ 2016-08-16 10:45 ` Punit Agrawal
  2016-08-31 11:01   ` Punit Agrawal
  2016-08-16 10:45 ` [RFC PATCH 2/7] KVM: Track the pid of the VM process Punit Agrawal
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread

From: Punit Agrawal @ 2016-08-16 10:45 UTC (permalink / raw)
  To: linux-kernel, kvm, kvmarm, linux-arm-kernel
  Cc: Punit Agrawal, Christoffer Dall, Marc Zyngier, Steven Rostedt,
      Ingo Molnar, Will Deacon

Add a mechanism to notify listeners about perf trace event state
changes. This enables listeners to take actions requiring the event
context (e.g., attached process).

The notification mechanism can be used to reduce trace point based
profiling overhead by enabling/disabling hardware traps for specific
contexts (e.g., virtual machines).

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
---
 include/linux/trace_events.h    |  3 +++
 kernel/trace/trace_event_perf.c | 24 ++++++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index be00761..5924032 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -505,6 +505,9 @@ perf_trace_buf_submit(void *raw_data, int size, int rctx, u16 type,
 {
 	perf_tp_event(type, count, raw_data, size, regs, head, rctx, task);
 }
+
+extern int perf_trace_notifier_register(struct notifier_block *nb);
+extern int perf_trace_notifier_unregister(struct notifier_block *nb);
 #endif
 
 #endif /* _LINUX_TRACE_EVENT_H */
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 562fa69..9aaaacf 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -6,10 +6,12 @@
  */
 
 #include <linux/module.h>
+#include <linux/notifier.h>
 #include <linux/kprobes.h>
 #include "trace.h"
 
 static char __percpu *perf_trace_buf[PERF_NR_CONTEXTS];
+static RAW_NOTIFIER_HEAD(perf_trace_notifier_list);
 
 /*
  * Force it to be aligned to unsigned long to avoid misaligned accesses
@@ -86,6 +88,26 @@ static int perf_trace_event_perm(struct trace_event_call *tp_event,
 	return 0;
 }
 
+int perf_trace_notifier_register(struct notifier_block *nb)
+{
+	return raw_notifier_chain_register(&perf_trace_notifier_list, nb);
+}
+
+int perf_trace_notifier_unregister(struct notifier_block *nb)
+{
+	return raw_notifier_chain_unregister(&perf_trace_notifier_list, nb);
+}
+
+static void perf_trace_notify(enum trace_reg event, struct perf_event *p_event)
+{
+	/*
+	 * We use raw notifiers here as we are called with the
+	 * event_mutex held.
+	 */
+	raw_notifier_call_chain(&perf_trace_notifier_list,
+				event, p_event);
+}
+
 static int perf_trace_event_reg(struct trace_event_call *tp_event,
 				struct perf_event *p_event)
 {
@@ -176,6 +198,7 @@ out:
 static int perf_trace_event_open(struct perf_event *p_event)
 {
 	struct trace_event_call *tp_event = p_event->tp_event;
+	perf_trace_notify(TRACE_REG_PERF_OPEN, p_event);
 	return tp_event->class->reg(tp_event, TRACE_REG_PERF_OPEN, p_event);
 }
 
@@ -183,6 +206,7 @@ static void perf_trace_event_close(struct perf_event *p_event)
 {
 	struct trace_event_call *tp_event = p_event->tp_event;
 	tp_event->class->reg(tp_event, TRACE_REG_PERF_CLOSE, p_event);
+	perf_trace_notify(TRACE_REG_PERF_CLOSE, p_event);
 }
 
 static int perf_trace_event_init(struct trace_event_call *tp_event,

-- 
2.8.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread
* Re: [RFC PATCH 1/7] perf/trace: Add notification for perf trace events 2016-08-16 10:45 ` [RFC PATCH 1/7] perf/trace: Add notification for perf trace events Punit Agrawal @ 2016-08-31 11:01 ` Punit Agrawal 0 siblings, 0 replies; 22+ messages in thread From: Punit Agrawal @ 2016-08-31 11:01 UTC (permalink / raw) To: linux-kernel Cc: kvm, kvmarm, linux-arm-kernel, Christoffer Dall, Marc Zyngier, Steven Rostedt, Ingo Molnar, Will Deacon Punit Agrawal <punit.agrawal@arm.com> writes: > Add a mechanism to notify listeners about perf trace event state > changes. This enables listeners to take actions requiring the event > context (e.g., attached process). > > The notification mechanism can be used to reduce trace point based > profiling overhead by enabling/disabling hardware traps for specific > contexts (e.g., virtual machines). > > Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> > Cc: Steven Rostedt <rostedt@goodmis.org> > Cc: Ingo Molnar <mingo@redhat.com> While I respin the series addressing comments on the arm64 architectural bits, I'd appreciate any feedback on this patch as it forms the basis of the rest of the series. 
Thanks, Punit > --- > include/linux/trace_events.h | 3 +++ > kernel/trace/trace_event_perf.c | 24 ++++++++++++++++++++++++ > 2 files changed, 27 insertions(+) > > diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h > index be00761..5924032 100644 > --- a/include/linux/trace_events.h > +++ b/include/linux/trace_events.h > @@ -505,6 +505,9 @@ perf_trace_buf_submit(void *raw_data, int size, int rctx, u16 type, > { > perf_tp_event(type, count, raw_data, size, regs, head, rctx, task); > } > + > +extern int perf_trace_notifier_register(struct notifier_block *nb); > +extern int perf_trace_notifier_unregister(struct notifier_block *nb); > #endif > > #endif /* _LINUX_TRACE_EVENT_H */ > diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c > index 562fa69..9aaaacf 100644 > --- a/kernel/trace/trace_event_perf.c > +++ b/kernel/trace/trace_event_perf.c > @@ -6,10 +6,12 @@ > */ > > #include <linux/module.h> > +#include <linux/notifier.h> > #include <linux/kprobes.h> > #include "trace.h" > > static char __percpu *perf_trace_buf[PERF_NR_CONTEXTS]; > +static RAW_NOTIFIER_HEAD(perf_trace_notifier_list); > > /* > * Force it to be aligned to unsigned long to avoid misaligned accesses > @@ -86,6 +88,26 @@ static int perf_trace_event_perm(struct trace_event_call *tp_event, > return 0; > } > > +int perf_trace_notifier_register(struct notifier_block *nb) > +{ > + return raw_notifier_chain_register(&perf_trace_notifier_list, nb); > +} > + > +int perf_trace_notifier_unregister(struct notifier_block *nb) > +{ > + return raw_notifier_chain_unregister(&perf_trace_notifier_list, nb); > +} > + > +static void perf_trace_notify(enum trace_reg event, struct perf_event *p_event) > +{ > + /* > + * We use raw notifiers here as we are called with the > + * event_mutex held. 
> + */ > + raw_notifier_call_chain(&perf_trace_notifier_list, > + event, p_event); > +} > + > static int perf_trace_event_reg(struct trace_event_call *tp_event, > struct perf_event *p_event) > { > @@ -176,6 +198,7 @@ out: > static int perf_trace_event_open(struct perf_event *p_event) > { > struct trace_event_call *tp_event = p_event->tp_event; > + perf_trace_notify(TRACE_REG_PERF_OPEN, p_event); > return tp_event->class->reg(tp_event, TRACE_REG_PERF_OPEN, p_event); > } > > @@ -183,6 +206,7 @@ static void perf_trace_event_close(struct perf_event *p_event) > { > struct trace_event_call *tp_event = p_event->tp_event; > tp_event->class->reg(tp_event, TRACE_REG_PERF_CLOSE, p_event); > + perf_trace_notify(TRACE_REG_PERF_CLOSE, p_event); > } > > static int perf_trace_event_init(struct trace_event_call *tp_event, ^ permalink raw reply [flat|nested] 22+ messages in thread
* [RFC PATCH 2/7] KVM: Track the pid of the VM process
  2016-08-16 10:45 [RFC PATCH 0/7] Add support for monitoring guest TLB operations Punit Agrawal
  2016-08-16 10:45 ` [RFC PATCH 1/7] perf/trace: Add notification for perf trace events Punit Agrawal
@ 2016-08-16 10:45 ` Punit Agrawal
  2016-08-16 10:45 ` [RFC PATCH 3/7] KVM: arm/arm64: Register perf trace event notifier Punit Agrawal
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 22+ messages in thread

From: Punit Agrawal @ 2016-08-16 10:45 UTC (permalink / raw)
  To: linux-kernel, kvm, kvmarm, linux-arm-kernel
  Cc: Punit Agrawal, Christoffer Dall, Marc Zyngier, Steven Rostedt,
      Ingo Molnar, Will Deacon, Paolo Bonzini, Radim Krčmář

Userspace tools such as perf can be used to profile individual
processes. Track the PID of the virtual machine process to match
profiling requests targeted at it. This can be used to take
appropriate action to enable the requested profiling operations for
the VMs of interest.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 include/linux/kvm_host.h | 1 +
 virt/kvm/kvm_main.c      | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 9c28b4d..7c42c94 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -374,6 +374,7 @@ struct kvm_memslots {
 struct kvm {
 	spinlock_t mmu_lock;
 	struct mutex slots_lock;
+	struct pid *pid;
 	struct mm_struct *mm; /* userspace tied to this vm */
 	struct kvm_memslots *memslots[KVM_ADDRESS_SPACE_NUM];
 	struct srcu_struct srcu;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 1950782..ab2535a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -613,6 +613,7 @@ static struct kvm *kvm_create_vm(unsigned long type)
 	spin_lock_init(&kvm->mmu_lock);
 	atomic_inc(&current->mm->mm_count);
 	kvm->mm = current->mm;
+	kvm->pid = get_task_pid(current, PIDTYPE_PID);
 	kvm_eventfd_init(kvm);
 	mutex_init(&kvm->lock);
 	mutex_init(&kvm->irq_lock);
@@ -712,6 +713,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
 	int i;
 	struct mm_struct *mm = kvm->mm;
 
+	put_pid(kvm->pid);
 	kvm_destroy_vm_debugfs(kvm);
 	kvm_arch_sync_events(kvm);
 	spin_lock(&kvm_lock);

-- 
2.8.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread
* [RFC PATCH 3/7] KVM: arm/arm64: Register perf trace event notifier 2016-08-16 10:45 [RFC PATCH 0/7] Add support for monitoring guest TLB operations Punit Agrawal 2016-08-16 10:45 ` [RFC PATCH 1/7] perf/trace: Add notification for perf trace events Punit Agrawal 2016-08-16 10:45 ` [RFC PATCH 2/7] KVM: Track the pid of the VM process Punit Agrawal @ 2016-08-16 10:45 ` Punit Agrawal 2016-08-16 10:45 ` [RFC PATCH 4/7] arm64: tlbflush.h: add __tlbi() macro Punit Agrawal ` (4 subsequent siblings) 7 siblings, 0 replies; 22+ messages in thread From: Punit Agrawal @ 2016-08-16 10:45 UTC (permalink / raw) To: linux-kernel, kvm, kvmarm, linux-arm-kernel Cc: Punit Agrawal, Christoffer Dall, Marc Zyngier, Steven Rostedt, Ingo Molnar, Will Deacon Register a notifier to track state changes of perf trace events. The notifier will enable taking appropriate action for trace events targeting VM. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> --- arch/arm/include/asm/kvm_host.h | 3 + arch/arm/kvm/arm.c | 2 + arch/arm64/include/asm/kvm_host.h | 8 +++ arch/arm64/kvm/Kconfig | 4 ++ arch/arm64/kvm/Makefile | 1 + arch/arm64/kvm/perf_trace.c | 122 ++++++++++++++++++++++++++++++++++++++ 6 files changed, 140 insertions(+) create mode 100644 arch/arm64/kvm/perf_trace.c diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h index de338d9..609998e 100644 --- a/arch/arm/include/asm/kvm_host.h +++ b/arch/arm/include/asm/kvm_host.h @@ -280,6 +280,9 @@ static inline int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext) int kvm_perf_init(void); int kvm_perf_teardown(void); +static inline int kvm_perf_trace_init(void) { return 0; } +static inline int kvm_perf_trace_teardown(void) { return 0; } + void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot); struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr); diff --git a/arch/arm/kvm/arm.c 
b/arch/arm/kvm/arm.c index 75f130e..e1b99c4 100644 --- a/arch/arm/kvm/arm.c +++ b/arch/arm/kvm/arm.c @@ -1220,6 +1220,7 @@ static int init_subsystems(void) goto out; kvm_perf_init(); + kvm_perf_trace_init(); kvm_coproc_table_init(); out: @@ -1411,6 +1412,7 @@ out_err: void kvm_arch_exit(void) { kvm_perf_teardown(); + kvm_perf_trace_teardown(); } static int arm_init(void) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 3eda975..f6ff8e5 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -345,6 +345,14 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run, int kvm_perf_init(void); int kvm_perf_teardown(void); +#if !defined(CONFIG_KVM_PERF_TRACE) +static inline int kvm_perf_trace_init(void) { return 0; } +static inline int kvm_perf_trace_teardown(void) { return 0; } +#else +int kvm_perf_trace_init(void); +int kvm_perf_trace_teardown(void); +#endif + struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr); static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr, diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig index 9c9edc9..56e9537 100644 --- a/arch/arm64/kvm/Kconfig +++ b/arch/arm64/kvm/Kconfig @@ -19,6 +19,9 @@ if VIRTUALIZATION config KVM_ARM_VGIC_V3 bool +config KVM_PERF_TRACE + bool + config KVM bool "Kernel-based Virtual Machine (KVM) support" depends on OF @@ -39,6 +42,7 @@ config KVM select HAVE_KVM_MSI select HAVE_KVM_IRQCHIP select HAVE_KVM_IRQ_ROUTING + select KVM_PERF_TRACE if EVENT_TRACING && PERF_EVENTS ---help--- Support hosting virtualized guest machines. 
We don't support KVM with 16K page tables yet, due to the multiple diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile index 695eb3c..7d175e4 100644 --- a/arch/arm64/kvm/Makefile +++ b/arch/arm64/kvm/Makefile @@ -19,6 +19,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(ARM)/psci.o $(ARM)/perf.o kvm-$(CONFIG_KVM_ARM_HOST) += emulate.o inject_fault.o regmap.o kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o kvm-$(CONFIG_KVM_ARM_HOST) += guest.o debug.o reset.o sys_regs.o sys_regs_generic_v8.o +kvm-$(CONFIG_KVM_PERF_TRACE) += perf_trace.o kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic.o kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-init.o diff --git a/arch/arm64/kvm/perf_trace.c b/arch/arm64/kvm/perf_trace.c new file mode 100644 index 0000000..8bacd18 --- /dev/null +++ b/arch/arm64/kvm/perf_trace.c @@ -0,0 +1,122 @@ +/* + * Copyright (C) 2016 ARM Ltd. + * Author: Punit Agrawal <punit.agrawal@arm.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see <http://www.gnu.org/licenses/>. 
+ */ +#include <linux/kvm_host.h> +#include <linux/trace_events.h> + +typedef int (*perf_trace_callback_fn)(struct kvm *kvm, bool enable); + +struct kvm_trace_hook { + char *key; + perf_trace_callback_fn setup_fn; +}; + +static struct kvm_trace_hook trace_hook[] = { + { }, +}; + +static perf_trace_callback_fn find_trace_callback(const char *trace_key) +{ + int i; + + for (i = 0; trace_hook[i].key; i++) + if (!strcmp(trace_key, trace_hook[i].key)) + return trace_hook[i].setup_fn; + + return NULL; +} + +static int kvm_perf_trace_notifier(struct notifier_block *nb, + unsigned long event, void *data) +{ + struct perf_event *p_event = data; + struct trace_event_call *tp_event = p_event->tp_event; + perf_trace_callback_fn setup_trace_fn; + struct kvm *kvm = NULL; + struct pid *pid; + bool found = false; + + /* + * Is this a trace point? + */ + if (!(tp_event->flags & TRACE_EVENT_FL_TRACEPOINT)) + goto out; + + /* + * We'll get here for events we care to monitor for KVM. As we + * only care about events attached to a VM, check that there + * is a task associated with the perf event. + */ + if (p_event->attach_state != PERF_ATTACH_TASK) + goto out; + + /* + * This notifier gets called when perf trace event instance is + * added or removed. Until we can restrict this to events of + * interest in core, minimise the overhead below. + * + * Do we care about it? i.e., is there a callback for this + * trace point? + */ + setup_trace_fn = find_trace_callback(tp_event->tp->name); + if (!setup_trace_fn) + goto out; + + pid = get_task_pid(p_event->hw.target, PIDTYPE_PID); + + /* + * Does it match any of the VMs? 
+ */ + spin_lock(&kvm_lock); + list_for_each_entry(kvm, &vm_list, vm_list) { + if (kvm->pid == pid) { + found = true; + break; + } + } + spin_unlock(&kvm_lock); + + put_pid(pid); + if (!found) + goto out; + + switch (event) { + case TRACE_REG_PERF_OPEN: + setup_trace_fn(kvm, true); + break; + + case TRACE_REG_PERF_CLOSE: + setup_trace_fn(kvm, false); + break; + } + +out: + return 0; +} + +static struct notifier_block kvm_perf_trace_notifier_block = { + .notifier_call = kvm_perf_trace_notifier, +}; + +int kvm_perf_trace_init(void) +{ + return perf_trace_notifier_register(&kvm_perf_trace_notifier_block); +} + +int kvm_perf_trace_teardown(void) +{ + return perf_trace_notifier_unregister(&kvm_perf_trace_notifier_block); +} -- 2.8.1 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* [RFC PATCH 4/7] arm64: tlbflush.h: add __tlbi() macro
  2016-08-16 10:45 [RFC PATCH 0/7] Add support for monitoring guest TLB operations Punit Agrawal
                   ` (2 preceding siblings ...)
  2016-08-16 10:45 ` [RFC PATCH 3/7] KVM: arm/arm64: Register perf trace event notifier Punit Agrawal
@ 2016-08-16 10:45 ` Punit Agrawal
  2016-08-19 13:24   ` Will Deacon
  2016-08-16 10:45 ` [RFC PATCH 5/7] arm64/kvm: hyp: tlb: use __tlbi() helper Punit Agrawal
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread

From: Punit Agrawal @ 2016-08-16 10:45 UTC (permalink / raw)
  To: linux-kernel, kvm, kvmarm, linux-arm-kernel
  Cc: Mark Rutland, Christoffer Dall, Marc Zyngier, Steven Rostedt,
      Ingo Molnar, Will Deacon, Catalin Marinas, Punit Agrawal

From: Mark Rutland <mark.rutland@arm.com>

As with dsb() and isb(), add a __tbli() helper so that we can avoid
distracting asm boilerplate every time we want a TLBI. As some TLBI
operations take an argument while others do not, some pre-processor is
used to handle these two cases with different assembly blocks.

The existing tlbflush.h code is moved over to use the helper.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
[ rename helper to __tlbi, update commit log ]
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
---
 arch/arm64/include/asm/tlbflush.h | 31 +++++++++++++++++++++++--------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index b460ae2..d57a0be 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -25,6 +25,21 @@
 #include <asm/cputype.h>
 
 /*
+ * Raw TLBI operations. Drivers and most kernel code should use the TLB
+ * management routines below in preference to these. Where necessary, these can
+ * be used to avoid asm() boilerplate.
+ *
+ * Can be used as __tlbi(op) or __tlbi(op, arg), depending on whether a
+ * particular TLBI op takes an argument or not. The macros below handle invoking
+ * the asm with or without the register argument as appropriate.
+ */
+#define TLBI_0(op, arg)		asm ("tlbi " #op)
+#define TLBI_1(op, arg)		asm ("tlbi " #op ", %0" : : "r" (arg))
+#define TLBI_N(op, arg, n, ...)	TLBI_##n(op, arg)
+
+#define __tlbi(op, ...)		TLBI_N(op, ##__VA_ARGS__, 1, 0)
+
+/*
  * TLB Management
  * ==============
  *
@@ -66,7 +81,7 @@ static inline void local_flush_tlb_all(void)
 {
 	dsb(nshst);
-	asm("tlbi vmalle1");
+	__tlbi(vmalle1);
 	dsb(nsh);
 	isb();
 }
@@ -74,7 +89,7 @@ static inline void local_flush_tlb_all(void)
 static inline void flush_tlb_all(void)
 {
 	dsb(ishst);
-	asm("tlbi vmalle1is");
+	__tlbi(vmalle1is);
 	dsb(ish);
 	isb();
 }
@@ -84,7 +99,7 @@ static inline void flush_tlb_mm(struct mm_struct *mm)
 	unsigned long asid = ASID(mm) << 48;
 
 	dsb(ishst);
-	asm("tlbi aside1is, %0" : : "r" (asid));
+	__tlbi(aside1is, asid);
 	dsb(ish);
 }
 
@@ -94,7 +109,7 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
 	unsigned long addr = uaddr >> 12 | (ASID(vma->vm_mm) << 48);
 
 	dsb(ishst);
-	asm("tlbi vale1is, %0" : : "r" (addr));
+	__tlbi(vale1is, addr);
 	dsb(ish);
 }
 
@@ -122,9 +137,9 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
 	dsb(ishst);
 	for (addr = start; addr < end; addr += 1 << (PAGE_SHIFT - 12)) {
 		if (last_level)
-			asm("tlbi vale1is, %0" : : "r"(addr));
+			__tlbi(vale1is, addr);
 		else
-			asm("tlbi vae1is, %0" : : "r"(addr));
+			__tlbi(vae1is, addr);
 	}
 	dsb(ish);
 }
@@ -149,7 +164,7 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end
 
 	dsb(ishst);
 	for (addr = start; addr < end; addr += 1 << (PAGE_SHIFT - 12))
-		asm("tlbi vaae1is, %0" : : "r"(addr));
+		__tlbi(vaae1is, addr);
 	dsb(ish);
 	isb();
 }
@@ -163,7 +178,7 @@ static inline void __flush_tlb_pgtable(struct mm_struct *mm,
 {
 	unsigned long addr = uaddr >> 12 | (ASID(mm) << 48);
 
-	asm("tlbi vae1is, %0" : : "r" (addr));
+	__tlbi(vae1is, addr);
 	dsb(ish);
 }
 
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread
* Re: [RFC PATCH 4/7] arm64: tlbflush.h: add __tlbi() macro 2016-08-16 10:45 ` [RFC PATCH 4/7] arm64: tlbflush.h: add __tlbi() macro Punit Agrawal @ 2016-08-19 13:24 ` Will Deacon 2016-08-19 13:34 ` Punit Agrawal 0 siblings, 1 reply; 22+ messages in thread From: Will Deacon @ 2016-08-19 13:24 UTC (permalink / raw) To: Punit Agrawal Cc: linux-kernel, kvm, kvmarm, linux-arm-kernel, Mark Rutland, Christoffer Dall, Marc Zyngier, Steven Rostedt, Ingo Molnar, Catalin Marinas On Tue, Aug 16, 2016 at 11:45:09AM +0100, Punit Agrawal wrote: > From: Mark Rutland <mark.rutland@arm.com> > > As with dsb() and isb(), add a __tbli() helper so that we can avoid Minor typo: s/__tbli/__tlbi/ > distracting asm boilerplate every time we want a TLBI. As some TLBI > operations take an argument while others do not, some pre-processor is > used to handle these two cases with different assembly blocks. > > The existing tlbflush.h code is moved over to use the helper. > > Signed-off-by: Mark Rutland <mark.rutland@arm.com> > Cc: Catalin Marinas <catalin.marinas@arm.com> > Cc: Marc Zyngier <marc.zyngier@arm.com> > Cc: Will Deacon <will.deacon@arm.com> > [ rename helper to __tlbi, update commit log ] > Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> > --- > arch/arm64/include/asm/tlbflush.h | 31 +++++++++++++++++++++++-------- > 1 file changed, 23 insertions(+), 8 deletions(-) > > diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h > index b460ae2..d57a0be 100644 > --- a/arch/arm64/include/asm/tlbflush.h > +++ b/arch/arm64/include/asm/tlbflush.h > @@ -25,6 +25,21 @@ > #include <asm/cputype.h> > > /* > + * Raw TLBI operations. Drivers and most kernel code should use the TLB > + * management routines below in preference to these. Where necessary, these can > + * be used to avoid asm() boilerplate. > + * > + * Can be used as __tlbi(op) or __tlbi(op, arg), depending on whether a > + * particular TLBI op takes an argument or not. 
The macros below handle invoking > + * the asm with or without the register argument as appropriate. > + */ > +#define TLBI_0(op, arg) asm ("tlbi " #op) > +#define TLBI_1(op, arg) asm ("tlbi " #op ", %0" : : "r" (arg)) > +#define TLBI_N(op, arg, n, ...) TLBI_##n(op, arg) Should this be prefixed with underscores, too? Will ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC PATCH 4/7] arm64: tlbflush.h: add __tlbi() macro 2016-08-19 13:24 ` Will Deacon @ 2016-08-19 13:34 ` Punit Agrawal 0 siblings, 0 replies; 22+ messages in thread From: Punit Agrawal @ 2016-08-19 13:34 UTC (permalink / raw) To: Will Deacon Cc: kvm, Marc Zyngier, Catalin Marinas, linux-kernel, Steven Rostedt, Ingo Molnar, kvmarm, linux-arm-kernel Will Deacon <will.deacon@arm.com> writes: > On Tue, Aug 16, 2016 at 11:45:09AM +0100, Punit Agrawal wrote: >> From: Mark Rutland <mark.rutland@arm.com> >> >> As with dsb() and isb(), add a __tbli() helper so that we can avoid > > Minor typo: s/__tbli/__tlbi/ Thanks for spotting. I've fixed this locally now. > >> distracting asm boilerplate every time we want a TLBI. As some TLBI >> operations take an argument while others do not, some pre-processor is >> used to handle these two cases with different assembly blocks. >> >> The existing tlbflush.h code is moved over to use the helper. >> >> Signed-off-by: Mark Rutland <mark.rutland@arm.com> >> Cc: Catalin Marinas <catalin.marinas@arm.com> >> Cc: Marc Zyngier <marc.zyngier@arm.com> >> Cc: Will Deacon <will.deacon@arm.com> >> [ rename helper to __tlbi, update commit log ] >> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> >> --- >> arch/arm64/include/asm/tlbflush.h | 31 +++++++++++++++++++++++-------- >> 1 file changed, 23 insertions(+), 8 deletions(-) >> >> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h >> index b460ae2..d57a0be 100644 >> --- a/arch/arm64/include/asm/tlbflush.h >> +++ b/arch/arm64/include/asm/tlbflush.h >> @@ -25,6 +25,21 @@ >> #include <asm/cputype.h> >> >> /* >> + * Raw TLBI operations. Drivers and most kernel code should use the TLB >> + * management routines below in preference to these. Where necessary, these can >> + * be used to avoid asm() boilerplate. >> + * >> + * Can be used as __tlbi(op) or __tlbi(op, arg), depending on whether a >> + * particular TLBI op takes an argument or not. 
The macros below handle invoking >> + * the asm with or without the register argument as appropriate. >> + */ >> +#define TLBI_0(op, arg) asm ("tlbi " #op) >> +#define TLBI_1(op, arg) asm ("tlbi " #op ", %0" : : "r" (arg)) >> +#define TLBI_N(op, arg, n, ...) TLBI_##n(op, arg) > > Should this be prefixed with underscores, too? As these were only used in the definition of __tlbi() I didn't prefix them. I'll add them for the next posting. Thanks for taking a look. Punit > > Will > _______________________________________________ > kvmarm mailing list > kvmarm@lists.cs.columbia.edu > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 22+ messages in thread
* [RFC PATCH 5/7] arm64/kvm: hyp: tlb: use __tlbi() helper 2016-08-16 10:45 [RFC PATCH 0/7] Add support for monitoring guest TLB operations Punit Agrawal ` (3 preceding siblings ...) 2016-08-16 10:45 ` [RFC PATCH 4/7] arm64: tlbflush.h: add __tlbi() macro Punit Agrawal @ 2016-08-16 10:45 ` Punit Agrawal 2016-08-16 10:45 ` [RFC PATCH 6/7] arm64: KVM: Handle trappable TLB instructions Punit Agrawal ` (2 subsequent siblings) 7 siblings, 0 replies; 22+ messages in thread From: Punit Agrawal @ 2016-08-16 10:45 UTC (permalink / raw) To: linux-kernel, kvm, kvmarm, linux-arm-kernel Cc: Mark Rutland, Christoffer Dall, Marc Zyngier, Steven Rostedt, Ingo Molnar, Will Deacon, Punit Agrawal From: Mark Rutland <mark.rutland@arm.com> Now that we have a __tlbi() helper, make use of this in the arm64 KVM hyp code to get rid of asm() boilerplate. At the same time, we simplify __tlb_flush_vm_context by using __flush_icache_all(), as this has the appropriate instruction cache maintenance and barrier. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> [ rename tlbi -> __tlbi, convert additional sites, update commit log ] Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> --- arch/arm64/kvm/hyp/tlb.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/arm64/kvm/hyp/tlb.c b/arch/arm64/kvm/hyp/tlb.c index be8177c..4cda100 100644 --- a/arch/arm64/kvm/hyp/tlb.c +++ b/arch/arm64/kvm/hyp/tlb.c @@ -16,6 +16,7 @@ */ #include <asm/kvm_hyp.h> +#include <asm/tlbflush.h> static void __hyp_text __tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa) { @@ -32,7 +33,7 @@ static void __hyp_text __tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa) * whole of Stage-1. Weep... 
*/ ipa >>= 12; - asm volatile("tlbi ipas2e1is, %0" : : "r" (ipa)); + __tlbi(ipas2e1is, ipa); /* * We have to ensure completion of the invalidation at Stage-2, @@ -41,7 +42,7 @@ static void __hyp_text __tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa) * the Stage-1 invalidation happened first. */ dsb(ish); - asm volatile("tlbi vmalle1is" : : ); + __tlbi(vmalle1is); dsb(ish); isb(); @@ -60,7 +61,7 @@ static void __hyp_text __tlb_flush_vmid(struct kvm *kvm) write_sysreg(kvm->arch.vttbr, vttbr_el2); isb(); - asm volatile("tlbi vmalls12e1is" : : ); + __tlbi(vmalls12e1is); dsb(ish); isb(); @@ -72,9 +73,8 @@ __alias(__tlb_flush_vmid) void __kvm_tlb_flush_vmid(struct kvm *kvm); static void __hyp_text __tlb_flush_vm_context(void) { dsb(ishst); - asm volatile("tlbi alle1is \n" - "ic ialluis ": : ); - dsb(ish); + __tlbi(alle1is); + __flush_icache_all(); /* contains a dsb(ish) */ } __alias(__tlb_flush_vm_context) void __kvm_flush_vm_context(void); -- 2.8.1 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* [RFC PATCH 6/7] arm64: KVM: Handle trappable TLB instructions 2016-08-16 10:45 [RFC PATCH 0/7] Add support for monitoring guest TLB operations Punit Agrawal ` (4 preceding siblings ...) 2016-08-16 10:45 ` [RFC PATCH 5/7] arm64/kvm: hyp: tlb: use __tlbi() helper Punit Agrawal @ 2016-08-16 10:45 ` Punit Agrawal 2016-08-19 15:18 ` Will Deacon 2016-08-16 10:45 ` [RFC PATCH 7/7] arm64: KVM: Enable selective trapping of " Punit Agrawal 2016-08-17 15:58 ` [RFC PATCH 0/7] Add support for monitoring guest TLB operations Paolo Bonzini 7 siblings, 1 reply; 22+ messages in thread From: Punit Agrawal @ 2016-08-16 10:45 UTC (permalink / raw) To: linux-kernel, kvm, kvmarm, linux-arm-kernel Cc: Punit Agrawal, Christoffer Dall, Marc Zyngier, Steven Rostedt, Ingo Molnar, Will Deacon The ARMv8 architecture allows trapping of TLB maintenance instructions from EL0/EL1 to higher exception levels. On encountering a trappable TLB instruction in a guest, an exception is taken to EL2. Add functionality to handle emulating the TLB instructions.
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> --- arch/arm64/include/asm/kvm_asm.h | 1 + arch/arm64/kvm/hyp/tlb.c | 146 +++++++++++++++++++++++++++++++++++++++ arch/arm64/kvm/sys_regs.c | 81 ++++++++++++++++++++++ arch/arm64/kvm/trace.h | 16 +++++ 4 files changed, 244 insertions(+) diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index 7561f63..1ac1cc3 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -49,6 +49,7 @@ extern char __kvm_hyp_vector[]; extern void __kvm_flush_vm_context(void); extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa); extern void __kvm_tlb_flush_vmid(struct kvm *kvm); +extern void __kvm_emulate_tlb_invalidate(struct kvm *kvm, u32 sysreg, u64 regval); extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu); diff --git a/arch/arm64/kvm/hyp/tlb.c b/arch/arm64/kvm/hyp/tlb.c index 4cda100..e0a0309 100644 --- a/arch/arm64/kvm/hyp/tlb.c +++ b/arch/arm64/kvm/hyp/tlb.c @@ -78,3 +78,149 @@ static void __hyp_text __tlb_flush_vm_context(void) } __alias(__tlb_flush_vm_context) void __kvm_flush_vm_context(void); + +/* Intentionally empty functions */ +static void __hyp_text __switch_to_hyp_role_nvhe(void) { } +static void __hyp_text __switch_to_host_role_nvhe(void) { } + +static void __hyp_text __switch_to_hyp_role_vhe(void) +{ + u64 hcr = read_sysreg(hcr_el2); + + hcr &= ~HCR_TGE; + write_sysreg(hcr, hcr_el2); +} + +static void __hyp_text __switch_to_host_role_vhe(void) +{ + u64 hcr = read_sysreg(hcr_el2); + + hcr |= HCR_TGE; + write_sysreg(hcr, hcr_el2); +} + +static hyp_alternate_select(__switch_to_hyp_role, + __switch_to_hyp_role_nvhe, + __switch_to_hyp_role_vhe, + ARM64_HAS_VIRT_HOST_EXTN); + +static hyp_alternate_select(__switch_to_host_role, + __switch_to_host_role_nvhe, + __switch_to_host_role_vhe, + ARM64_HAS_VIRT_HOST_EXTN); + +static void __hyp_text 
__switch_to_guest_regime(struct kvm *kvm) +{ + write_sysreg(kvm->arch.vttbr, vttbr_el2); + __switch_to_hyp_role(); + isb(); +} + +static void __hyp_text __switch_to_host_regime(void) +{ + __switch_to_host_role(); + write_sysreg(0, vttbr_el2); +} + +/* + * AArch32 TLB maintenance instructions trapping to EL2 + */ +#define TLBIALLIS sys_reg(0, 0, 8, 3, 0) +#define TLBIMVAIS sys_reg(0, 0, 8, 3, 1) +#define TLBIASIDIS sys_reg(0, 0, 8, 3, 2) +#define TLBIMVAAIS sys_reg(0, 0, 8, 3, 3) +#define TLBIMVALIS sys_reg(0, 0, 8, 3, 5) +#define TLBIMVAALIS sys_reg(0, 0, 8, 3, 7) +#define ITLBIALL sys_reg(0, 0, 8, 5, 0) +#define ITLBIMVA sys_reg(0, 0, 8, 5, 1) +#define ITLBIASID sys_reg(0, 0, 8, 5, 2) +#define DTLBIALL sys_reg(0, 0, 8, 6, 0) +#define DTLBIMVA sys_reg(0, 0, 8, 6, 1) +#define DTLBIASID sys_reg(0, 0, 8, 6, 2) +#define TLBIALL sys_reg(0, 0, 8, 7, 0) +#define TLBIMVA sys_reg(0, 0, 8, 7, 1) +#define TLBIASID sys_reg(0, 0, 8, 7, 2) +#define TLBIMVAA sys_reg(0, 0, 8, 7, 3) +#define TLBIMVAL sys_reg(0, 0, 8, 7, 5) +#define TLBIMVAAL sys_reg(0, 0, 8, 7, 7) + +/* + * ARMv8 ARM: Table C5-4 TLB maintenance instructions + * (Ref: ARMv8 ARM C5.1 version: ARM DDI 0487A.j) + */ +#define TLBI_VMALLE1IS sys_reg(1, 0, 8, 3, 0) +#define TLBI_VAE1IS sys_reg(1, 0, 8, 3, 1) +#define TLBI_ASIDE1IS sys_reg(1, 0, 8, 3, 2) +#define TLBI_VAAE1IS sys_reg(1, 0, 8, 3, 3) +#define TLBI_VALE1IS sys_reg(1, 0, 8, 3, 5) +#define TLBI_VAALE1IS sys_reg(1, 0, 8, 3, 7) +#define TLBI_VMALLE1 sys_reg(1, 0, 8, 7, 0) +#define TLBI_VAE1 sys_reg(1, 0, 8, 7, 1) +#define TLBI_ASIDE1 sys_reg(1, 0, 8, 7, 2) +#define TLBI_VAAE1 sys_reg(1, 0, 8, 7, 3) +#define TLBI_VALE1 sys_reg(1, 0, 8, 7, 5) +#define TLBI_VAALE1 sys_reg(1, 0, 8, 7, 7) + +void __hyp_text +__kvm_emulate_tlb_invalidate(struct kvm *kvm, u32 sys_op, u64 regval) +{ + kvm = kern_hyp_va(kvm); + + /* + * Switch to the guest before performing any TLB operations to + * target the appropriate VMID + */ + __switch_to_guest_regime(kvm); + + /* + * TLB 
maintenance operations broadcast to inner-shareable + * domain when HCR_FB is set (default for KVM). + */ + switch (sys_op) { + case TLBIALL: + case TLBIALLIS: + case ITLBIALL: + case DTLBIALL: + case TLBI_VMALLE1: + case TLBI_VMALLE1IS: + __tlbi(vmalle1is); + break; + case TLBIMVA: + case TLBIMVAIS: + case ITLBIMVA: + case DTLBIMVA: + case TLBI_VAE1: + case TLBI_VAE1IS: + __tlbi(vae1is, regval); + break; + case TLBIASID: + case TLBIASIDIS: + case ITLBIASID: + case DTLBIASID: + case TLBI_ASIDE1: + case TLBI_ASIDE1IS: + __tlbi(aside1is, regval); + break; + case TLBIMVAA: + case TLBIMVAAIS: + case TLBI_VAAE1: + case TLBI_VAAE1IS: + __tlbi(vaae1is, regval); + break; + case TLBIMVAL: + case TLBIMVALIS: + case TLBI_VALE1: + case TLBI_VALE1IS: + __tlbi(vale1is, regval); + break; + case TLBIMVAAL: + case TLBIMVAALIS: + case TLBI_VAALE1: + case TLBI_VAALE1IS: + __tlbi(vaale1is, regval); + break; + } + isb(); + + __switch_to_host_regime(); +} diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index b0b225c..ca0b80f 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -790,6 +790,18 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, return true; } +static bool emulate_tlb_invalidate(struct kvm_vcpu *vcpu, struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + u32 opcode = sys_reg(p->Op0, p->Op1, p->CRn, p->CRm, p->Op2); + + kvm_call_hyp(__kvm_emulate_tlb_invalidate, + vcpu->kvm, opcode, p->regval); + trace_kvm_tlb_invalidate(*vcpu_pc(vcpu), opcode); + + return true; +} + /* Silly macro to expand the DBG{BCR,BVR,WVR,WCR}n_EL1 registers in one go */ #define DBG_BCR_BVR_WCR_WVR_EL1(n) \ /* DBGBVRn_EL1 */ \ @@ -849,6 +861,35 @@ static const struct sys_reg_desc sys_reg_descs[] = { { Op0(0b01), Op1(0b000), CRn(0b0111), CRm(0b1110), Op2(0b010), access_dcsw }, + /* + * ARMv8 ARM: Table C5-4 TLB maintenance instructions + * (Ref: ARMv8 ARM C5.1 version: ARM DDI 0487A.j) + */ + /* TLBI VMALLE1IS */ + { Op0(1), 
Op1(0), CRn(8), CRm(3), Op2(0), emulate_tlb_invalidate }, + /* TLBI VAE1IS */ + { Op0(1), Op1(0), CRn(8), CRm(3), Op2(1), emulate_tlb_invalidate }, + /* TLBI ASIDE1IS */ + { Op0(1), Op1(0), CRn(8), CRm(3), Op2(2), emulate_tlb_invalidate }, + /* TLBI VAAE1IS */ + { Op0(1), Op1(0), CRn(8), CRm(3), Op2(3), emulate_tlb_invalidate }, + /* TLBI VALE1IS */ + { Op0(1), Op1(0), CRn(8), CRm(3), Op2(5), emulate_tlb_invalidate }, + /* TLBI VAALE1IS */ + { Op0(1), Op1(0), CRn(8), CRm(3), Op2(7), emulate_tlb_invalidate }, + /* TLBI VMALLE1 */ + { Op0(1), Op1(0), CRn(8), CRm(7), Op2(0), emulate_tlb_invalidate }, + /* TLBI VAE1 */ + { Op0(1), Op1(0), CRn(8), CRm(7), Op2(1), emulate_tlb_invalidate }, + /* TLBI ASIDE1 */ + { Op0(1), Op1(0), CRn(8), CRm(7), Op2(2), emulate_tlb_invalidate }, + /* TLBI VAAE1 */ + { Op0(1), Op1(0), CRn(8), CRm(7), Op2(3), emulate_tlb_invalidate }, + /* TLBI VALE1 */ + { Op0(1), Op1(0), CRn(8), CRm(7), Op2(5), emulate_tlb_invalidate }, + /* TLBI VAALE1 */ + { Op0(1), Op1(0), CRn(8), CRm(7), Op2(7), emulate_tlb_invalidate }, + DBG_BCR_BVR_WCR_WVR_EL1(0), DBG_BCR_BVR_WCR_WVR_EL1(1), /* MDCCINT_EL1 */ @@ -1337,6 +1378,46 @@ static const struct sys_reg_desc cp15_regs[] = { { Op1( 0), CRn( 7), CRm(10), Op2( 2), access_dcsw }, { Op1( 0), CRn( 7), CRm(14), Op2( 2), access_dcsw }, + /* + * TLB operations + */ + /* TLBIALLIS */ + { Op1( 0), CRn( 8), CRm( 3), Op2( 0), emulate_tlb_invalidate}, + /* TLBIMVAIS */ + { Op1( 0), CRn( 8), CRm( 3), Op2( 1), emulate_tlb_invalidate}, + /* TLBIASIDIS */ + { Op1( 0), CRn( 8), CRm( 3), Op2( 2), emulate_tlb_invalidate}, + /* TLBIMVAAIS */ + { Op1( 0), CRn( 8), CRm( 3), Op2( 3), emulate_tlb_invalidate}, + /* TLBIMVALIS */ + { Op1( 0), CRn( 8), CRm( 3), Op2( 5), emulate_tlb_invalidate}, + /* TLBIMVAALIS */ + { Op1( 0), CRn( 8), CRm( 3), Op2( 7), emulate_tlb_invalidate}, + /* ITLBIALL */ + { Op1( 0), CRn( 8), CRm( 5), Op2( 0), emulate_tlb_invalidate}, + /* ITLBIMVA */ + { Op1( 0), CRn( 8), CRm( 5), Op2( 1), 
emulate_tlb_invalidate}, + /* ITLBIASID */ + { Op1( 0), CRn( 8), CRm( 5), Op2( 2), emulate_tlb_invalidate}, + /* DTLBIALL */ + { Op1( 0), CRn( 8), CRm( 6), Op2( 0), emulate_tlb_invalidate}, + /* DTLBIMVA */ + { Op1( 0), CRn( 8), CRm( 6), Op2( 1), emulate_tlb_invalidate}, + /* DTLBIASID */ + { Op1( 0), CRn( 8), CRm( 6), Op2( 2), emulate_tlb_invalidate}, + /* TLBIALL */ + { Op1( 0), CRn( 8), CRm( 7), Op2( 0), emulate_tlb_invalidate}, + /* TLBIMVA */ + { Op1( 0), CRn( 8), CRm( 7), Op2( 1), emulate_tlb_invalidate}, + /* TLBIASID */ + { Op1( 0), CRn( 8), CRm( 7), Op2( 2), emulate_tlb_invalidate}, + /* TLBIMVAA */ + { Op1( 0), CRn( 8), CRm( 7), Op2( 3), emulate_tlb_invalidate}, + /* TLBIMVAL */ + { Op1( 0), CRn( 8), CRm( 7), Op2( 5), emulate_tlb_invalidate}, + /* TLBIMVAAL */ + { Op1( 0), CRn( 8), CRm( 7), Op2( 7), emulate_tlb_invalidate}, + /* PMU */ { Op1( 0), CRn( 9), CRm(12), Op2( 0), access_pmcr }, { Op1( 0), CRn( 9), CRm(12), Op2( 1), access_pmcnten }, diff --git a/arch/arm64/kvm/trace.h b/arch/arm64/kvm/trace.h index 7fb0008..c4d577f 100644 --- a/arch/arm64/kvm/trace.h +++ b/arch/arm64/kvm/trace.h @@ -166,6 +166,22 @@ TRACE_EVENT(kvm_set_guest_debug, TP_printk("vcpu: %p, flags: 0x%08x", __entry->vcpu, __entry->guest_debug) ); +TRACE_EVENT(kvm_tlb_invalidate, + TP_PROTO(unsigned long vcpu_pc, u32 opcode), + TP_ARGS(vcpu_pc, opcode), + + TP_STRUCT__entry( + __field(unsigned long, vcpu_pc) + __field(u32, opcode) + ), + + TP_fast_assign( + __entry->vcpu_pc = vcpu_pc; + __entry->opcode = opcode; + ), + + TP_printk("vcpu_pc=0x%16lx opcode=%08x", __entry->vcpu_pc, __entry->opcode) +); #endif /* _TRACE_ARM64_KVM_H */ -- 2.8.1 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [RFC PATCH 6/7] arm64: KVM: Handle trappable TLB instructions 2016-08-16 10:45 ` [RFC PATCH 6/7] arm64: KVM: Handle trappable TLB instructions Punit Agrawal @ 2016-08-19 15:18 ` Will Deacon 2016-08-24 10:40 ` Punit Agrawal 0 siblings, 1 reply; 22+ messages in thread From: Will Deacon @ 2016-08-19 15:18 UTC (permalink / raw) To: Punit Agrawal Cc: linux-kernel, kvm, kvmarm, linux-arm-kernel, Christoffer Dall, Marc Zyngier, Steven Rostedt, Ingo Molnar Hi Punit, On Tue, Aug 16, 2016 at 11:45:11AM +0100, Punit Agrawal wrote: > The ARMv8 architecture allows trapping of TLB maintenane instructions > from EL0/EL1 to higher exception levels. On encountering a trappable TLB > instruction in a guest, an exception is taken to EL2. > > Add functionality to handle emulating the TLB instructions. > > Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> > Cc: Christoffer Dall <christoffer.dall@linaro.org> > Cc: Marc Zyngier <marc.zyngier@arm.com> [...] > +void __hyp_text > +__kvm_emulate_tlb_invalidate(struct kvm *kvm, u32 sys_op, u64 regval) > +{ > + kvm = kern_hyp_va(kvm); > + > + /* > + * Switch to the guest before performing any TLB operations to > + * target the appropriate VMID > + */ > + __switch_to_guest_regime(kvm); > + > + /* > + * TLB maintenance operations broadcast to inner-shareable > + * domain when HCR_FB is set (default for KVM). > + */ > + switch (sys_op) { > + case TLBIALL: > + case TLBIALLIS: > + case ITLBIALL: > + case DTLBIALL: > + case TLBI_VMALLE1: > + case TLBI_VMALLE1IS: > + __tlbi(vmalle1is); > + break; > + case TLBIMVA: > + case TLBIMVAIS: > + case ITLBIMVA: > + case DTLBIMVA: > + case TLBI_VAE1: > + case TLBI_VAE1IS: > + __tlbi(vae1is, regval); I'm pretty nervous about this. Although you've switched in the guest stage-2 page table before the TLB maintenance, we're still running on a host stage-1 and it's not clear to me that the stage-1 context is completely ignored for the purposes of a stage-1 TLBI executed at EL2. 
For example, if TCR_EL1.TBI0 is set in the guest but cleared in the host, my reading of the architecture is that it will be treated as zero when we perform this invalidation operation. I worry that we have similar problems with the granule size, where bits become RES0 in the TLBI VA ops. Finally, we should probably be masking out the RES0 bits in the TLBI ops, just in case some future extension to the architecture defines them in such a way where they have different meanings when executed at EL2 or EL1. The easiest thing to do is just TLBI VMALLE1IS for all trapped operations, but you might want to see how that performs. Will ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC PATCH 6/7] arm64: KVM: Handle trappable TLB instructions 2016-08-19 15:18 ` Will Deacon @ 2016-08-24 10:40 ` Punit Agrawal 2016-08-26 9:37 ` Punit Agrawal 0 siblings, 1 reply; 22+ messages in thread From: Punit Agrawal @ 2016-08-24 10:40 UTC (permalink / raw) To: Will Deacon Cc: kvm, Marc Zyngier, linux-kernel, Steven Rostedt, Ingo Molnar, kvmarm, linux-arm-kernel Will Deacon <will.deacon@arm.com> writes: > Hi Punit, > > On Tue, Aug 16, 2016 at 11:45:11AM +0100, Punit Agrawal wrote: >> The ARMv8 architecture allows trapping of TLB maintenane instructions >> from EL0/EL1 to higher exception levels. On encountering a trappable TLB >> instruction in a guest, an exception is taken to EL2. >> >> Add functionality to handle emulating the TLB instructions. >> >> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> >> Cc: Christoffer Dall <christoffer.dall@linaro.org> >> Cc: Marc Zyngier <marc.zyngier@arm.com> > > [...] > >> +void __hyp_text >> +__kvm_emulate_tlb_invalidate(struct kvm *kvm, u32 sys_op, u64 regval) >> +{ >> + kvm = kern_hyp_va(kvm); >> + >> + /* >> + * Switch to the guest before performing any TLB operations to >> + * target the appropriate VMID >> + */ >> + __switch_to_guest_regime(kvm); >> + >> + /* >> + * TLB maintenance operations broadcast to inner-shareable >> + * domain when HCR_FB is set (default for KVM). >> + */ >> + switch (sys_op) { >> + case TLBIALL: >> + case TLBIALLIS: >> + case ITLBIALL: >> + case DTLBIALL: >> + case TLBI_VMALLE1: >> + case TLBI_VMALLE1IS: >> + __tlbi(vmalle1is); >> + break; >> + case TLBIMVA: >> + case TLBIMVAIS: >> + case ITLBIMVA: >> + case DTLBIMVA: >> + case TLBI_VAE1: >> + case TLBI_VAE1IS: >> + __tlbi(vae1is, regval); > > I'm pretty nervous about this. Although you've switched in the guest stage-2 > page table before the TLB maintenance, we're still running on a host stage-1 > and it's not clear to me that the stage-1 context is completely ignored for > the purposes of a stage-1 TLBI executed at EL2. 
> > For example, if TCR_EL1.TBI0 is set in the guest but cleared in the host, > my reading of the architecture is that it will be treated as zero when > we perform this invalidation operation. I worry that we have similar > problems with the granule size, where bits become RES0 in the TLBI VA > ops. Some control bits seem to be explicitly called out to not affect TLB maintenance operations[0] but I hadn't considered the ones you highlight. [0] ARMv8 ARM DDI 0487A.j D4.7, Pg D4-1814 > > Finally, we should probably be masking out the RES0 bits in the TLBI > ops, just in case some future extension to the architecture defines them > in such a way where they have different meanings when executed at EL2 > or EL1. Although, the RES0 bits for TLBI VA ops are currently ignored, I agree that masking them out based on granule size protects against future incompatible changes. > > The easiest thing to do is just TLBI VMALLE1IS for all trapped operations, > but you might want to see how that performs. That sounds reasonable for correctness. But I suspect we'll have to do more to claw back some performance. Let me run a few tests and come back on this. Thanks for having a look. Punit > > Will > _______________________________________________ > kvmarm mailing list > kvmarm@lists.cs.columbia.edu > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC PATCH 6/7] arm64: KVM: Handle trappable TLB instructions 2016-08-24 10:40 ` Punit Agrawal @ 2016-08-26 9:37 ` Punit Agrawal 2016-08-26 12:21 ` Marc Zyngier 2016-09-01 14:55 ` Will Deacon 0 siblings, 2 replies; 22+ messages in thread From: Punit Agrawal @ 2016-08-26 9:37 UTC (permalink / raw) To: Will Deacon Cc: kvm, Marc Zyngier, linux-kernel, Steven Rostedt, Ingo Molnar, kvmarm, linux-arm-kernel Punit Agrawal <punit.agrawal@arm.com> writes: > Will Deacon <will.deacon@arm.com> writes: > >> Hi Punit, >> >> On Tue, Aug 16, 2016 at 11:45:11AM +0100, Punit Agrawal wrote: >>> The ARMv8 architecture allows trapping of TLB maintenane instructions >>> from EL0/EL1 to higher exception levels. On encountering a trappable TLB >>> instruction in a guest, an exception is taken to EL2. >>> >>> Add functionality to handle emulating the TLB instructions. >>> >>> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> >>> Cc: Christoffer Dall <christoffer.dall@linaro.org> >>> Cc: Marc Zyngier <marc.zyngier@arm.com> >> >> [...] >> >>> +void __hyp_text >>> +__kvm_emulate_tlb_invalidate(struct kvm *kvm, u32 sys_op, u64 regval) >>> +{ >>> + kvm = kern_hyp_va(kvm); >>> + >>> + /* >>> + * Switch to the guest before performing any TLB operations to >>> + * target the appropriate VMID >>> + */ >>> + __switch_to_guest_regime(kvm); >>> + >>> + /* >>> + * TLB maintenance operations broadcast to inner-shareable >>> + * domain when HCR_FB is set (default for KVM). >>> + */ >>> + switch (sys_op) { >>> + case TLBIALL: >>> + case TLBIALLIS: >>> + case ITLBIALL: >>> + case DTLBIALL: >>> + case TLBI_VMALLE1: >>> + case TLBI_VMALLE1IS: >>> + __tlbi(vmalle1is); >>> + break; >>> + case TLBIMVA: >>> + case TLBIMVAIS: >>> + case ITLBIMVA: >>> + case DTLBIMVA: >>> + case TLBI_VAE1: >>> + case TLBI_VAE1IS: >>> + __tlbi(vae1is, regval); >> >> I'm pretty nervous about this. 
Although you've switched in the guest stage-2 >> page table before the TLB maintenance, we're still running on a host stage-1 >> and it's not clear to me that the stage-1 context is completely ignored for >> the purposes of a stage-1 TLBI executed at EL2. >> >> For example, if TCR_EL1.TBI0 is set in the guest but cleared in the host, >> my reading of the architecture is that it will be treated as zero when >> we perform this invalidation operation. I worry that we have similar >> problems with the granule size, where bits become RES0 in the TLBI VA >> ops. > > Some control bits seem to be explicitly called out to not affect TLB > maintenance operations[0] but I hadn't considered the ones you highlight. > > [0] ARMv8 ARM DDI 0487A.j D4.7, Pg D4-1814 > >> >> Finally, we should probably be masking out the RES0 bits in the TLBI >> ops, just in case some future extension to the architecture defines them >> in such a way where they have different meanings when executed at EL2 >> or EL1. > > Although, the RES0 bits for TLBI VA ops are currently ignored, I agree > that masking them out based on granule size protects against future > incompatible changes. > >> >> The easiest thing to do is just TLBI VMALLE1IS for all trapped operations, >> but you might want to see how that performs. > > That sounds reasonable for correctness. But I suspect we'll have to do > more to claw back some performance. Let me run a few tests and come back > on this. Assuming I've correctly switched in TCR and replacing the various TLB operations in this patch with TLBI VMALLE1IS, there is a drop in kernel build times of ~5% (384s vs 363s). For the next version, I'll use this as a starting point and try clawing back the loss by using the appropriate TLB instructions albeit with additional sanity checking based on context. > > Thanks for having a look. 
> > Punit > >> >> Will ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC PATCH 6/7] arm64: KVM: Handle trappable TLB instructions 2016-08-26 9:37 ` Punit Agrawal @ 2016-08-26 12:21 ` Marc Zyngier 2016-09-01 14:55 ` Will Deacon 1 sibling, 0 replies; 22+ messages in thread From: Marc Zyngier @ 2016-08-26 12:21 UTC (permalink / raw) To: Punit Agrawal Cc: Will Deacon, kvm, linux-kernel, Steven Rostedt, Ingo Molnar, kvmarm, linux-arm-kernel On Fri, 26 Aug 2016 10:37:08 +0100 Punit Agrawal <punit.agrawal@arm.com> wrote: > Punit Agrawal <punit.agrawal@arm.com> writes: > > > Will Deacon <will.deacon@arm.com> writes: > > > >> Hi Punit, > >> > >> On Tue, Aug 16, 2016 at 11:45:11AM +0100, Punit Agrawal wrote: > >>> The ARMv8 architecture allows trapping of TLB maintenane instructions > >>> from EL0/EL1 to higher exception levels. On encountering a trappable TLB > >>> instruction in a guest, an exception is taken to EL2. > >>> > >>> Add functionality to handle emulating the TLB instructions. > >>> > >>> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> > >>> Cc: Christoffer Dall <christoffer.dall@linaro.org> > >>> Cc: Marc Zyngier <marc.zyngier@arm.com> > >> > >> [...] > >> > >>> +void __hyp_text > >>> +__kvm_emulate_tlb_invalidate(struct kvm *kvm, u32 sys_op, u64 regval) > >>> +{ > >>> + kvm = kern_hyp_va(kvm); > >>> + > >>> + /* > >>> + * Switch to the guest before performing any TLB operations to > >>> + * target the appropriate VMID > >>> + */ > >>> + __switch_to_guest_regime(kvm); > >>> + > >>> + /* > >>> + * TLB maintenance operations broadcast to inner-shareable > >>> + * domain when HCR_FB is set (default for KVM). 
> >>> + */ > >>> + switch (sys_op) { > >>> + case TLBIALL: > >>> + case TLBIALLIS: > >>> + case ITLBIALL: > >>> + case DTLBIALL: > >>> + case TLBI_VMALLE1: > >>> + case TLBI_VMALLE1IS: > >>> + __tlbi(vmalle1is); > >>> + break; > >>> + case TLBIMVA: > >>> + case TLBIMVAIS: > >>> + case ITLBIMVA: > >>> + case DTLBIMVA: > >>> + case TLBI_VAE1: > >>> + case TLBI_VAE1IS: > >>> + __tlbi(vae1is, regval); > >> > >> I'm pretty nervous about this. Although you've switched in the guest stage-2 > >> page table before the TLB maintenance, we're still running on a host stage-1 > >> and it's not clear to me that the stage-1 context is completely ignored for > >> the purposes of a stage-1 TLBI executed at EL2. > >> > >> For example, if TCR_EL1.TBI0 is set in the guest but cleared in the host, > >> my reading of the architecture is that it will be treated as zero when > >> we perform this invalidation operation. I worry that we have similar > >> problems with the granule size, where bits become RES0 in the TLBI VA > >> ops. > > > > Some control bits seem to be explicitly called out to not affect TLB > > maintenance operations[0] but I hadn't considered the ones you highlight. > > > > [0] ARMv8 ARM DDI 0487A.j D4.7, Pg D4-1814 > > > >> > >> Finally, we should probably be masking out the RES0 bits in the TLBI > >> ops, just in case some future extension to the architecture defines them > >> in such a way where they have different meanings when executed at EL2 > >> or EL1. > > > > Although, the RES0 bits for TLBI VA ops are currently ignored, I agree > > that masking them out based on granule size protects against future > > incompatible changes. > > > >> > >> The easiest thing to do is just TLBI VMALLE1IS for all trapped operations, > >> but you might want to see how that performs. > > > > That sounds reasonable for correctness. But I suspect we'll have to do > > more to claw back some performance. Let me run a few tests and come back > > on this. 
> > Assuming I've correctly switched in TCR and replacing the various TLB > operations in this patch with TLBI VMALLE1IS, there is a drop in kernel > build times of ~5% (384s vs 363s). Note that if all you're doing is a VMALLE1IS, switching TCR_EL1 should not be necessary, as all that is required for this invalidation is the VMID. > For the next version, I'll use this as a starting point and try clawing > back the loss by using the appropriate TLB instructions albeit with > additional sanity checking based on context. Great, thanks! M. -- Jazz is not dead. It just smells funny. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC PATCH 6/7] arm64: KVM: Handle trappable TLB instructions 2016-08-26 9:37 ` Punit Agrawal 2016-08-26 12:21 ` Marc Zyngier @ 2016-09-01 14:55 ` Will Deacon 2016-09-01 18:29 ` Punit Agrawal 1 sibling, 1 reply; 22+ messages in thread From: Will Deacon @ 2016-09-01 14:55 UTC (permalink / raw) To: Punit Agrawal Cc: kvm, Marc Zyngier, linux-kernel, Steven Rostedt, Ingo Molnar, kvmarm, linux-arm-kernel On Fri, Aug 26, 2016 at 10:37:08AM +0100, Punit Agrawal wrote: > > Will Deacon <will.deacon@arm.com> writes: > >> The easiest thing to do is just TLBI VMALLE1IS for all trapped operations, > >> but you might want to see how that performs. > > > > That sounds reasonable for correctness. But I suspect we'll have to do > > more to claw back some performance. Let me run a few tests and come back > > on this. > > Assuming I've correctly switched in TCR and replacing the various TLB > operations in this patch with TLBI VMALLE1IS, there is a drop in kernel > build times of ~5% (384s vs 363s). What do you mean by "switched in TCR"? Why is that necessary if you just nuke the whole thing? Is the ~5% relative to no trapping at all, or trapping, but being selective about the operation? Will ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC PATCH 6/7] arm64: KVM: Handle trappable TLB instructions 2016-09-01 14:55 ` Will Deacon @ 2016-09-01 18:29 ` Punit Agrawal 0 siblings, 0 replies; 22+ messages in thread From: Punit Agrawal @ 2016-09-01 18:29 UTC (permalink / raw) To: Will Deacon Cc: kvm, Marc Zyngier, linux-kernel, Steven Rostedt, Ingo Molnar, kvmarm, linux-arm-kernel Will Deacon <will.deacon@arm.com> writes: > On Fri, Aug 26, 2016 at 10:37:08AM +0100, Punit Agrawal wrote: >> > Will Deacon <will.deacon@arm.com> writes: >> >> The easiest thing to do is just TLBI VMALLE1IS for all trapped operations, >> >> but you might want to see how that performs. >> > >> > That sounds reasonable for correctness. But I suspect we'll have to do >> > more to claw back some performance. Let me run a few tests and come back >> > on this. >> >> Assuming I've correctly switched in TCR and replacing the various TLB >> operations in this patch with TLBI VMALLE1IS, there is a drop in kernel >> build times of ~5% (384s vs 363s). > > What do you mean by "switched in TCR"? Why is that necessary if you just > nuke the whole thing? You're right. it's not necessary. I'd misunderstood how TCR affects things and was switching it in the above tests. > Is the ~5% relative to no trapping at all, or > trapping, but being selective about the operation? The reported number was relative to trapping and being selective about the operation. But I hadn't been careful in ensuring identical conditions (page caches, etc.) when running the numbers. So I've done a fresh set of identical measurements by running "time make -j 7" in a VM booted with 7 vcpus and see the following results 1. no trapping ~ 365s 2. traps using selective tlb operations ~ 371s 3. traps that nuke all stage 1 (tlbi vmalle1is) ~ 393s So based on these measurements there is ~1% and ~7.5% drop in comparison between 2. and 3. compared to the base case of no trapping at all. 
Thanks, Punit > > Will ^ permalink raw reply [flat|nested] 22+ messages in thread
* [RFC PATCH 7/7] arm64: KVM: Enable selective trapping of TLB instructions 2016-08-16 10:45 [RFC PATCH 0/7] Add support for monitoring guest TLB operations Punit Agrawal ` (5 preceding siblings ...) 2016-08-16 10:45 ` [RFC PATCH 6/7] arm64: KVM: Handle trappable TLB instructions Punit Agrawal @ 2016-08-16 10:45 ` Punit Agrawal 2016-08-17 15:58 ` [RFC PATCH 0/7] Add support for monitoring guest TLB operations Paolo Bonzini 7 siblings, 0 replies; 22+ messages in thread From: Punit Agrawal @ 2016-08-16 10:45 UTC (permalink / raw) To: linux-kernel, kvm, kvmarm, linux-arm-kernel Cc: Punit Agrawal, Christoffer Dall, Marc Zyngier, Steven Rostedt, Ingo Molnar, Will Deacon The TTLB bit of Hypervisor Control Register (HCR_EL2) controls the trapping of guest TLB maintenance instructions. Taking the trap requires a switch to the hypervisor and is an expensive operation. Enable selective trapping of guest TLB instructions when the associated perf trace event is enabled for a specific virtual machine. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> --- arch/arm64/kvm/perf_trace.c | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/arch/arm64/kvm/perf_trace.c b/arch/arm64/kvm/perf_trace.c index 8bacd18..f26da1d 100644 --- a/arch/arm64/kvm/perf_trace.c +++ b/arch/arm64/kvm/perf_trace.c @@ -17,6 +17,8 @@ #include <linux/kvm_host.h> #include <linux/trace_events.h> +#include <asm/kvm_emulate.h> + typedef int (*perf_trace_callback_fn)(struct kvm *kvm, bool enable); struct kvm_trace_hook { @@ -24,7 +26,37 @@ struct kvm_trace_hook { perf_trace_callback_fn setup_fn; }; +static int tlb_invalidate_trap(struct kvm *kvm, bool enable) +{ + int i; + struct kvm_vcpu *vcpu; + + /* + * Halt the VM to ensure atomic update across all vcpus (this + * avoids racy behaviour against other modifications of + * HCR_EL2 such as kvm_toggle_cache/kvm_set_way_flush). 
+ */ + kvm_arm_halt_guest(kvm); + kvm_for_each_vcpu(i, vcpu, kvm) { + unsigned long hcr = vcpu_get_hcr(vcpu); + + if (enable) + hcr |= HCR_TTLB; + else + hcr &= ~HCR_TTLB; + + vcpu_set_hcr(vcpu, hcr); + } + kvm_arm_resume_guest(kvm); + + return 0; +} + static struct kvm_trace_hook trace_hook[] = { + { + .key = "kvm_tlb_invalidate", + .setup_fn = tlb_invalidate_trap, + }, { }, }; -- 2.8.1 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [RFC PATCH 0/7] Add support for monitoring guest TLB operations
  2016-08-17 15:58 ` Paolo Bonzini

From: Paolo Bonzini @ 2016-08-17 15:58 UTC (permalink / raw)
To: Punit Agrawal, linux-kernel, kvm, kvmarm, linux-arm-kernel
Cc: Christoffer Dall, Marc Zyngier, Steven Rostedt, Ingo Molnar, Will Deacon

On 16/08/2016 12:45, Punit Agrawal wrote:
> Hi,
>
> ARMv8 supports trapping guest TLB maintenance operations to the
> hypervisor. This trapping mechanism can be used to monitor the use of
> guest TLB instructions.
>
> As taking a trap for every TLB operation can have significant
> overhead, trapping should only be enabled -
>
> * on user request
> * for the VM of interest
>
> This patchset adds support to listen to perf trace event state change
> notifications. The notifications and associated context are then used
> to enable trapping of guest TLB operations when requested by the
> user. The trap handling generates trace events (kvm_tlb_invalidate)
> which can already be counted using existing perf trace
> functionality.
>
> Trapping of guest TLB operations is disabled when not being monitored
> (reducing profiling overhead).
>
> I would appreciate feedback on the approach to tie the control of TLB
> monitoring with perf trace events (Patch 1) especially if there are
> any suggestions on avoiding (or reducing) the overhead of "perf trace"
> notifications.
>
> I looked at using regfunc/unregfunc tracepoint hooks but they don't
> include the event context. But the bigger problem was that the
> callbacks are only called on the first instance of simultaneously
> executing perf stat invocations.
>
> The patchset is based on v4.8-rc2 and adds support for monitoring
> guest TLB operations on 64bit hosts. If the approach taken in the
> patches is acceptable, I'll add 32bit host support as well.
>
> With this patchset, 'perf' tool when attached to a VM process can be
> used to monitor the TLB operations. E.g., to monitor a VM with process
> id 4166 -
>
> # perf stat -e "kvm:kvm_tlb_invalidate" -p 4166
>
> Perform some operations in VM (running 'make -j 7' on the kernel
> sources in this instance). Breaking out of perf shows -
>
>  Performance counter stats for process id '4166':
>
>          7,471,974      kvm:kvm_tlb_invalidate
>
>      374.235405282 seconds time elapsed
>
> All feedback welcome.

Can you explain what this is used for? In other words, why would this
be used instead of just running perf in the guest?

Thanks,

Paolo
* Re: [RFC PATCH 0/7] Add support for monitoring guest TLB operations
  2016-08-17 17:01 ` Punit Agrawal

From: Punit Agrawal @ 2016-08-17 17:01 UTC (permalink / raw)
To: Paolo Bonzini
Cc: linux-kernel, kvm, kvmarm, linux-arm-kernel, Marc Zyngier, Ingo Molnar, Will Deacon, Steven Rostedt

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 16/08/2016 12:45, Punit Agrawal wrote:
>> Hi,
>>
>> [...]
>>
>> All feedback welcome.
>
> Can you explain what this is used for? In other words, why would this
> be used instead of just running perf in the guest?

As TLB maintenance operations are synchronised in hardware, they can
impact performance beyond the guest. The operations generate traffic on
the interconnect and, depending on the implementation, they can also
affect the remote TLBs' translation bandwidth.

These patches are useful on systems where the host and guest are
controlled by different users - the guest could be running arbitrary
software.

Having the ability to monitor the usage of guest TLB invalidations in
the host can be useful to diagnose performance issues on such systems.

> Thanks,
>
> Paolo
* Re: [RFC PATCH 0/7] Add support for monitoring guest TLB operations
  2016-08-17 17:02 ` Paolo Bonzini

From: Paolo Bonzini @ 2016-08-17 17:02 UTC (permalink / raw)
To: Punit Agrawal
Cc: linux-kernel, kvm, kvmarm, linux-arm-kernel, Marc Zyngier, Ingo Molnar, Will Deacon, Steven Rostedt

On 17/08/2016 19:01, Punit Agrawal wrote:
>> Can you explain what this is used for? In other words, why would this
>> be used instead of just running perf in the guest?
>
> As TLB maintenance operations are synchronised in hardware, they can
> impact performance beyond the guest. The operations generate traffic on
> the interconnect and depending on the implementation, they can also
> affect the remote TLB's translation bandwidth.
>
> These patches are useful on systems where the host and guest are
> controlled by different users - the guest could be running arbitrary
> software.
>
> Having the ability to monitor the usage of guest TLB invalidations in
> the host can be useful to diagnose performance issues on such systems.

Are there hardware performance counters for these operations?

Paolo
* Re: [RFC PATCH 0/7] Add support for monitoring guest TLB operations
  2016-08-17 17:20 ` Punit Agrawal

From: Punit Agrawal @ 2016-08-17 17:20 UTC (permalink / raw)
To: Paolo Bonzini
Cc: kvm, Marc Zyngier, Will Deacon, linux-kernel, Steven Rostedt, Ingo Molnar, kvmarm, linux-arm-kernel

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 17/08/2016 19:01, Punit Agrawal wrote:
>>> Can you explain what this is used for? In other words, why would this
>>> be used instead of just running perf in the guest?
>>
>> As TLB maintenance operations are synchronised in hardware, they can
>> impact performance beyond the guest. The operations generate traffic on
>> the interconnect and depending on the implementation, they can also
>> affect the remote TLB's translation bandwidth.
>>
>> These patches are useful on systems where the host and guest are
>> controlled by different users - the guest could be running arbitrary
>> software.
>>
>> Having the ability to monitor the usage of guest TLB invalidations in
>> the host can be useful to diagnose performance issues on such systems.
>
> Are there hardware performance counters for these operations?

That would have been ideal! There are PMU events defined for TLB
accesses and refills but unhelpfully none of them track maintenance
operations.
* Re: [RFC PATCH 0/7] Add support for monitoring guest TLB operations
  2016-08-18  7:04 ` Paolo Bonzini

From: Paolo Bonzini @ 2016-08-18 7:04 UTC (permalink / raw)
To: Punit Agrawal
Cc: kvm, Marc Zyngier, Will Deacon, linux-kernel, Steven Rostedt, Ingo Molnar, kvmarm, linux-arm-kernel

On 17/08/2016 19:20, Punit Agrawal wrote:
>>> Having the ability to monitor the usage of guest TLB invalidations in
>>> the host can be useful to diagnose performance issues on such systems.
>>
>> Are there hardware performance counters for these operations?
>
> That would have been ideal! There are PMU events defined for TLB
> accesses and refills but unhelpfully none of them track maintenance
> operations.

I guess the patches do make sense then. :)

Paolo
end of thread, other threads:[~2016-09-01 21:49 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-16 10:45 [RFC PATCH 0/7] Add support for monitoring guest TLB operations Punit Agrawal
2016-08-16 10:45 ` [RFC PATCH 1/7] perf/trace: Add notification for perf trace events Punit Agrawal
2016-08-31 11:01   ` Punit Agrawal
2016-08-16 10:45 ` [RFC PATCH 2/7] KVM: Track the pid of the VM process Punit Agrawal
2016-08-16 10:45 ` [RFC PATCH 3/7] KVM: arm/arm64: Register perf trace event notifier Punit Agrawal
2016-08-16 10:45 ` [RFC PATCH 4/7] arm64: tlbflush.h: add __tlbi() macro Punit Agrawal
2016-08-19 13:24   ` Will Deacon
2016-08-19 13:34     ` Punit Agrawal
2016-08-16 10:45 ` [RFC PATCH 5/7] arm64/kvm: hyp: tlb: use __tlbi() helper Punit Agrawal
2016-08-16 10:45 ` [RFC PATCH 6/7] arm64: KVM: Handle trappable TLB instructions Punit Agrawal
2016-08-19 15:18   ` Will Deacon
2016-08-24 10:40     ` Punit Agrawal
2016-08-26  9:37       ` Punit Agrawal
2016-08-26 12:21         ` Marc Zyngier
2016-09-01 14:55       ` Will Deacon
2016-09-01 18:29         ` Punit Agrawal
2016-08-16 10:45 ` [RFC PATCH 7/7] arm64: KVM: Enable selective trapping of " Punit Agrawal
2016-08-17 15:58 ` [RFC PATCH 0/7] Add support for monitoring guest TLB operations Paolo Bonzini
2016-08-17 17:01   ` Punit Agrawal
2016-08-17 17:02     ` Paolo Bonzini
2016-08-17 17:20       ` Punit Agrawal
2016-08-18  7:04         ` Paolo Bonzini