linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/9] Add support for monitoring guest TLB operations
@ 2017-01-10 11:38 Punit Agrawal
  2017-01-10 11:38 ` [PATCH v3 1/9] arm64/kvm: hyp: tlb: use __tlbi() helper Punit Agrawal
                   ` (8 more replies)
  0 siblings, 9 replies; 23+ messages in thread
From: Punit Agrawal @ 2017-01-10 11:38 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Punit Agrawal, linux-kernel, kvm, Christoffer Dall, Marc Zyngier,
	Steven Rostedt, Peter Zijlstra, Will Deacon

Hi,

This is a new version of the patchset to monitor guest TLB
operations. The user interface has been re-written to incorporate
feedback from LPC'16 on the previous version - it now uses a software
PMU instead of relying on perf trace to track guest TLB operations
(Patches 6 and 7). Previous versions of the patches can be found at
[0][1][2][3].

Guest TLB operations can have an impact on system performance, but they
are not exported as architected PMU events on arm/arm64. The
architecture does, however, allow TLB operations to be trapped to the
hypervisor. This patchset builds on that feature to monitor guest TLB
operations.

To minimise the performance impact, trapping is enabled -
* on user request
* for the VM of interest

With this patchset, running 'perf' on the host can be used to monitor a
guest's TLB operations. E.g., to monitor a VM with process id 2589 -

# perf stat -a -C 0 -e kvm/tlb_invalidate,vm=2589/ sleep 25

 Performance counter stats for 'system wide':

             3,386      kvm/tlb_invalidate,vm=2589/

      25.001086522 seconds time elapsed
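
For reference, the same counter can also be opened programmatically. A
rough sketch using the raw perf_event_open() syscall follows; it assumes
the dynamic PMU type has already been read from
/sys/bus/event_source/devices/kvm/type and hard-codes the config layout
used by the host PMU (VM pid in config bits 0-31, event id 0 for
tlb_invalidate in bits 32-39). Names and values are illustrative only -

/* Illustrative only: open kvm/tlb_invalidate,vm=<pid>/ from C. */
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

static int open_kvm_tlb_event(uint32_t pmu_type, pid_t vm_pid)
{
	struct perf_event_attr attr;
	uint64_t event_id = 0;	/* tlb_invalidate */

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = pmu_type;	/* from /sys/bus/event_source/devices/kvm/type */
	/* event id in config:32-39, VM pid in config:0-31 */
	attr.config = (event_id << 32) | (uint32_t)vm_pid;

	/* system-wide on CPU 0, matching the perf stat invocation above */
	return syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
}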


The patches are based on v4.10-rc3 and have been tested on arm and
arm64.

Thanks,
Punit

[0] http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1210715.html
[1] http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1224353.html
[2] https://marc.info/?l=linux-kernel&m=147376184208258&w=2
[3] https://marc.info/?l=kvm&m=147750373716545&w=2

Changes:
v2 -> v3
* Replaced perf trace monitoring with software PMU
* Re-ordered patches as a result of the above re-write

v1 -> v2

* New (Patch 6) - Add support for trapping and emulating TLB
  operations to ARM hosts
* Move common code to handle perf trace notifications to virt/kvm/arm
* Move tracepoint to include/trace/events/kvm.h
* Drop patch to introduce __tlbi helper as it is now merged
* Reorder patches

RFC v2 -> v1
* Dropped the RFC tag
* Patch 2 - Use VM thread group id for identification
* Patch 4 - Update comment for clarity
* Patch 6 - Add comment explaining switch to hyp role when VHE is enabled
* Patch 7 - Add comment to clarify struct kvm_trace_hook

RFC -> RFC v2
* Patch 4 - Rename left-over TLBI macro to __TLBI
* Patch 6 - Replace individual TLB operation emulation with
  invalidating all stage 1 TLB for the VM. TLB monitoring is expected
  to be a debug feature and performance is not critical.



Mark Rutland (1):
  arm64/kvm: hyp: tlb: use __tlbi() helper

Punit Agrawal (8):
  KVM: Track the pid of the VM process
  KVM: Add event to trace tlb invalidations
  arm: KVM: Handle trappable TLB instructions
  arm64: KVM: Handle trappable TLB instructions
  kvm: arm/arm64: Add host pmu to support VM introspection
  kvm: host_pmu: Add support for tracking guest TLB operations
  arm: KVM: Enable support for host pmu
  arm64: KVM: Enable support for the host pmu

 arch/arm/include/asm/kvm_asm.h    |   2 +
 arch/arm/include/asm/kvm_host.h   |   9 ++
 arch/arm/kvm/Kconfig              |   4 +
 arch/arm/kvm/Makefile             |   1 +
 arch/arm/kvm/arm.c                |   2 +
 arch/arm/kvm/coproc.c             |  56 +++++++
 arch/arm/kvm/hyp/tlb.c            |  33 ++++
 arch/arm64/include/asm/kvm_asm.h  |   2 +
 arch/arm64/include/asm/kvm_host.h |   9 ++
 arch/arm64/kvm/Kconfig            |   4 +
 arch/arm64/kvm/Makefile           |   1 +
 arch/arm64/kvm/hyp/tlb.c          |  87 +++++++++-
 arch/arm64/kvm/sys_regs.c         |  83 ++++++++++
 include/linux/kvm_host.h          |   1 +
 include/trace/events/kvm.h        |  18 +++
 virt/kvm/arm/host_pmu.c           | 322 ++++++++++++++++++++++++++++++++++++++
 virt/kvm/kvm_main.c               |   2 +
 17 files changed, 630 insertions(+), 6 deletions(-)
 create mode 100644 virt/kvm/arm/host_pmu.c

-- 
2.11.0

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v3 1/9] arm64/kvm: hyp: tlb: use __tlbi() helper
  2017-01-10 11:38 [PATCH v3 0/9] Add support for monitoring guest TLB operations Punit Agrawal
@ 2017-01-10 11:38 ` Punit Agrawal
  2017-01-10 11:38 ` [PATCH v3 2/9] KVM: Track the pid of the VM process Punit Agrawal
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Punit Agrawal @ 2017-01-10 11:38 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Mark Rutland, linux-kernel, kvm, Christoffer Dall, Marc Zyngier,
	Will Deacon, Punit Agrawal

From: Mark Rutland <mark.rutland@arm.com>

Now that we have a __tlbi() helper, make use of this in the arm64 KVM hyp
code to get rid of asm() boilerplate. At the same time, we simplify
__kvm_flush_vm_context by using __flush_icache_all(), as this has the
appropriate instruction cache maintenance and barrier.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
[ rename tlbi -> __tlbi, convert additional sites, update commit log ]
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 arch/arm64/kvm/hyp/tlb.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kvm/hyp/tlb.c b/arch/arm64/kvm/hyp/tlb.c
index 88e2f2b938f0..b2cfbedea582 100644
--- a/arch/arm64/kvm/hyp/tlb.c
+++ b/arch/arm64/kvm/hyp/tlb.c
@@ -16,6 +16,7 @@
  */
 
 #include <asm/kvm_hyp.h>
+#include <asm/tlbflush.h>
 
 void __hyp_text __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 {
@@ -32,7 +33,7 @@ void __hyp_text __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 	 * whole of Stage-1. Weep...
 	 */
 	ipa >>= 12;
-	asm volatile("tlbi ipas2e1is, %0" : : "r" (ipa));
+	__tlbi(ipas2e1is, ipa);
 
 	/*
 	 * We have to ensure completion of the invalidation at Stage-2,
@@ -41,7 +42,7 @@ void __hyp_text __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 	 * the Stage-1 invalidation happened first.
 	 */
 	dsb(ish);
-	asm volatile("tlbi vmalle1is" : : );
+	__tlbi(vmalle1is);
 	dsb(ish);
 	isb();
 
@@ -57,7 +58,7 @@ void __hyp_text __kvm_tlb_flush_vmid(struct kvm *kvm)
 	write_sysreg(kvm->arch.vttbr, vttbr_el2);
 	isb();
 
-	asm volatile("tlbi vmalls12e1is" : : );
+	__tlbi(vmalls12e1is);
 	dsb(ish);
 	isb();
 
@@ -82,7 +83,6 @@ void __hyp_text __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu)
 void __hyp_text __kvm_flush_vm_context(void)
 {
 	dsb(ishst);
-	asm volatile("tlbi alle1is	\n"
-		     "ic ialluis	  ": : );
-	dsb(ish);
+	__tlbi(alle1is);
+	__flush_icache_all(); /* contains a dsb(ish) */
 }
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 2/9] KVM: Track the pid of the VM process
  2017-01-10 11:38 [PATCH v3 0/9] Add support for monitoring guest TLB operations Punit Agrawal
  2017-01-10 11:38 ` [PATCH v3 1/9] arm64/kvm: hyp: tlb: use __tlbi() helper Punit Agrawal
@ 2017-01-10 11:38 ` Punit Agrawal
  2017-01-10 11:38 ` [PATCH v3 3/9] KVM: Add event to trace tlb invalidations Punit Agrawal
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Punit Agrawal @ 2017-01-10 11:38 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Punit Agrawal, linux-kernel, kvm, Christoffer Dall, Marc Zyngier,
	Paolo Bonzini, Radim Krčmář

Userspace tools such as perf can be used to profile individual
processes.

Track the PID of the virtual machine process to match profiling requests
targeted at it. This can be used to take appropriate action to enable
the requested profiling operations for the VMs of interest.
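
For illustration, the consumer added later in the series (the host PMU
in patch 6) resolves a user-supplied pid to its VM roughly as in the
sketch below; find_vm_by_pid() is only a name for this sketch and is not
part of the patch:

/* Sketch: map a task pid to the struct kvm whose creator it identifies. */
static struct kvm *find_vm_by_pid(pid_t task_pid)
{
	struct kvm *kvm, *found = NULL;
	struct pid *pid = find_get_pid(task_pid);

	spin_lock(&kvm_lock);
	list_for_each_entry(kvm, &vm_list, vm_list) {
		if (kvm->pid == pid) {
			kvm_get_kvm(kvm);	/* hold a reference while in use */
			found = kvm;
			break;
		}
	}
	spin_unlock(&kvm_lock);
	put_pid(pid);

	return found;
}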

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 include/linux/kvm_host.h | 1 +
 virt/kvm/kvm_main.c      | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 1c5190dab2c1..c666926529ca 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -374,6 +374,7 @@ struct kvm_memslots {
 struct kvm {
 	spinlock_t mmu_lock;
 	struct mutex slots_lock;
+	struct pid *pid;
 	struct mm_struct *mm; /* userspace tied to this vm */
 	struct kvm_memslots *memslots[KVM_ADDRESS_SPACE_NUM];
 	struct srcu_struct srcu;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 482612b4e496..59d8a84da227 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -618,6 +618,7 @@ static struct kvm *kvm_create_vm(unsigned long type)
 	spin_lock_init(&kvm->mmu_lock);
 	atomic_inc(&current->mm->mm_count);
 	kvm->mm = current->mm;
+	kvm->pid = get_task_pid(current->group_leader, PIDTYPE_PID);
 	kvm_eventfd_init(kvm);
 	mutex_init(&kvm->lock);
 	mutex_init(&kvm->irq_lock);
@@ -717,6 +718,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
 	int i;
 	struct mm_struct *mm = kvm->mm;
 
+	put_pid(kvm->pid);
 	kvm_destroy_vm_debugfs(kvm);
 	kvm_arch_sync_events(kvm);
 	spin_lock(&kvm_lock);
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 3/9] KVM: Add event to trace tlb invalidations
  2017-01-10 11:38 [PATCH v3 0/9] Add support for monitoring guest TLB operations Punit Agrawal
  2017-01-10 11:38 ` [PATCH v3 1/9] arm64/kvm: hyp: tlb: use __tlbi() helper Punit Agrawal
  2017-01-10 11:38 ` [PATCH v3 2/9] KVM: Track the pid of the VM process Punit Agrawal
@ 2017-01-10 11:38 ` Punit Agrawal
  2017-01-10 11:38 ` [PATCH v3 4/9] arm: KVM: Handle trappable TLB instructions Punit Agrawal
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Punit Agrawal @ 2017-01-10 11:38 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Punit Agrawal, linux-kernel, kvm, Christoffer Dall, Marc Zyngier,
	Steven Rostedt, Paolo Bonzini

As TLB operations can have an impact on system performance, add a trace
event to enable monitoring of guest TLB maintenance operations.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
---
 include/trace/events/kvm.h | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/include/trace/events/kvm.h b/include/trace/events/kvm.h
index 8ade3eb6c640..fbe33089264c 100644
--- a/include/trace/events/kvm.h
+++ b/include/trace/events/kvm.h
@@ -393,6 +393,24 @@ TRACE_EVENT(kvm_halt_poll_ns,
 #define trace_kvm_halt_poll_ns_shrink(vcpu_id, new, old) \
 	trace_kvm_halt_poll_ns(false, vcpu_id, new, old)
 
+TRACE_EVENT(kvm_tlb_invalidate,
+	TP_PROTO(unsigned long vcpu_pc, u32 opcode),
+	TP_ARGS(vcpu_pc, opcode),
+
+	TP_STRUCT__entry(
+		__field(unsigned long, vcpu_pc)
+		__field(u32, opcode)
+	),
+
+	TP_fast_assign(
+		__entry->vcpu_pc = vcpu_pc;
+		__entry->opcode = opcode;
+	),
+
+	TP_printk("vcpu_pc=0x%016lx opcode=%08x", __entry->vcpu_pc,
+		  __entry->opcode)
+);
+
 #endif /* _TRACE_KVM_MAIN_H */
 
 /* This part must be outside protection */
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 4/9] arm: KVM: Handle trappable TLB instructions
  2017-01-10 11:38 [PATCH v3 0/9] Add support for monitoring guest TLB operations Punit Agrawal
                   ` (2 preceding siblings ...)
  2017-01-10 11:38 ` [PATCH v3 3/9] KVM: Add event to trace tlb invalidations Punit Agrawal
@ 2017-01-10 11:38 ` Punit Agrawal
  2017-01-10 11:38 ` [PATCH v3 5/9] arm64: " Punit Agrawal
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Punit Agrawal @ 2017-01-10 11:38 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Punit Agrawal, linux-kernel, kvm, Christoffer Dall, Marc Zyngier,
	Peter Zijlstra, Will Deacon

Guest TLB maintenance instructions executed at lower privilege levels
can be selectively trapped to HYP mode. This feature can be used to
monitor guest TLB operations.

Add support to emulate the TLB instructions when their execution traps
to hyp mode. Also keep track of the number of emulated operations.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/include/asm/kvm_asm.h  |  2 ++
 arch/arm/include/asm/kvm_host.h |  1 +
 arch/arm/kvm/coproc.c           | 56 +++++++++++++++++++++++++++++++++++++++++
 arch/arm/kvm/hyp/tlb.c          | 33 ++++++++++++++++++++++++
 4 files changed, 92 insertions(+)

diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
index 8ef05381984b..782034a5a3c3 100644
--- a/arch/arm/include/asm/kvm_asm.h
+++ b/arch/arm/include/asm/kvm_asm.h
@@ -67,6 +67,8 @@ extern void __kvm_flush_vm_context(void);
 extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
 extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
 extern void __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu);
+extern void __kvm_emulate_tlb_invalidate(struct kvm *kvm, u32 opcode,
+					 u64 regval);
 
 extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
 
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index d5423ab15ed5..26f0c8a0b790 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -205,6 +205,7 @@ struct kvm_vcpu_stat {
 	u64 mmio_exit_user;
 	u64 mmio_exit_kernel;
 	u64 exits;
+	u64 tlb_invalidate;
 };
 
 #define vcpu_cp15(v,r)	(v)->arch.ctxt.cp15[r]
diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index 3e5e4194ef86..b978b0bf211e 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -205,6 +205,24 @@ static bool access_dcsw(struct kvm_vcpu *vcpu,
 	return true;
 }
 
+static bool emulate_tlb_invalidate(struct kvm_vcpu *vcpu,
+				   const struct coproc_params *p,
+				   const struct coproc_reg *r)
+{
+	/*
+	 * Based on system register encoding from ARM v8 ARM
+	 * (DDI 0487A.k F5.1.103)
+	 */
+	u32 opcode = p->Op1 << 21 | p->CRn << 16 | p->Op2 << 5 | p->CRm << 0;
+
+	kvm_call_hyp(__kvm_emulate_tlb_invalidate,
+		     vcpu->kvm, opcode, p->Rt1);
+	trace_kvm_tlb_invalidate(*vcpu_pc(vcpu), opcode);
+	++vcpu->stat.tlb_invalidate;
+
+	return true;
+}
+
 /*
  * Generic accessor for VM registers. Only called as long as HCR_TVM
  * is set.  If the guest enables the MMU, we stop trapping the VM
@@ -354,6 +372,44 @@ static const struct coproc_reg cp15_regs[] = {
 	{ CRn( 7), CRm( 6), Op1( 0), Op2( 2), is32, access_dcsw},
 	{ CRn( 7), CRm(10), Op1( 0), Op2( 2), is32, access_dcsw},
 	{ CRn( 7), CRm(14), Op1( 0), Op2( 2), is32, access_dcsw},
+
+	/* TLBIALLIS */
+	{ CRn( 8), CRm( 3), Op1( 0), Op2( 0), is32, emulate_tlb_invalidate},
+	/* TLBIMVAIS */
+	{ CRn( 8), CRm( 3), Op1( 0), Op2( 1), is32, emulate_tlb_invalidate},
+	/* TLBIASIDIS */
+	{ CRn( 8), CRm( 3), Op1( 0), Op2( 2), is32, emulate_tlb_invalidate},
+	/* TLBIMVAAIS */
+	{ CRn( 8), CRm( 3), Op1( 0), Op2( 3), is32, emulate_tlb_invalidate},
+	/* TLBIMVALIS */
+	{ CRn( 8), CRm( 3), Op1( 0), Op2( 5), is32, emulate_tlb_invalidate},
+	/* TLBIMVAALIS */
+	{ CRn( 8), CRm( 3), Op1( 0), Op2( 7), is32, emulate_tlb_invalidate},
+	/* ITLBIALL */
+	{ CRn( 8), CRm( 5), Op1( 0), Op2( 0), is32, emulate_tlb_invalidate},
+	/* ITLBIMVA */
+	{ CRn( 8), CRm( 5), Op1( 0), Op2( 1), is32, emulate_tlb_invalidate},
+	/* ITLBIASID */
+	{ CRn( 8), CRm( 5), Op1( 0), Op2( 2), is32, emulate_tlb_invalidate},
+	/* DTLBIALL */
+	{ CRn( 8), CRm( 6), Op1( 0), Op2( 0), is32, emulate_tlb_invalidate},
+	/* DTLBIMVA */
+	{ CRn( 8), CRm( 6), Op1( 0), Op2( 1), is32, emulate_tlb_invalidate},
+	/* DTLBIASID */
+	{ CRn( 8), CRm( 6), Op1( 0), Op2( 2), is32, emulate_tlb_invalidate},
+	/* TLBIALL */
+	{ CRn( 8), CRm( 7), Op1( 0), Op2( 0), is32, emulate_tlb_invalidate},
+	/* TLBIMVA */
+	{ CRn( 8), CRm( 7), Op1( 0), Op2( 1), is32, emulate_tlb_invalidate},
+	/* TLBIASID */
+	{ CRn( 8), CRm( 7), Op1( 0), Op2( 2), is32, emulate_tlb_invalidate},
+	/* TLBIMVAA */
+	{ CRn( 8), CRm( 7), Op1( 0), Op2( 3), is32, emulate_tlb_invalidate},
+	/* TLBIMVAL */
+	{ CRn( 8), CRm( 7), Op1( 0), Op2( 5), is32, emulate_tlb_invalidate},
+	/* TLBIMVAAL */
+	{ CRn( 8), CRm( 7), Op1( 0), Op2( 7), is32, emulate_tlb_invalidate},
+
 	/*
 	 * L2CTLR access (guest wants to know #CPUs).
 	 */
diff --git a/arch/arm/kvm/hyp/tlb.c b/arch/arm/kvm/hyp/tlb.c
index 6d810af2d9fd..d2b86100d1bb 100644
--- a/arch/arm/kvm/hyp/tlb.c
+++ b/arch/arm/kvm/hyp/tlb.c
@@ -76,3 +76,36 @@ void __hyp_text __kvm_flush_vm_context(void)
 	write_sysreg(0, ICIALLUIS);
 	dsb(ish);
 }
+
+static void __hyp_text __switch_to_guest_regime(struct kvm *kvm)
+{
+	write_sysreg(kvm->arch.vttbr, VTTBR);
+	isb();
+}
+
+static void __hyp_text __switch_to_host_regime(void)
+{
+	write_sysreg(0, VTTBR);
+}
+
+void __hyp_text
+__kvm_emulate_tlb_invalidate(struct kvm *kvm, u32 opcode, u64 regval)
+{
+	kvm = kern_hyp_va(kvm);
+
+	__switch_to_guest_regime(kvm);
+
+	/*
+	 *  TLB maintenance operations are broadcast to
+	 *  inner-shareable domain when HCR_FB is set (default for
+	 *  KVM).
+	 *
+	 *  Nuke all Stage 1 TLB entries for the VM. This will kill
+	 *  performance but it's always safe to do as we don't leave
+	 *  behind any strays in the TLB
+	 */
+	write_sysreg(0, TLBIALLIS);
+	isb();
+
+	__switch_to_host_regime();
+}
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 5/9] arm64: KVM: Handle trappable TLB instructions
  2017-01-10 11:38 [PATCH v3 0/9] Add support for monitoring guest TLB operations Punit Agrawal
                   ` (3 preceding siblings ...)
  2017-01-10 11:38 ` [PATCH v3 4/9] arm: KVM: Handle trappable TLB instructions Punit Agrawal
@ 2017-01-10 11:38 ` Punit Agrawal
  2017-01-10 11:38 ` [PATCH v3 6/9] kvm: arm/arm64: Add host pmu to support VM introspection Punit Agrawal
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Punit Agrawal @ 2017-01-10 11:38 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Punit Agrawal, linux-kernel, kvm, Christoffer Dall, Marc Zyngier,
	Peter Zijlstra, Will Deacon

The ARMv8 architecture allows trapping of TLB maintenance instructions
from EL0/EL1 to higher exception levels. On encountering a trappable TLB
instruction in a guest, an exception is taken to EL2.

Add support to handle emulating the TLB instructions. Also track the
number of emulated operations.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/kvm_asm.h  |  2 +
 arch/arm64/include/asm/kvm_host.h |  1 +
 arch/arm64/kvm/hyp/tlb.c          | 75 +++++++++++++++++++++++++++++++++++
 arch/arm64/kvm/sys_regs.c         | 83 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 161 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index ec3553eb9349..8b695785d454 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -55,6 +55,8 @@ extern void __kvm_flush_vm_context(void);
 extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
 extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
 extern void __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu);
+extern void __kvm_emulate_tlb_invalidate(struct kvm *kvm, u32 opcode,
+					 u64 regval);
 
 extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index e5050388e062..1e83b707f14c 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -307,6 +307,7 @@ struct kvm_vcpu_stat {
 	u64 mmio_exit_user;
 	u64 mmio_exit_kernel;
 	u64 exits;
+	u64 tlb_invalidate;
 };
 
 int kvm_vcpu_preferred_target(struct kvm_vcpu_init *init);
diff --git a/arch/arm64/kvm/hyp/tlb.c b/arch/arm64/kvm/hyp/tlb.c
index b2cfbedea582..c14237858d75 100644
--- a/arch/arm64/kvm/hyp/tlb.c
+++ b/arch/arm64/kvm/hyp/tlb.c
@@ -86,3 +86,78 @@ void __hyp_text __kvm_flush_vm_context(void)
 	__tlbi(alle1is);
 	__flush_icache_all(); /* contains a dsb(ish) */
 }
+
+/* Intentionally empty functions */
+static void __hyp_text __switch_to_hyp_role_nvhe(void) { }
+static void __hyp_text __switch_to_host_role_nvhe(void) { }
+
+static void __hyp_text __switch_to_hyp_role_vhe(void)
+{
+	u64 hcr = read_sysreg(hcr_el2);
+
+	/*
+	 * When VHE is enabled and HCR_EL2.TGE=1, EL1&0 TLB operations
+	 * apply to the EL2&0 translation regime. As we prepare to emulate
+	 * a guest TLB operation, clear HCR_TGE so that TLB operations
+	 * target EL1&0 (the guest).
+	 */
+	hcr &= ~HCR_TGE;
+	write_sysreg(hcr, hcr_el2);
+}
+
+static void __hyp_text __switch_to_host_role_vhe(void)
+{
+	u64 hcr = read_sysreg(hcr_el2);
+
+	hcr |= HCR_TGE;
+	write_sysreg(hcr, hcr_el2);
+}
+
+static hyp_alternate_select(__switch_to_hyp_role,
+			    __switch_to_hyp_role_nvhe,
+			    __switch_to_hyp_role_vhe,
+			    ARM64_HAS_VIRT_HOST_EXTN);
+
+static hyp_alternate_select(__switch_to_host_role,
+			    __switch_to_host_role_nvhe,
+			    __switch_to_host_role_vhe,
+			    ARM64_HAS_VIRT_HOST_EXTN);
+
+static void __hyp_text __switch_to_guest_regime(struct kvm *kvm)
+{
+	write_sysreg(kvm->arch.vttbr, vttbr_el2);
+	__switch_to_hyp_role();
+	isb();
+}
+
+static void __hyp_text __switch_to_host_regime(void)
+{
+	__switch_to_host_role();
+	write_sysreg(0, vttbr_el2);
+}
+
+void __hyp_text
+__kvm_emulate_tlb_invalidate(struct kvm *kvm, u32 opcode, u64 regval)
+{
+	kvm = kern_hyp_va(kvm);
+
+	/*
+	 * Switch to the guest before performing any TLB operations to
+	 * target the appropriate VMID
+	 */
+	__switch_to_guest_regime(kvm);
+
+	/*
+	 *  TLB maintenance operations are broadcast to
+	 *  inner-shareable domain when HCR_FB is set (default for
+	 *  KVM).
+	 *
+	 *  Nuke all Stage 1 TLB entries for the VM. This will kill
+	 *  performance but it's always safe to do as we don't leave
+	 *  behind any strays in the TLB
+	 */
+	__tlbi(vmalle1is);
+	isb();
+
+	__switch_to_host_regime();
+}
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 87e7e6608cd8..12ff8cf9f18a 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -791,6 +791,20 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
 	return true;
 }
 
+static bool emulate_tlb_invalidate(struct kvm_vcpu *vcpu,
+				   struct sys_reg_params *p,
+				   const struct sys_reg_desc *r)
+{
+	u32 opcode = sys_reg(p->Op0, p->Op1, p->CRn, p->CRm, p->Op2);
+
+	kvm_call_hyp(__kvm_emulate_tlb_invalidate,
+		     vcpu->kvm, opcode, p->regval);
+	trace_kvm_tlb_invalidate(*vcpu_pc(vcpu), opcode);
+	++vcpu->stat.tlb_invalidate;
+
+	return true;
+}
+
 /* Silly macro to expand the DBG{BCR,BVR,WVR,WCR}n_EL1 registers in one go */
 #define DBG_BCR_BVR_WCR_WVR_EL1(n)					\
 	/* DBGBVRn_EL1 */						\
@@ -842,6 +856,35 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	{ Op0(0b01), Op1(0b000), CRn(0b0111), CRm(0b1110), Op2(0b010),
 	  access_dcsw },
 
+	/*
+	 * ARMv8 ARM: Table C5-4 TLB maintenance instructions
+	 * (Ref: ARMv8 ARM C5.1 version: ARM DDI 0487A.j)
+	 */
+	/* TLBI VMALLE1IS */
+	{ Op0(1), Op1(0), CRn(8), CRm(3), Op2(0), emulate_tlb_invalidate },
+	/* TLBI VAE1IS */
+	{ Op0(1), Op1(0), CRn(8), CRm(3), Op2(1), emulate_tlb_invalidate },
+	/* TLBI ASIDE1IS */
+	{ Op0(1), Op1(0), CRn(8), CRm(3), Op2(2), emulate_tlb_invalidate },
+	/* TLBI VAAE1IS */
+	{ Op0(1), Op1(0), CRn(8), CRm(3), Op2(3), emulate_tlb_invalidate },
+	/* TLBI VALE1IS */
+	{ Op0(1), Op1(0), CRn(8), CRm(3), Op2(5), emulate_tlb_invalidate },
+	/* TLBI VAALE1IS */
+	{ Op0(1), Op1(0), CRn(8), CRm(3), Op2(7), emulate_tlb_invalidate },
+	/* TLBI VMALLE1 */
+	{ Op0(1), Op1(0), CRn(8), CRm(7), Op2(0), emulate_tlb_invalidate },
+	/* TLBI VAE1 */
+	{ Op0(1), Op1(0), CRn(8), CRm(7), Op2(1), emulate_tlb_invalidate },
+	/* TLBI ASIDE1 */
+	{ Op0(1), Op1(0), CRn(8), CRm(7), Op2(2), emulate_tlb_invalidate },
+	/* TLBI VAAE1 */
+	{ Op0(1), Op1(0), CRn(8), CRm(7), Op2(3), emulate_tlb_invalidate },
+	/* TLBI VALE1 */
+	{ Op0(1), Op1(0), CRn(8), CRm(7), Op2(5), emulate_tlb_invalidate },
+	/* TLBI VAALE1 */
+	{ Op0(1), Op1(0), CRn(8), CRm(7), Op2(7), emulate_tlb_invalidate },
+
 	DBG_BCR_BVR_WCR_WVR_EL1(0),
 	DBG_BCR_BVR_WCR_WVR_EL1(1),
 	/* MDCCINT_EL1 */
@@ -1330,6 +1373,46 @@ static const struct sys_reg_desc cp15_regs[] = {
 	{ Op1( 0), CRn( 7), CRm(10), Op2( 2), access_dcsw },
 	{ Op1( 0), CRn( 7), CRm(14), Op2( 2), access_dcsw },
 
+	/*
+	 * TLB operations
+	 */
+	/* TLBIALLIS */
+	{ Op1( 0), CRn( 8), CRm( 3), Op2( 0), emulate_tlb_invalidate},
+	/* TLBIMVAIS */
+	{ Op1( 0), CRn( 8), CRm( 3), Op2( 1), emulate_tlb_invalidate},
+	/* TLBIASIDIS */
+	{ Op1( 0), CRn( 8), CRm( 3), Op2( 2), emulate_tlb_invalidate},
+	/* TLBIMVAAIS */
+	{ Op1( 0), CRn( 8), CRm( 3), Op2( 3), emulate_tlb_invalidate},
+	/* TLBIMVALIS */
+	{ Op1( 0), CRn( 8), CRm( 3), Op2( 5), emulate_tlb_invalidate},
+	/* TLBIMVAALIS */
+	{ Op1( 0), CRn( 8), CRm( 3), Op2( 7), emulate_tlb_invalidate},
+	/* ITLBIALL */
+	{ Op1( 0), CRn( 8), CRm( 5), Op2( 0), emulate_tlb_invalidate},
+	/* ITLBIMVA */
+	{ Op1( 0), CRn( 8), CRm( 5), Op2( 1), emulate_tlb_invalidate},
+	/* ITLBIASID */
+	{ Op1( 0), CRn( 8), CRm( 5), Op2( 2), emulate_tlb_invalidate},
+	/* DTLBIALL */
+	{ Op1( 0), CRn( 8), CRm( 6), Op2( 0), emulate_tlb_invalidate},
+	/* DTLBIMVA */
+	{ Op1( 0), CRn( 8), CRm( 6), Op2( 1), emulate_tlb_invalidate},
+	/* DTLBIASID */
+	{ Op1( 0), CRn( 8), CRm( 6), Op2( 2), emulate_tlb_invalidate},
+	/* TLBIALL */
+	{ Op1( 0), CRn( 8), CRm( 7), Op2( 0), emulate_tlb_invalidate},
+	/* TLBIMVA */
+	{ Op1( 0), CRn( 8), CRm( 7), Op2( 1), emulate_tlb_invalidate},
+	/* TLBIASID */
+	{ Op1( 0), CRn( 8), CRm( 7), Op2( 2), emulate_tlb_invalidate},
+	/* TLBIMVAA */
+	{ Op1( 0), CRn( 8), CRm( 7), Op2( 3), emulate_tlb_invalidate},
+	/* TLBIMVAL */
+	{ Op1( 0), CRn( 8), CRm( 7), Op2( 5), emulate_tlb_invalidate},
+	/* TLBIMVAAL */
+	{ Op1( 0), CRn( 8), CRm( 7), Op2( 7), emulate_tlb_invalidate},
+
 	/* PMU */
 	{ Op1( 0), CRn( 9), CRm(12), Op2( 0), access_pmcr },
 	{ Op1( 0), CRn( 9), CRm(12), Op2( 1), access_pmcnten },
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 6/9] kvm: arm/arm64: Add host pmu to support VM introspection
  2017-01-10 11:38 [PATCH v3 0/9] Add support for monitoring guest TLB operations Punit Agrawal
                   ` (4 preceding siblings ...)
  2017-01-10 11:38 ` [PATCH v3 5/9] arm64: " Punit Agrawal
@ 2017-01-10 11:38 ` Punit Agrawal
  2017-01-18 11:21   ` Marc Zyngier
  2017-01-10 11:38 ` [PATCH v3 7/9] kvm: host_pmu: Add support for tracking guest TLB operations Punit Agrawal
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Punit Agrawal @ 2017-01-10 11:38 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Punit Agrawal, linux-kernel, kvm, Christoffer Dall, Marc Zyngier,
	Steven Rostedt, Peter Zijlstra, Will Deacon

Both the AArch32 and AArch64 modes of the ARMv8 architecture support
trapping certain VM operations, e.g., TLB and cache maintenance
operations. Selective trapping of these operations for specific VMs can
be used to track the frequency with which they occur during execution.

Add a software PMU on the host that can support tracking VM
operations (in the form of PMU events). Supporting new events requires
providing callbacks to configure the VM to enable/disable the trapping
and read a count of the frequency.
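
To illustrate the shape of that interface, a hypothetical event "foo"
would be wired up roughly as in the sketch below. The names here are
made up for this sketch; the next patch adds the real callbacks for TLB
invalidations:

/* Hypothetical example: per-VM count read-out and trap configuration. */
static u64 get_foo_count(struct kvm *kvm)
{
	struct kvm_vcpu *vcpu;
	u64 val = 0;
	int i;

	/* sum a per-vCPU counter updated when the trapped operation is emulated */
	kvm_for_each_vcpu(i, vcpu, kvm)
		val += vcpu->stat.foo;

	return val;
}

static void configure_foo(struct kvm *kvm, bool enable)
{
	/* enable or disable the trap that makes the guest exit for "foo" */
}

static struct kvm_event_cb event_callbacks[] = {
	{
		.event			= foo,	/* new enum host_pmu_events entry */
		.get_event_count	= get_foo_count,
		.configure_event	= configure_foo,
	},
};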

The host PMU events can be controlled by tools like perf that use
standard kernel perf interfaces.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/kvm_host.h   |   8 ++
 arch/arm/kvm/arm.c                |   2 +
 arch/arm64/include/asm/kvm_host.h |   8 ++
 virt/kvm/arm/host_pmu.c           | 272 ++++++++++++++++++++++++++++++++++++++
 4 files changed, 290 insertions(+)
 create mode 100644 virt/kvm/arm/host_pmu.c

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 26f0c8a0b790..b988f8801b86 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -289,6 +289,14 @@ static inline int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
 
+#if !defined(CONFIG_KVM_HOST_PMU)
+static inline int kvm_host_pmu_init(void) { return 0; }
+static inline void kvm_host_pmu_teardown(void) { }
+#else
+int kvm_host_pmu_init(void);
+void kvm_host_pmu_teardown(void);
+#endif
+
 void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
 struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 11676787ad49..058626b65b8d 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -1263,6 +1263,7 @@ static int init_subsystems(void)
 		goto out;
 
 	kvm_perf_init();
+	kvm_host_pmu_init();
 	kvm_coproc_table_init();
 
 out:
@@ -1453,6 +1454,7 @@ int kvm_arch_init(void *opaque)
 void kvm_arch_exit(void)
 {
 	kvm_perf_teardown();
+	kvm_host_pmu_teardown();
 }
 
 static int arm_init(void)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 1e83b707f14c..018f887e8964 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -349,6 +349,14 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
 
+#if !defined(CONFIG_KVM_HOST_PMU)
+static inline int kvm_host_pmu_init(void) { return 0; }
+static inline void kvm_host_pmu_teardown(void) { }
+#else
+int kvm_host_pmu_init(void);
+void kvm_host_pmu_teardown(void);
+#endif
+
 struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
 
 static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
diff --git a/virt/kvm/arm/host_pmu.c b/virt/kvm/arm/host_pmu.c
new file mode 100644
index 000000000000..fc610ccc169a
--- /dev/null
+++ b/virt/kvm/arm/host_pmu.c
@@ -0,0 +1,272 @@
+#include <linux/cpumask.h>
+#include <linux/device.h>
+#include <linux/kernel.h>
+#include <linux/kvm_host.h>
+#include <linux/list.h>
+#include <linux/perf_event.h>
+#include <linux/pid.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/spinlock_types.h>
+#include <linux/sysfs.h>
+
+#include <asm/kvm_emulate.h>
+
+enum host_pmu_events {
+	KVM_HOST_MAX_EVENTS,
+};
+
+struct host_pmu {
+	struct pmu pmu;
+	spinlock_t event_list_lock;
+	struct list_head event_list_head;
+} host_pmu;
+#define to_host_pmu(p) (container_of(p, struct host_pmu, pmu))
+
+typedef void (*configure_event_fn)(struct kvm *kvm, bool enable);
+typedef u64 (*get_event_count_fn)(struct kvm *kvm);
+
+struct kvm_event_cb {
+	enum host_pmu_events event;
+	get_event_count_fn get_event_count;
+	configure_event_fn configure_event;
+};
+
+struct event_data {
+	bool enable;
+	struct kvm *kvm;
+	struct kvm_event_cb *cb;
+	struct work_struct work;
+	struct list_head event_list;
+};
+
+static struct kvm_event_cb event_callbacks[] = {
+};
+
+static struct attribute *event_attrs[] = {
+	NULL,
+};
+
+static struct attribute_group events_attr_group = {
+	.name	= "events",
+	.attrs	= event_attrs,
+};
+
+
+#define VM_MASK	GENMASK_ULL(31, 0)
+#define EVENT_MASK	GENMASK_ULL(39, 32)
+#define EVENT_SHIFT	(32)
+
+#define to_pid(cfg)	((cfg) & VM_MASK)
+#define to_event(cfg)	(((cfg) & EVENT_MASK) >> EVENT_SHIFT)
+
+PMU_FORMAT_ATTR(vm, "config:0-31");
+PMU_FORMAT_ATTR(event, "config:32-39");
+
+static struct attribute *format_attrs[] = {
+	&format_attr_vm.attr,
+	&format_attr_event.attr,
+	NULL,
+};
+
+static struct attribute_group format_attr_group = {
+	.name	= "format",
+	.attrs	= format_attrs,
+};
+
+static const struct attribute_group *attr_groups[] = {
+	&events_attr_group,
+	&format_attr_group,
+	NULL,
+};
+
+static void host_event_destroy(struct perf_event *event)
+{
+	struct host_pmu *host_pmu = to_host_pmu(event->pmu);
+	struct event_data *e_data = event->pmu_private;
+
+	/*
+	 * Ensure outstanding work items related to this event are
+	 * completed before freeing resources.
+	 */
+	flush_work(&e_data->work);
+
+	kvm_put_kvm(e_data->kvm);
+
+	spin_lock(&host_pmu->event_list_lock);
+	list_del(&e_data->event_list);
+	spin_unlock(&host_pmu->event_list_lock);
+	kfree(e_data);
+}
+
+void host_event_work(struct work_struct *work)
+{
+	struct event_data *e_data = container_of(work, struct event_data, work);
+	struct kvm *kvm = e_data->kvm;
+
+	e_data->cb->configure_event(kvm, e_data->enable);
+}
+
+static int host_event_init(struct perf_event *event)
+{
+	struct host_pmu *host_pmu = to_host_pmu(event->pmu);
+	int event_id = to_event(event->attr.config);
+	pid_t task_pid = to_pid(event->attr.config);
+	struct event_data *e_data, *pos;
+	bool found = false;
+	struct pid *pid;
+	struct kvm *kvm;
+	int ret = 0;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	if (has_branch_stack(event)	||
+	    is_sampling_event(event)	||
+	    event->attr.exclude_user	||
+	    event->attr.exclude_kernel	||
+	    event->attr.exclude_hv	||
+	    event->attr.exclude_idle	||
+	    event->attr.exclude_guest) {
+		return -EINVAL;
+	}
+
+	if (event->attach_state == PERF_ATTACH_TASK)
+		return -EOPNOTSUPP;
+
+	if (event->cpu < 0)
+		return -EINVAL;
+
+	if (event_id >= KVM_HOST_MAX_EVENTS)
+		return -EINVAL;
+
+	pid = find_get_pid(task_pid);
+	spin_lock(&kvm_lock);
+	list_for_each_entry(kvm, &vm_list, vm_list) {
+		if (kvm->pid == pid) {
+			kvm_get_kvm(kvm);
+			found = true;
+			break;
+		}
+	}
+	spin_unlock(&kvm_lock);
+	put_pid(pid);
+
+	if (!found)
+		return -EINVAL;
+
+	spin_lock(&host_pmu->event_list_lock);
+	/* Make sure we don't already have the (event_id, kvm) pair */
+	list_for_each_entry(pos, &host_pmu->event_list_head, event_list) {
+		if (pos->cb->event == event_id &&
+		    pos->kvm->pid == pid) {
+			kvm_put_kvm(kvm);
+			ret = -EOPNOTSUPP;
+			goto unlock;
+		}
+	}
+
+	e_data = kzalloc(sizeof(*e_data), GFP_KERNEL);
+	e_data->kvm = kvm;
+	e_data->cb = &event_callbacks[event_id];
+	INIT_WORK(&e_data->work, host_event_work);
+	event->pmu_private = e_data;
+	event->cpu = cpumask_first(cpu_online_mask);
+	event->destroy = host_event_destroy;
+
+	list_add_tail(&e_data->event_list, &host_pmu->event_list_head);
+
+unlock:
+	spin_unlock(&host_pmu->event_list_lock);
+
+	return ret;
+}
+
+static void host_event_update(struct perf_event *event)
+{
+	struct event_data *e_data = event->pmu_private;
+	struct kvm_event_cb *cb = e_data->cb;
+	struct kvm *kvm = e_data->kvm;
+	struct hw_perf_event *hw = &event->hw;
+	u64 prev_count, new_count;
+
+	do {
+		prev_count = local64_read(&hw->prev_count);
+		new_count = cb->get_event_count(kvm);
+	} while (local64_xchg(&hw->prev_count, new_count) != prev_count);
+
+	local64_add(new_count - prev_count, &event->count);
+}
+
+static void host_event_start(struct perf_event *event, int flags)
+{
+	struct event_data *e_data = event->pmu_private;
+	struct kvm_event_cb *cb = e_data->cb;
+	struct kvm *kvm = e_data->kvm;
+	u64 val;
+
+	val = cb->get_event_count(kvm);
+	local64_set(&event->hw.prev_count, val);
+
+	e_data->enable = true;
+	schedule_work(&e_data->work);
+}
+
+static void host_event_stop(struct perf_event *event, int flags)
+{
+	struct event_data *e_data = event->pmu_private;
+
+	e_data->enable = false;
+	schedule_work(&e_data->work);
+
+	if (flags & PERF_EF_UPDATE)
+		host_event_update(event);
+}
+
+static int host_event_add(struct perf_event *event, int flags)
+{
+	if (flags & PERF_EF_START)
+		host_event_start(event, flags);
+
+	return 0;
+}
+
+static void host_event_del(struct perf_event *event, int flags)
+{
+	host_event_stop(event, PERF_EF_UPDATE);
+}
+
+static void host_event_read(struct perf_event *event)
+{
+	host_event_update(event);
+}
+
+static void init_host_pmu(struct host_pmu *host_pmu)
+{
+	host_pmu->pmu = (struct pmu) {
+		.task_ctx_nr	= perf_sw_context,
+		.attr_groups	= attr_groups,
+		.event_init	= host_event_init,
+		.add		= host_event_add,
+		.del		= host_event_del,
+		.start		= host_event_start,
+		.stop		= host_event_stop,
+		.read		= host_event_read,
+		.capabilities	= PERF_PMU_CAP_NO_INTERRUPT,
+	};
+
+	INIT_LIST_HEAD(&host_pmu->event_list_head);
+	spin_lock_init(&host_pmu->event_list_lock);
+}
+
+int kvm_host_pmu_init(void)
+{
+	init_host_pmu(&host_pmu);
+
+	return perf_pmu_register(&host_pmu.pmu, "kvm", -1);
+}
+
+void kvm_host_pmu_teardown(void)
+{
+	perf_pmu_unregister(&host_pmu.pmu);
+}
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 7/9] kvm: host_pmu: Add support for tracking guest TLB operations
  2017-01-10 11:38 [PATCH v3 0/9] Add support for monitoring guest TLB operations Punit Agrawal
                   ` (5 preceding siblings ...)
  2017-01-10 11:38 ` [PATCH v3 6/9] kvm: arm/arm64: Add host pmu to support VM introspection Punit Agrawal
@ 2017-01-10 11:38 ` Punit Agrawal
  2017-01-10 11:38 ` [PATCH v3 8/9] arm: KVM: Enable support for host pmu Punit Agrawal
  2017-01-10 11:38 ` [PATCH v3 9/9] arm64: KVM: Enable support for the " Punit Agrawal
  8 siblings, 0 replies; 23+ messages in thread
From: Punit Agrawal @ 2017-01-10 11:38 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Punit Agrawal, linux-kernel, kvm, Christoffer Dall, Marc Zyngier,
	Steven Rostedt, Peter Zijlstra, Will Deacon

Add the callbacks required by host PMU to support monitoring guest TLB
operations.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/host_pmu.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/virt/kvm/arm/host_pmu.c b/virt/kvm/arm/host_pmu.c
index fc610ccc169a..22c3aef17ec4 100644
--- a/virt/kvm/arm/host_pmu.c
+++ b/virt/kvm/arm/host_pmu.c
@@ -13,6 +13,7 @@
 #include <asm/kvm_emulate.h>
 
 enum host_pmu_events {
+	tlb_invalidate,
 	KVM_HOST_MAX_EVENTS,
 };
 
@@ -40,10 +41,59 @@ struct event_data {
 	struct list_head event_list;
 };
 
+static u64 get_tlb_invalidate_count(struct kvm *kvm)
+{
+	struct kvm_vcpu *vcpu;
+	u64 val = 0;
+	int i;
+
+	kvm_for_each_vcpu(i, vcpu, kvm)
+		val += vcpu->stat.tlb_invalidate;
+
+	return val;
+}
+
+static void configure_tlb_invalidate(struct kvm *kvm, bool enable)
+{
+	struct kvm_vcpu *vcpu;
+	int i;
+
+	kvm_arm_halt_guest(kvm);
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		unsigned long hcr = vcpu_get_hcr(vcpu);
+
+		if (enable)
+			hcr |= HCR_TTLB;
+		else
+			hcr &= ~HCR_TTLB;
+
+		vcpu_set_hcr(vcpu, hcr);
+	}
+	kvm_arm_resume_guest(kvm);
+}
+
 static struct kvm_event_cb event_callbacks[] = {
+	{
+		.event			= tlb_invalidate,
+		.get_event_count	= get_tlb_invalidate_count,
+		.configure_event	= configure_tlb_invalidate,
+	}
 };
 
+static ssize_t events_sysfs_show(struct device *dev,
+				 struct device_attribute *attr, char *page)
+{
+	struct perf_pmu_events_attr *pmu_attr;
+
+	pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr);
+
+	return sprintf(page, "event=0x%03llx,vm=?\n", pmu_attr->id);
+}
+PMU_EVENT_ATTR(tlb_invalidate, event_attr_tlb_invalidate, tlb_invalidate,
+	       events_sysfs_show);
+
 static struct attribute *event_attrs[] = {
+	&event_attr_tlb_invalidate.attr.attr,
 	NULL,
 };
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 8/9] arm: KVM: Enable support for host pmu
  2017-01-10 11:38 [PATCH v3 0/9] Add support for monitoring guest TLB operations Punit Agrawal
                   ` (6 preceding siblings ...)
  2017-01-10 11:38 ` [PATCH v3 7/9] kvm: host_pmu: Add support for tracking guest TLB operations Punit Agrawal
@ 2017-01-10 11:38 ` Punit Agrawal
  2017-01-10 11:38 ` [PATCH v3 9/9] arm64: KVM: Enable support for the " Punit Agrawal
  8 siblings, 0 replies; 23+ messages in thread
From: Punit Agrawal @ 2017-01-10 11:38 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Punit Agrawal, linux-kernel, kvm, Christoffer Dall, Marc Zyngier,
	Will Deacon, Russell King

Add the Kconfig option and Makefile updates to enable the recently added
support for host pmu.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
---
 arch/arm/kvm/Kconfig  | 4 ++++
 arch/arm/kvm/Makefile | 1 +
 2 files changed, 5 insertions(+)

diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index 90d0176fb30d..198d16c36220 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -16,6 +16,9 @@ menuconfig VIRTUALIZATION
 
 if VIRTUALIZATION
 
+config KVM_HOST_PMU
+	bool
+
 config KVM
 	bool "Kernel-based Virtual Machine (KVM) support"
 	depends on MMU && OF
@@ -30,6 +33,7 @@ config KVM
 	select SRCU
 	select MMU_NOTIFIER
 	select KVM_VFIO
+	select KVM_HOST_PMU if PERF_EVENTS
 	select HAVE_KVM_EVENTFD
 	select HAVE_KVM_IRQFD
 	select HAVE_KVM_IRQCHIP
diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
index d571243ab4d1..09d358499ce1 100644
--- a/arch/arm/kvm/Makefile
+++ b/arch/arm/kvm/Makefile
@@ -35,3 +35,4 @@ obj-y += $(KVM)/arm/vgic/vgic-kvm-device.o
 obj-y += $(KVM)/arm/vgic/vgic-its.o
 obj-y += $(KVM)/irqchip.o
 obj-y += $(KVM)/arm/arch_timer.o
+obj-$(CONFIG_KVM_HOST_PMU) += $(KVM)/arm/host_pmu.o
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 9/9] arm64: KVM: Enable support for the host pmu
  2017-01-10 11:38 [PATCH v3 0/9] Add support for monitoring guest TLB operations Punit Agrawal
                   ` (7 preceding siblings ...)
  2017-01-10 11:38 ` [PATCH v3 8/9] arm: KVM: Enable support for host pmu Punit Agrawal
@ 2017-01-10 11:38 ` Punit Agrawal
  8 siblings, 0 replies; 23+ messages in thread
From: Punit Agrawal @ 2017-01-10 11:38 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Punit Agrawal, linux-kernel, kvm, Christoffer Dall, Marc Zyngier,
	Will Deacon

Add the Kconfig option and Makefile updates to enable the recently added
host pmu.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/Kconfig  | 4 ++++
 arch/arm64/kvm/Makefile | 1 +
 2 files changed, 5 insertions(+)

diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 52cb7ad9b2fd..c147a3077dab 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -16,6 +16,9 @@ menuconfig VIRTUALIZATION
 
 if VIRTUALIZATION
 
+config KVM_HOST_PMU
+	bool
+
 config KVM
 	bool "Kernel-based Virtual Machine (KVM) support"
 	depends on OF
@@ -31,6 +34,7 @@ config KVM
 	select KVM_VFIO
 	select HAVE_KVM_EVENTFD
 	select HAVE_KVM_IRQFD
+	select KVM_HOST_PMU if PERF_EVENTS
 	select KVM_ARM_PMU if HW_PERF_EVENTS
 	select HAVE_KVM_MSI
 	select HAVE_KVM_IRQCHIP
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index d50a82a16ff6..3bdac93784e2 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -34,3 +34,4 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-its.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/irqchip.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/arch_timer.o
 kvm-$(CONFIG_KVM_ARM_PMU) += $(KVM)/arm/pmu.o
+kvm-$(CONFIG_KVM_HOST_PMU) += $(KVM)/arm/host_pmu.o
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 6/9] kvm: arm/arm64: Add host pmu to support VM introspection
  2017-01-10 11:38 ` [PATCH v3 6/9] kvm: arm/arm64: Add host pmu to support VM introspection Punit Agrawal
@ 2017-01-18 11:21   ` Marc Zyngier
  2017-01-18 11:35     ` Mark Rutland
  2017-01-18 13:06     ` Punit Agrawal
  0 siblings, 2 replies; 23+ messages in thread
From: Marc Zyngier @ 2017-01-18 11:21 UTC (permalink / raw)
  To: Punit Agrawal, kvmarm, linux-arm-kernel
  Cc: linux-kernel, kvm, Christoffer Dall, Steven Rostedt,
	Peter Zijlstra, Will Deacon, Mark Rutland

+Mark

On 10/01/17 11:38, Punit Agrawal wrote:
> Both AArch32 and AArch64 mode of the ARMv8 architecture support trapping
> certain VM operations, e.g, TLB and cache maintenance
> operations. Selective trapping of these operations for specific VMs can
> be used to track the frequency with which these occur during execution.
> 
> Add a software PMU on the host that can support tracking VM
> operations (in the form of PMU events). Supporting new events requires
> providing callbacks to configure the VM to enable/disable the trapping
> and read a count of the frequency.
> 
> The host PMU events can be controlled by tools like perf that use
> standard kernel perf interfaces.
> 
> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
> Cc: Christoffer Dall <christoffer.dall@linaro.org>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
>  arch/arm/include/asm/kvm_host.h   |   8 ++
>  arch/arm/kvm/arm.c                |   2 +
>  arch/arm64/include/asm/kvm_host.h |   8 ++
>  virt/kvm/arm/host_pmu.c           | 272 ++++++++++++++++++++++++++++++++++++++
>  4 files changed, 290 insertions(+)
>  create mode 100644 virt/kvm/arm/host_pmu.c
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 26f0c8a0b790..b988f8801b86 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -289,6 +289,14 @@ static inline int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
>  int kvm_perf_init(void);
>  int kvm_perf_teardown(void);
>  
> +#if !defined(CONFIG_KVM_HOST_PMU)
> +static inline int kvm_host_pmu_init(void) { return 0; }
> +static inline void kvm_host_pmu_teardown(void) { }
> +#else
> +int kvm_host_pmu_init(void);
> +void kvm_host_pmu_teardown(void);
> +#endif
> +
>  void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
>  
>  struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 11676787ad49..058626b65b8d 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -1263,6 +1263,7 @@ static int init_subsystems(void)
>  		goto out;
>  
>  	kvm_perf_init();
> +	kvm_host_pmu_init();
>  	kvm_coproc_table_init();
>  
>  out:
> @@ -1453,6 +1454,7 @@ int kvm_arch_init(void *opaque)
>  void kvm_arch_exit(void)
>  {
>  	kvm_perf_teardown();
> +	kvm_host_pmu_teardown();
>  }
>  
>  static int arm_init(void)
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 1e83b707f14c..018f887e8964 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -349,6 +349,14 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  int kvm_perf_init(void);
>  int kvm_perf_teardown(void);
>  
> +#if !defined(CONFIG_KVM_HOST_PMU)
> +static inline int kvm_host_pmu_init(void) { return 0; }
> +static inline void kvm_host_pmu_teardown(void) { }
> +#else
> +int kvm_host_pmu_init(void);
> +void kvm_host_pmu_teardown(void);
> +#endif
> +
>  struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
>  
>  static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
> diff --git a/virt/kvm/arm/host_pmu.c b/virt/kvm/arm/host_pmu.c
> new file mode 100644
> index 000000000000..fc610ccc169a
> --- /dev/null
> +++ b/virt/kvm/arm/host_pmu.c
> @@ -0,0 +1,272 @@
> +#include <linux/cpumask.h>
> +#include <linux/device.h>
> +#include <linux/kernel.h>
> +#include <linux/kvm_host.h>
> +#include <linux/list.h>
> +#include <linux/perf_event.h>
> +#include <linux/pid.h>
> +#include <linux/sched.h>
> +#include <linux/slab.h>
> +#include <linux/spinlock_types.h>
> +#include <linux/sysfs.h>
> +
> +#include <asm/kvm_emulate.h>
> +
> +enum host_pmu_events {
> +	KVM_HOST_MAX_EVENTS,
> +};
> +
> +struct host_pmu {
> +	struct pmu pmu;
> +	spinlock_t event_list_lock;
> +	struct list_head event_list_head;
> +} host_pmu;
> +#define to_host_pmu(p) (container_of(p, struct host_pmu, pmu))
> +
> +typedef void (*configure_event_fn)(struct kvm *kvm, bool enable);
> +typedef u64 (*get_event_count_fn)(struct kvm *kvm);
> +
> +struct kvm_event_cb {
> +	enum host_pmu_events event;
> +	get_event_count_fn get_event_count;
> +	configure_event_fn configure_event;
> +};
> +
> +struct event_data {
> +	bool enable;
> +	struct kvm *kvm;
> +	struct kvm_event_cb *cb;
> +	struct work_struct work;
> +	struct list_head event_list;
> +};
> +
> +static struct kvm_event_cb event_callbacks[] = {
> +};
> +
> +static struct attribute *event_attrs[] = {
> +	NULL,
> +};
> +
> +static struct attribute_group events_attr_group = {
> +	.name	= "events",
> +	.attrs	= event_attrs,
> +};
> +
> +
> +#define VM_MASK	GENMASK_ULL(31, 0)
> +#define EVENT_MASK	GENMASK_ULL(39, 32)
> +#define EVENT_SHIFT	(32)
> +
> +#define to_pid(cfg)	((cfg) & VM_MASK)
> +#define to_event(cfg)	(((cfg) & EVENT_MASK) >> EVENT_SHIFT)
> +
> +PMU_FORMAT_ATTR(vm, "config:0-31");
> +PMU_FORMAT_ATTR(event, "config:32-39");

I'm a bit confused by these. Can't you get the PID of the VM you're
tracing directly from perf, without having to encode things? And if you
can't, surely this should be a function of the size of pid_t?

Mark, can you shine some light here?

> +
> +static struct attribute *format_attrs[] = {
> +	&format_attr_vm.attr,
> +	&format_attr_event.attr,
> +	NULL,
> +};
> +
> +static struct attribute_group format_attr_group = {
> +	.name	= "format",
> +	.attrs	= format_attrs,
> +};
> +
> +static const struct attribute_group *attr_groups[] = {
> +	&events_attr_group,
> +	&format_attr_group,
> +	NULL,
> +};
> +
> +static void host_event_destroy(struct perf_event *event)
> +{
> +	struct host_pmu *host_pmu = to_host_pmu(event->pmu);
> +	struct event_data *e_data = event->pmu_private;
> +
> +	/*
> +	 * Ensure outstanding work items related to this event are
> +	 * completed before freeing resources.
> +	 */
> +	flush_work(&e_data->work);
> +
> +	kvm_put_kvm(e_data->kvm);
> +
> +	spin_lock(&host_pmu->event_list_lock);
> +	list_del(&e_data->event_list);
> +	spin_unlock(&host_pmu->event_list_lock);
> +	kfree(e_data);
> +}
> +
> +void host_event_work(struct work_struct *work)
> +{
> +	struct event_data *e_data = container_of(work, struct event_data, work);
> +	struct kvm *kvm = e_data->kvm;
> +
> +	e_data->cb->configure_event(kvm, e_data->enable);
> +}
> +
> +static int host_event_init(struct perf_event *event)
> +{
> +	struct host_pmu *host_pmu = to_host_pmu(event->pmu);
> +	int event_id = to_event(event->attr.config);
> +	pid_t task_pid = to_pid(event->attr.config);
> +	struct event_data *e_data, *pos;
> +	bool found = false;
> +	struct pid *pid;
> +	struct kvm *kvm;
> +	int ret = 0;
> +
> +	if (event->attr.type != event->pmu->type)
> +		return -ENOENT;
> +
> +	if (has_branch_stack(event)	||
> +	    is_sampling_event(event)	||
> +	    event->attr.exclude_user	||
> +	    event->attr.exclude_kernel	||
> +	    event->attr.exclude_hv	||
> +	    event->attr.exclude_idle	||
> +	    event->attr.exclude_guest) {
> +		return -EINVAL;
> +	}
> +
> +	if (event->attach_state == PERF_ATTACH_TASK)
> +		return -EOPNOTSUPP;
> +
> +	if (event->cpu < 0)
> +		return -EINVAL;
> +
> +	if (event_id >= KVM_HOST_MAX_EVENTS)
> +		return -EINVAL;
> +
> +	pid = find_get_pid(task_pid);
> +	spin_lock(&kvm_lock);
> +	list_for_each_entry(kvm, &vm_list, vm_list) {
> +		if (kvm->pid == pid) {
> +			kvm_get_kvm(kvm);
> +			found = true;
> +			break;
> +		}
> +	}
> +	spin_unlock(&kvm_lock);
> +	put_pid(pid);
> +
> +	if (!found)
> +		return -EINVAL;
> +
> +	spin_lock(&host_pmu->event_list_lock);
> +	/* Make sure we don't already have the (event_id, kvm) pair */
> +	list_for_each_entry(pos, &host_pmu->event_list_head, event_list) {
> +		if (pos->cb->event == event_id &&
> +		    pos->kvm->pid == pid) {
> +			kvm_put_kvm(kvm);
> +			ret = -EOPNOTSUPP;
> +			goto unlock;
> +		}
> +	}
> +
> +	e_data = kzalloc(sizeof(*e_data), GFP_KERNEL);
> +	e_data->kvm = kvm;
> +	e_data->cb = &event_callbacks[event_id];
> +	INIT_WORK(&e_data->work, host_event_work);
> +	event->pmu_private = e_data;
> +	event->cpu = cpumask_first(cpu_online_mask);
> +	event->destroy = host_event_destroy;
> +
> +	list_add_tail(&e_data->event_list, &host_pmu->event_list_head);
> +
> +unlock:
> +	spin_unlock(&host_pmu->event_list_lock);
> +
> +	return ret;
> +}
> +
> +static void host_event_update(struct perf_event *event)
> +{
> +	struct event_data *e_data = event->pmu_private;
> +	struct kvm_event_cb *cb = e_data->cb;
> +	struct kvm *kvm = e_data->kvm;
> +	struct hw_perf_event *hw = &event->hw;
> +	u64 prev_count, new_count;
> +
> +	do {
> +		prev_count = local64_read(&hw->prev_count);
> +		new_count = cb->get_event_count(kvm);
> +	} while (local64_xchg(&hw->prev_count, new_count) != prev_count);
> +
> +	local64_add(new_count - prev_count, &event->count);
> +}
> +
> +static void host_event_start(struct perf_event *event, int flags)
> +{
> +	struct event_data *e_data = event->pmu_private;
> +	struct kvm_event_cb *cb = e_data->cb;
> +	struct kvm *kvm = e_data->kvm;
> +	u64 val;
> +
> +	val = cb->get_event_count(kvm);
> +	local64_set(&event->hw.prev_count, val);
> +
> +	e_data->enable = true;
> +	schedule_work(&e_data->work);
> +}
> +
> +static void host_event_stop(struct perf_event *event, int flags)
> +{
> +	struct event_data *e_data = event->pmu_private;
> +
> +	e_data->enable = false;
> +	schedule_work(&e_data->work);
> +
> +	if (flags & PERF_EF_UPDATE)
> +		host_event_update(event);
> +}
> +
> +static int host_event_add(struct perf_event *event, int flags)
> +{
> +	if (flags & PERF_EF_START)
> +		host_event_start(event, flags);
> +
> +	return 0;
> +}
> +
> +static void host_event_del(struct perf_event *event, int flags)
> +{
> +	host_event_stop(event, PERF_EF_UPDATE);
> +}
> +
> +static void host_event_read(struct perf_event *event)
> +{
> +	host_event_update(event);
> +}
> +
> +static void init_host_pmu(struct host_pmu *host_pmu)
> +{
> +	host_pmu->pmu = (struct pmu) {
> +		.task_ctx_nr	= perf_sw_context,
> +		.attr_groups	= attr_groups,
> +		.event_init	= host_event_init,
> +		.add		= host_event_add,
> +		.del		= host_event_del,
> +		.start		= host_event_start,
> +		.stop		= host_event_stop,
> +		.read		= host_event_read,
> +		.capabilities	= PERF_PMU_CAP_NO_INTERRUPT,
> +	};
> +
> +	INIT_LIST_HEAD(&host_pmu->event_list_head);
> +	spin_lock_init(&host_pmu->event_list_lock);
> +}
> +
> +int kvm_host_pmu_init(void)
> +{
> +	init_host_pmu(&host_pmu);
> +
> +	return perf_pmu_register(&host_pmu.pmu, "kvm", -1);
> +}
> +
> +void kvm_host_pmu_teardown(void)
> +{
> +	perf_pmu_unregister(&host_pmu.pmu);
> +}
> 

This patch really makes me think that there is nothing arm-specific in
here at all. Why can't it be a generic feature through which
architectures can expose events in a generic way (or as close as
possible to being generic)?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 6/9] kvm: arm/arm64: Add host pmu to support VM introspection
  2017-01-18 11:21   ` Marc Zyngier
@ 2017-01-18 11:35     ` Mark Rutland
  2017-01-18 13:01       ` Punit Agrawal
  2017-01-18 13:06     ` Punit Agrawal
  1 sibling, 1 reply; 23+ messages in thread
From: Mark Rutland @ 2017-01-18 11:35 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Punit Agrawal, kvmarm, linux-arm-kernel, linux-kernel, kvm,
	Christoffer Dall, Steven Rostedt, Peter Zijlstra, Will Deacon

On Wed, Jan 18, 2017 at 11:21:21AM +0000, Marc Zyngier wrote:
> On 10/01/17 11:38, Punit Agrawal wrote:
> > +#define VM_MASK	GENMASK_ULL(31, 0)
> > +#define EVENT_MASK	GENMASK_ULL(39, 32)
> > +#define EVENT_SHIFT	(32)
> > +
> > +#define to_pid(cfg)	((cfg) & VM_MASK)
> > +#define to_event(cfg)	(((cfg) & EVENT_MASK) >> EVENT_SHIFT)
> > +
> > +PMU_FORMAT_ATTR(vm, "config:0-31");
> > +PMU_FORMAT_ATTR(event, "config:32-39");
> 
> I'm a bit confused by these. Can't you get the PID of the VM you're
> tracing directly from perf, without having to encode things? And if you
> can't, surely this should be a function of the size of pid_t?
>
> Mark, can you shine some light here?

AFAICT, this is not necessary.

The perf_event_open() syscall takes a PID separately from the
perf_event_attr. i.e. we should be able to do:

// monitor a particular vCPU
perf_event_open(attr, vcpupid, -1, -1, 0)

... or .. 

// monitor a particular vCPU on a pCPU
perf_event_open(attr, vcpupid, cpu, -1, 0)

... or ...

// monitor all vCPUs on a pCPU
perf_event_open(attr, -1, cpu, -1, 0)

... so this shouldn't be necessary. AFAICT, this is a SW PMU, so there
should be no issue with using the perf_sw_context.

If this is a bodge to avoid opening a perf_event per vCPU thread, then I
completely disagree with the approach. This would be better handled in
userspace by discovering the set of threads and opening events for each.
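
For illustration, a minimal sketch of the per-vCPU case from userspace
(the "kvm" PMU name comes from this series; the config value for the
TLB event is a placeholder, not a fixed ABI):

#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

/* thin wrapper: glibc provides no stub for perf_event_open() */
static int perf_event_open(struct perf_event_attr *attr, pid_t pid,
			   int cpu, int group_fd, unsigned long flags)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

static int open_vcpu_tlb_event(pid_t vcpu_tid, uint32_t pmu_type)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = pmu_type;	/* from /sys/bus/event_source/devices/kvm/type */
	attr.config = 0;	/* placeholder for the tlb_invalidate event id */

	/* monitor one vCPU thread, on any CPU */
	return perf_event_open(&attr, vcpu_tid, -1, -1, 0);
}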

Thanks,
Mark.


* Re: [PATCH v3 6/9] kvm: arm/arm64: Add host pmu to support VM introspection
  2017-01-18 11:35     ` Mark Rutland
@ 2017-01-18 13:01       ` Punit Agrawal
  2017-01-18 13:18         ` Marc Zyngier
  2017-01-18 13:45         ` Will Deacon
  0 siblings, 2 replies; 23+ messages in thread
From: Punit Agrawal @ 2017-01-18 13:01 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Marc Zyngier, kvm, Peter Zijlstra, Will Deacon, linux-kernel,
	Steven Rostedt, linux-arm-kernel, kvmarm

Mark Rutland <mark.rutland@arm.com> writes:

> On Wed, Jan 18, 2017 at 11:21:21AM +0000, Marc Zyngier wrote:
>> On 10/01/17 11:38, Punit Agrawal wrote:
>> > +#define VM_MASK	GENMASK_ULL(31, 0)
>> > +#define EVENT_MASK	GENMASK_ULL(32, 39)
>> > +#define EVENT_SHIFT	(32)
>> > +
>> > +#define to_pid(cfg)	((cfg) & VM_MASK)
>> > +#define to_event(cfg)	(((cfg) & EVENT_MASK) >> EVENT_SHIFT)
>> > +
>> > +PMU_FORMAT_ATTR(vm, "config:0-31");
>> > +PMU_FORMAT_ATTR(event, "config:32-39");
>> 
>> I'm a bit confused by these. Can't you get the PID of the VM you're
>> tracing directly from perf, without having to encode things?

With perf attached to a PID, the event gets scheduled out when the task
is context switched. As the PID of the controlling process was used,
none of the vCPU events were counted.

> And if you
>> can't, surely this should be a function of the size of pid_t?

Agreed. I'll update above if we decide to carry on with this
approach. More below...

>>
>> Mark, can you shine some light here?
>
> AFAICT, this is not necessary.
>
> The perf_event_open() syscall takes a PID separately from the
> perf_event_attr. i.e. we should be able to do:
>
> // monitor a particular vCPU
> perf_event_open(attr, vcpupid, -1, -1, 0)
>
> ... or .. 
>
> // monitor a particular vCPU on a pCPU
> perf_event_open(attr, vcpupid, cpu, -1, 0)
>
> ... or ...
>
> // monitor all vCPUs on a pCPU
> perf_event_open(attr, -1, cpu, -1, 0)
>
> ... so this shouldn't be necessary. AFAICT, this is a SW PMU, so there
> should be no issue with using the perf_sw_context.

I might have missed it but none of the modes of invoking perf_event_open
allow monitoring a set of processes, i.e., all vCPUs belonging to a
particular VM, which was one of the aims and a feature I was carrying
over from the previous version. If we do not care about this...

>
> If this is a bodge to avoid opening a perf_event per vCPU thread, then I
> completely disagree with the approach. This would be better handled in
> userspace by discovering the set of threads and opening events for
> each.

... then requiring userspace to invoke perf_event_open per vCPU will
simplify this patch.

Marc, any objections?

>
> Thanks,
> Mark.


* Re: [PATCH v3 6/9] kvm: arm/arm64: Add host pmu to support VM introspection
  2017-01-18 11:21   ` Marc Zyngier
  2017-01-18 11:35     ` Mark Rutland
@ 2017-01-18 13:06     ` Punit Agrawal
  1 sibling, 0 replies; 23+ messages in thread
From: Punit Agrawal @ 2017-01-18 13:06 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, linux-arm-kernel, kvm, Peter Zijlstra, Will Deacon,
	linux-kernel, Steven Rostedt

Hi Marc,

Marc Zyngier <marc.zyngier@arm.com> writes:

> +Mark
>
> On 10/01/17 11:38, Punit Agrawal wrote:
>> Both AArch32 and AArch64 mode of the ARMv8 architecture support trapping
>> certain VM operations, e.g, TLB and cache maintenance
>> operations. Selective trapping of these operations for specific VMs can
>> be used to track the frequency with which these occur during execution.
>> 
>> Add a software PMU on the host that can support tracking VM
>> operations (in the form of PMU events). Supporting new events requires
>> providing callbacks to configure the VM to enable/disable the trapping
>> and read a count of the frequency.
>> 
>> The host PMU events can be controlled by tools like perf that use
>> standard kernel perf interfaces.
>> 
>> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
>> Cc: Christoffer Dall <christoffer.dall@linaro.org>
>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>> Cc: Will Deacon <will.deacon@arm.com>
>> ---
>>  arch/arm/include/asm/kvm_host.h   |   8 ++
>>  arch/arm/kvm/arm.c                |   2 +
>>  arch/arm64/include/asm/kvm_host.h |   8 ++
>>  virt/kvm/arm/host_pmu.c           | 272 ++++++++++++++++++++++++++++++++++++++
>>  4 files changed, 290 insertions(+)
>>  create mode 100644 virt/kvm/arm/host_pmu.c
>> 

[...]

>> 
>
> This patch really makes me think that there is nothing arm-specific in
> here at all. Why can't it be a generic feature through which
> architectures can expose events in a generic way (or as close as
> possible to being generic)?

I wasn't sure if other architectures are interested in the functionality
- that's the only reason. Having said that, it can be turned off in the
config so there shouldn't be any complaints.

I'll move this to virt/kvm in the next incarnation.

Thanks for taking a look.

>
> Thanks,
>
> 	M.


* Re: [PATCH v3 6/9] kvm: arm/arm64: Add host pmu to support VM introspection
  2017-01-18 13:01       ` Punit Agrawal
@ 2017-01-18 13:18         ` Marc Zyngier
  2017-01-18 14:51           ` Punit Agrawal
  2017-01-18 13:45         ` Will Deacon
  1 sibling, 1 reply; 23+ messages in thread
From: Marc Zyngier @ 2017-01-18 13:18 UTC (permalink / raw)
  To: Punit Agrawal, Mark Rutland
  Cc: kvm, Peter Zijlstra, Will Deacon, linux-kernel, Steven Rostedt,
	linux-arm-kernel, kvmarm

On 18/01/17 13:01, Punit Agrawal wrote:
> Mark Rutland <mark.rutland@arm.com> writes:
> 
>> On Wed, Jan 18, 2017 at 11:21:21AM +0000, Marc Zyngier wrote:
>>> On 10/01/17 11:38, Punit Agrawal wrote:
>>>> +#define VM_MASK	GENMASK_ULL(31, 0)
>>>> +#define EVENT_MASK	GENMASK_ULL(32, 39)
>>>> +#define EVENT_SHIFT	(32)
>>>> +
>>>> +#define to_pid(cfg)	((cfg) & VM_MASK)
>>>> +#define to_event(cfg)	(((cfg) & EVENT_MASK) >> EVENT_SHIFT)
>>>> +
>>>> +PMU_FORMAT_ATTR(vm, "config:0-31");
>>>> +PMU_FORMAT_ATTR(event, "config:32-39");
>>>
>>> I'm a bit confused by these. Can't you get the PID of the VM you're
>>> tracing directly from perf, without having to encode things?
> 
> With perf attached to a PID, the event gets scheduled out when the task
> is context switched. As the PID of the controlling process was used,
> none of the vCPU events were counted.
> 
>> And if you
>>> can't, surely this should be a function of the size of pid_t?
> 
> Agreed. I'll update above if we decide to carry on with this
> approach. More below...
> 
>>>
>>> Mark, can you shine some light here?
>>
>> AFAICT, this is not necessary.
>>
>> The perf_event_open() syscall takes a PID separately from the
>> perf_event_attr. i.e. we should be able to do:
>>
>> // monitor a particular vCPU
>> perf_event_open(attr, vcpupid, -1, -1, 0)
>>
>> ... or .. 
>>
>> // monitor a particular vCPU on a pCPU
>> perf_event_open(attr, vcpupid, cpu, -1, 0)
>>
>> ... or ...
>>
>> // monitor all vCPUs on a pCPU
>> perf_event_open(attr, -1, cpu, -1, 0)
>>
>> ... so this shouldn't be necessary. AFAICT, this is a SW PMU, so there
>> should be no issue with using the perf_sw_context.
> 
> I might have missed it but none of the modes of invoking perf_event_open
> allow monitoring a set of processes, i.e., all vCPUs belonging to a
> particular VM, which was one of the aims and a feature I was carrying
> over from the previous version. If we do not care about this...
> 
>>
>> If this is a bodge to avoid opening a perf_event per vCPU thread, then I
>> completely disagree with the approach. This would be better handled in
>> userspace by discovering the set of threads and opening events for
>> each.
> 
> ... then requiring userspace to invoke perf_event_open per vCPU will
> simplify this patch.
> 
> Marc, any objections?

Not so far, but I'm curious to find out how you determine which thread
is a vcpu, let alone a given vcpu.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...


* Re: [PATCH v3 6/9] kvm: arm/arm64: Add host pmu to support VM introspection
  2017-01-18 13:01       ` Punit Agrawal
  2017-01-18 13:18         ` Marc Zyngier
@ 2017-01-18 13:45         ` Will Deacon
  2017-01-18 14:58           ` Punit Agrawal
  1 sibling, 1 reply; 23+ messages in thread
From: Will Deacon @ 2017-01-18 13:45 UTC (permalink / raw)
  To: Punit Agrawal
  Cc: Mark Rutland, Marc Zyngier, kvm, Peter Zijlstra, linux-kernel,
	Steven Rostedt, linux-arm-kernel, kvmarm

On Wed, Jan 18, 2017 at 01:01:40PM +0000, Punit Agrawal wrote:
> Mark Rutland <mark.rutland@arm.com> writes:
> 
> > On Wed, Jan 18, 2017 at 11:21:21AM +0000, Marc Zyngier wrote:
> >> On 10/01/17 11:38, Punit Agrawal wrote:
> >> > +#define VM_MASK	GENMASK_ULL(31, 0)
> >> > +#define EVENT_MASK	GENMASK_ULL(32, 39)
> >> > +#define EVENT_SHIFT	(32)
> >> > +
> >> > +#define to_pid(cfg)	((cfg) & VM_MASK)
> >> > +#define to_event(cfg)	(((cfg) & EVENT_MASK) >> EVENT_SHIFT)
> >> > +
> >> > +PMU_FORMAT_ATTR(vm, "config:0-31");
> >> > +PMU_FORMAT_ATTR(event, "config:32-39");
> >> 
> >> I'm a bit confused by these. Can't you get the PID of the VM you're
> >> tracing directly from perf, without having to encode things?
> 
> With perf attached to a PID, the event gets scheduled out when the task
> is context switched. As the PID of the controlling process was used,
> none of the vCPU events were counted.

So it sounds like userspace needs to deal with this by attaching to the PIDs
of the vCPUs. Given that perf kvm seems to have knowledge of vCPUs, it would
be nice to know why that logic isn't reusable here. Take a look in
tools/perf/builtin-kvm.c and if it's not up to the job, then perhaps it can
be improved.

Will


* Re: [PATCH v3 6/9] kvm: arm/arm64: Add host pmu to support VM introspection
  2017-01-18 13:18         ` Marc Zyngier
@ 2017-01-18 14:51           ` Punit Agrawal
  2017-01-18 15:17             ` Mark Rutland
  0 siblings, 1 reply; 23+ messages in thread
From: Punit Agrawal @ 2017-01-18 14:51 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Mark Rutland, kvm, Peter Zijlstra, Will Deacon, linux-kernel,
	Steven Rostedt, kvmarm, linux-arm-kernel

Marc Zyngier <marc.zyngier@arm.com> writes:

> On 18/01/17 13:01, Punit Agrawal wrote:
>> Mark Rutland <mark.rutland@arm.com> writes:
>> 
>>> On Wed, Jan 18, 2017 at 11:21:21AM +0000, Marc Zyngier wrote:
>>>> On 10/01/17 11:38, Punit Agrawal wrote:
>>>>> +#define VM_MASK	GENMASK_ULL(31, 0)
>>>>> +#define EVENT_MASK	GENMASK_ULL(32, 39)
>>>>> +#define EVENT_SHIFT	(32)
>>>>> +
>>>>> +#define to_pid(cfg)	((cfg) & VM_MASK)
>>>>> +#define to_event(cfg)	(((cfg) & EVENT_MASK) >> EVENT_SHIFT)
>>>>> +
>>>>> +PMU_FORMAT_ATTR(vm, "config:0-31");
>>>>> +PMU_FORMAT_ATTR(event, "config:32-39");
>>>>
>>>> I'm a bit confused by these. Can't you get the PID of the VM you're
>>>> tracing directly from perf, without having to encode things?
>> 
>> With perf attached to a PID, the event gets scheduled out when the task
>> is context switched. As the PID of the controlling process was used,
>> none of the vCPU events were counted.
>> 
>>> And if you
>>>> can't, surely this should be a function of the size of pid_t?
>> 
>> Agreed. I'll update above if we decide to carry on with this
>> approach. More below...
>> 
>>>>
>>>> Mark, can you shine some light here?
>>>
>>> AFAICT, this is not necessary.
>>>
>>> The perf_event_open() syscall takes a PID separately from the
>>> perf_event_attr. i.e. we should be able to do:
>>>
>>> // monitor a particular vCPU
>>> perf_event_open(attr, vcpupid, -1, -1, 0)
>>>
>>> ... or .. 
>>>
>>> // monitor a particular vCPU on a pCPU
>>> perf_event_open(attr, vcpupid, cpu, -1, 0)
>>>
>>> ... or ...
>>>
>>> // monitor all vCPUs on a pCPU
>>> perf_event_open(attr, -1, cpu, -1, 0)
>>>
>>> ... so this shouldn't be necessary. AFAICT, this is a SW PMU, so there
>>> should be no issue with using the perf_sw_context.
>> 
>> I might have missed it but none of the modes of invoking perf_event_open
>> allow monitoring a set of processes, i.e., all vCPUs belonging to a
>> particular VM, which was one of the aims and a feature I was carrying
>> over from the previous version. If we do not care about this...
>> 
>>>
>>> If this is a bodge to avoid opening a perf_event per vCPU thread, then I
>>> completely disagree with the approach. This would be better handled in
>>> userspace by discovering the set of threads and opening events for
>>> each.
>> 
>> ... then requiring userspace to invoke perf_event_open per vCPU will
>> simplify this patch.
>> 
>> Marc, any objections?
>
> Not so far, but I'm curious to find out how you determine which thread
> is a vcpu, let alone a given vcpu.

I should've clarified in my reply that I wasn't looking to support the
third instance from Mark's examples above - "monitor all vCPUs on a
pCPU". I think it'll be quite expensive to figure out which threads from
a given pool are vCPUs.

For the other instances, we only need to find the vCPU for a given
pid. Userspace will hand us a pid that needs to be checked against vCPUs
to establish that it is a valid vCPU pid (here I was looking to use
kvm_vcpu->pid and kvm->pid introduced in Patch 2).

This will happen when setting up the event and the vCPU can be cached
for later use.
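
Roughly along these lines (illustrative only; the helper name is made
up, and it assumes the usual kvm_vcpu->pid tracking from
linux/kvm_host.h plus the pid added in Patch 2):

static struct kvm_vcpu *find_vcpu_by_pid(struct kvm *kvm, pid_t nr)
{
	struct pid *pid = find_get_pid(nr);
	struct kvm_vcpu *vcpu;
	int i;

	if (!pid)
		return NULL;

	kvm_for_each_vcpu(i, vcpu, kvm) {
		if (vcpu->pid == pid) {
			put_pid(pid);
			return vcpu;
		}
	}

	put_pid(pid);
	return NULL;
}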

>
> Thanks,
>
> 	M.


* Re: [PATCH v3 6/9] kvm: arm/arm64: Add host pmu to support VM introspection
  2017-01-18 13:45         ` Will Deacon
@ 2017-01-18 14:58           ` Punit Agrawal
  0 siblings, 0 replies; 23+ messages in thread
From: Punit Agrawal @ 2017-01-18 14:58 UTC (permalink / raw)
  To: Will Deacon
  Cc: kvm, Marc Zyngier, linux-kernel, Steven Rostedt, Peter Zijlstra,
	kvmarm, linux-arm-kernel

Will Deacon <will.deacon@arm.com> writes:

> On Wed, Jan 18, 2017 at 01:01:40PM +0000, Punit Agrawal wrote:
>> Mark Rutland <mark.rutland@arm.com> writes:
>> 
>> > On Wed, Jan 18, 2017 at 11:21:21AM +0000, Marc Zyngier wrote:
>> >> On 10/01/17 11:38, Punit Agrawal wrote:
>> >> > +#define VM_MASK	GENMASK_ULL(31, 0)
>> >> > +#define EVENT_MASK	GENMASK_ULL(32, 39)
>> >> > +#define EVENT_SHIFT	(32)
>> >> > +
>> >> > +#define to_pid(cfg)	((cfg) & VM_MASK)
>> >> > +#define to_event(cfg)	(((cfg) & EVENT_MASK) >> EVENT_SHIFT)
>> >> > +
>> >> > +PMU_FORMAT_ATTR(vm, "config:0-31");
>> >> > +PMU_FORMAT_ATTR(event, "config:32-39");
>> >> 
>> >> I'm a bit confused by these. Can't you get the PID of the VM you're
>> >> tracing directly from perf, without having to encode things?
>> 
>> With perf attached to a PID, the event gets scheduled out when the task
>> is context switched. As the PID of the controlling process was used,
>> none of the vCPU events were counted.
>
> So it sounds like userspace needs to deal with this by attaching to the PIDs
> of the vCPUs. Given that perf kvm seems to have knowledge of vCPUs, it would
> be nice to know why that logic isn't reusable here. Take a look in
> tools/perf/builtin-kvm.c and if it's not up to the job, then perhaps it can
> be improved.

Thanks for the pointer. I'll have a play with perf kvm to see if it can
be leveraged here.
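
Failing that, walking the VM's threads from userspace and opening one
event per tid should do the job - non-vCPU threads simply never count
anything. A rough sketch (error handling trimmed; the PMU type and
event id would come from sysfs as before):

#include <dirent.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

/* open one counter per thread of the VM process; returns how many were opened */
static int open_events_for_vm(pid_t vm_pid, uint32_t pmu_type, uint64_t config,
			      int *fds, int max)
{
	struct perf_event_attr attr;
	char path[64];
	struct dirent *d;
	DIR *dir;
	int n = 0;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = pmu_type;
	attr.config = config;

	snprintf(path, sizeof(path), "/proc/%d/task", (int)vm_pid);
	dir = opendir(path);
	if (!dir)
		return -1;

	while (n < max && (d = readdir(dir))) {
		pid_t tid = atoi(d->d_name);

		if (tid <= 0)	/* skips "." and ".." */
			continue;
		fds[n++] = syscall(__NR_perf_event_open, &attr, tid, -1, -1, 0);
	}

	closedir(dir);
	return n;
}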

>
> Will


* Re: [PATCH v3 6/9] kvm: arm/arm64: Add host pmu to support VM introspection
  2017-01-18 14:51           ` Punit Agrawal
@ 2017-01-18 15:17             ` Mark Rutland
  2017-01-18 16:17               ` Punit Agrawal
  0 siblings, 1 reply; 23+ messages in thread
From: Mark Rutland @ 2017-01-18 15:17 UTC (permalink / raw)
  To: Punit Agrawal
  Cc: Marc Zyngier, kvm, Peter Zijlstra, Will Deacon, linux-kernel,
	Steven Rostedt, kvmarm, linux-arm-kernel

On Wed, Jan 18, 2017 at 02:51:31PM +0000, Punit Agrawal wrote:
> I should've clarified in my reply that I wasn't looking to support the
> third instance from Mark's examples above - "monitor all vCPUs on a
> pCPU". I think it'll be quite expensive to figure out which threads from
> a given pool are vCPUs.

I'm not sure I follow why you would need to do that?

In that case, we'd open a CPU-bound perf event for the pCPU, which would
get installed in the CPU context immediately. It would be present for
all tasks.

Given it's present for all tasks, we don't need to figure out which
happen to have vCPUs. The !vCPU tasks simply shouldn't trigger events.

Am I missing something?

> For the other instances, we only need to find the vCPU for a given
> pid. Userspace will hand us a pid that needs to be checked against vCPUs
> to establish that it is a valid vCPU pid (here I was looking to use
> kvm_vcpu->pid and kvm->pid introduced in Patch 2).

Thinking about this further, a pid is not a unique identifier for either
a vCPU or a VM.

A single task (which has a single pid), could own multiple VMs, each
with multiple vCPUs. A thread pool (with several pids) could share those
arbitrarily. So we need VM and vCPU IDs which are distinct from pids or
tids.

I see that struct kvm_vcpu has a vcpu_id (which from a glance appears to
be local to the kvm instance). It's not clear to me if a kvm instance
could be shared by multiple processes, or if we can get away with a
process-local ID.

Thanks,
Mark.


* Re: [PATCH v3 6/9] kvm: arm/arm64: Add host pmu to support VM introspection
  2017-01-18 15:17             ` Mark Rutland
@ 2017-01-18 16:17               ` Punit Agrawal
  2017-01-18 18:05                 ` Mark Rutland
  0 siblings, 1 reply; 23+ messages in thread
From: Punit Agrawal @ 2017-01-18 16:17 UTC (permalink / raw)
  To: Mark Rutland
  Cc: kvm, Marc Zyngier, Will Deacon, linux-kernel, Steven Rostedt,
	Peter Zijlstra, kvmarm, linux-arm-kernel

Mark Rutland <mark.rutland@arm.com> writes:

> On Wed, Jan 18, 2017 at 02:51:31PM +0000, Punit Agrawal wrote:
>> I should've clarified in my reply that I wasn't looking to support the
>> third instance from Mark's examples above - "monitor all vCPUs on a
>> pCPU". I think it'll be quite expensive to figure out which threads from
>> a given pool are vCPUs.
>
> I'm not sure I follow why you would need to do that?
>
> In that case, we'd open a CPU-bound perf event for the pCPU, which would
> get installed in the CPU context immediately. It would be present for
> all tasks.
>
> Given it's present for all tasks, we don't need to figure out which
> happen to have vCPUs. The !vCPU tasks simply shouldn't trigger events.
>
> Am I missing something?

When enabling a CPU-bound event for pCPU, we'd have to enable trapping
of TLB operations for the vCPUs running on pCPU. Have a look at Patch
7/9.

Also, we'd have to enable/disable trapping when tasks are migrated
between pCPUs.

>
>> For the other instances, we only need to find the vCPU for a given
>> pid. Userspace will hand us a pid that needs to be checked against vCPUs
>> to establish that it is a valid vCPU pid (here I was looking to use
>> kvm_vcpu->pid and kvm->pid introduced in Patch 2).
>
> Thinking about this further, a pid is not a unique identifier for either
> a vCPU or a VM.
>
> A single task (which has a single pid), could own multiple VMs, each
> with multiple vCPUs. A thread pool (with several pids) could share those
> arbitrarily. So we need VM and vCPU IDs which are distinct from pids or
> tids.
>
> I see that struct kvm_vcpu has a vcpu_id (which from a glance appears to
> be local to the kvm instance). It's not clear to me if a kvm instance
> could be shared by multiple processes, or if we can get away with a
> process-local ID.

So far I've assumed that a VM pid is immutable. If that doesn't hold
then we need to think of another mechanism to refer to a VM from
userspace.

>
> Thanks,
> Mark.


* Re: [PATCH v3 6/9] kvm: arm/arm64: Add host pmu to support VM introspection
  2017-01-18 16:17               ` Punit Agrawal
@ 2017-01-18 18:05                 ` Mark Rutland
  2017-01-19 16:42                   ` Christoffer Dall
  0 siblings, 1 reply; 23+ messages in thread
From: Mark Rutland @ 2017-01-18 18:05 UTC (permalink / raw)
  To: Punit Agrawal
  Cc: kvm, Marc Zyngier, Will Deacon, linux-kernel, Steven Rostedt,
	Peter Zijlstra, kvmarm, linux-arm-kernel

On Wed, Jan 18, 2017 at 04:17:18PM +0000, Punit Agrawal wrote:
> Mark Rutland <mark.rutland@arm.com> writes:
> 
> > On Wed, Jan 18, 2017 at 02:51:31PM +0000, Punit Agrawal wrote:
> >> I should've clarified in my reply that I wasn't looking to support the
> >> third instance from Mark's examples above - "monitor all vCPUs on a
> >> pCPU". I think it'll be quite expensive to figure out which threads from
> >> a given pool are vCPUs.
> >
> > I'm not sure I follow why you would need to do that?
> >
> > In that case, we'd open a CPU-bound perf event for the pCPU, which would
> > get installed in the CPU context immediately. It would be present for
> > all tasks.
> >
> > Given it's present for all tasks, we don't need to figure out which
> > happen to have vCPUs. The !vCPU tasks simply shouldn't trigger events.
> >
> > Am I missing something?
> 
> When enabling a CPU-bound event for pCPU, we'd have to enable trapping
> of TLB operations for the vCPUs running on pCPU. Have a look at Patch
> 7/9.
> 
> Also, we'd have to enable/disable trapping when tasks are migrated
> between pCPUs.

Ah, so we can't configure the trap and leave it active, since it'll
affect the host.

We could have a per-cpu flag, and a hook into vcpu_run, but that's also
gnarly.
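
Very roughly, something like the below - the flag and hook are invented
names, and it only covers the arm64 side, assuming HCR_TTLB is the trap
bit the series toggles:

static DEFINE_PER_CPU(bool, host_pmu_trap_tlb);

/* hypothetical hook, called with preemption disabled just before guest entry */
static void host_pmu_sync_traps(struct kvm_vcpu *vcpu)
{
	if (__this_cpu_read(host_pmu_trap_tlb))
		vcpu->arch.hcr_el2 |= HCR_TTLB;
	else
		vcpu->arch.hcr_el2 &= ~HCR_TTLB;
}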

I'll have a think.

> So far I've assumed that a VM pid is immutable. If that doesn't hold
> then we need to think of another mechanism to refer to a VM from
> userspace.

Even if we can't migrate the VM between processes (i.e. it's immutable),
it's still not unique within a process, so I'm fairly sure we need
another mechanism (even if we get away with the common case today).

Thanks,
Mark.


* Re: [PATCH v3 6/9] kvm: arm/arm64: Add host pmu to support VM introspection
  2017-01-18 18:05                 ` Mark Rutland
@ 2017-01-19 16:42                   ` Christoffer Dall
  2017-01-23 11:21                     ` Punit Agrawal
  0 siblings, 1 reply; 23+ messages in thread
From: Christoffer Dall @ 2017-01-19 16:42 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Punit Agrawal, kvm, Marc Zyngier, Will Deacon, linux-kernel,
	Steven Rostedt, Peter Zijlstra, kvmarm, linux-arm-kernel

On Wed, Jan 18, 2017 at 06:05:46PM +0000, Mark Rutland wrote:
> On Wed, Jan 18, 2017 at 04:17:18PM +0000, Punit Agrawal wrote:
> > Mark Rutland <mark.rutland@arm.com> writes:
> > 
> > > On Wed, Jan 18, 2017 at 02:51:31PM +0000, Punit Agrawal wrote:
> > >> I should've clarified in my reply that I wasn't looking to support the
> > >> third instance from Mark's examples above - "monitor all vCPUs on a
> > >> pCPU". I think it'll be quite expensive to figure out which threads from
> > >> a given pool are vCPUs.
> > >
> > > I'm not sure I follow why you would need to do that?
> > >
> > > In that case, we'd open a CPU-bound perf event for the pCPU, which would
> > > get installed in the CPU context immediately. It would be present for
> > > all tasks.
> > >
> > > Given it's present for all tasks, we don't need to figure out which
> > > happen to have vCPUs. The !vCPU tasks simply shouldn't trigger events.
> > >
> > > Am I missing something?
> > 
> > When enabling a CPU-bound event for pCPU, we'd have to enable trapping
> > of TLB operations for the vCPUs running on pCPU. Have a look at Patch
> > 7/9.
> > 
> > Also, we'd have to enable/disable trapping when tasks are migrated
> > between pCPUs.
> 
> Ah, so we can't configure the trap and leave it active, since it'll
> affect the host.
> 
> We could have a per-cpu flag, and a hook into vcpu_run, but that's also
> gnarly.
> 
> I'll have a think.
> 
> > So far I've assumed that a VM pid is immutable. If that doesn't hold
> > then we need to think of another mechanism to refer to a VM from
> > userspace.
> 
> Even if we can't migrate the VM between processes (i.e. it's immutable),
> it's still not unique within a process, so I'm fairly sure we need
> another mechanism (even if we get away with the common case today).
> 
I don't understand what the requirements here are exactly but the KVM
API documentation says:

  In general file descriptors can be migrated among processes by means
  of fork() and the SCM_RIGHTS facility of unix domain socket.  These
  kinds of tricks are explicitly not supported by kvm.  While they will
  not cause harm to the host, their actual behavior is not guaranteed by
  the API.  The only supported use is one virtual machine per process,
  and one vcpu per thread.

So this code should maintain those semantics and it's fair to assume the
thread group leader of a given VM stays the same, but the code must not
rely on this fact for safe operations.

I also don't see why a process couldn't open multiple VMs; however
messy that may be, it appears possible to me.

-Christoffer


* Re: [PATCH v3 6/9] kvm: arm/arm64: Add host pmu to support VM introspection
  2017-01-19 16:42                   ` Christoffer Dall
@ 2017-01-23 11:21                     ` Punit Agrawal
  0 siblings, 0 replies; 23+ messages in thread
From: Punit Agrawal @ 2017-01-23 11:21 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Mark Rutland, kvm, Marc Zyngier, Will Deacon, linux-kernel,
	Steven Rostedt, Peter Zijlstra, kvmarm, linux-arm-kernel

Hi Christoffer,

Christoffer Dall <christoffer.dall@linaro.org> writes:

> On Wed, Jan 18, 2017 at 06:05:46PM +0000, Mark Rutland wrote:
>> On Wed, Jan 18, 2017 at 04:17:18PM +0000, Punit Agrawal wrote:
>> > Mark Rutland <mark.rutland@arm.com> writes:
>> > 
>> > > On Wed, Jan 18, 2017 at 02:51:31PM +0000, Punit Agrawal wrote:
>> > >> I should've clarified in my reply that I wasn't looking to support the
>> > >> third instance from Mark's examples above - "monitor all vCPUs on a
>> > >> pCPU". I think it'll be quite expensive to figure out which threads from
>> > >> a given pool are vCPUs.
>> > >
>> > > I'm not sure I follow why you would need to do that?
>> > >
>> > > In that case, we'd open a CPU-bound perf event for the pCPU, which would
>> > > get installed in the CPU context immediately. It would be present for
>> > > all tasks.
>> > >
>> > > Given it's present for all tasks, we don't need to figure out which
>> > > happen to have vCPUs. The !vCPU tasks simply shouldn't trigger events.
>> > >
>> > > Am I missing something?
>> > 
>> > When enabling a CPU-bound event for pCPU, we'd have to enable trapping
>> > of TLB operations for the vCPUs running on pCPU. Have a look at Patch
>> > 7/9.
>> > 
>> > Also, we'd have to enable/disable trapping when tasks are migrated
>> > between pCPUs.
>> 
>> Ah, so we can't configure the trap and leave it active, since it'll
>> affect the host.
>> 
>> We could have a per-cpu flag, and a hook into vcpu_run, but that's also
>> gnarly.
>> 
>> I'll have a think.
>> 
>> > So far I've assumed that a VM pid is immutable. If that doesn't hold
>> > then we need to think of another mechanism to refer to a VM from
>> > userspace.
>> 
>> Even if we can't migrate the VM between processes (i.e. it's immutable),
>> it's still not unique within a process, so I'm fairly sure we need
>> another mechanism (even if we get away with the common case today).
>> 
> I don't understand what the requirements here are exactly but the KVM
> API documentation says:
>
>   In general file descriptors can be migrated among processes by means
>   of fork() and the SCM_RIGHTS facility of unix domain socket.  These
>   kinds of tricks are explicitly not supported by kvm.  While they will
>   not cause harm to the host, their actual behavior is not guaranteed by
>   the API.  The only supported use is one virtual machine per process,
>   and one vcpu per thread.
>
> So this code should maintain those semantics and it's fair to assume the
> thread group leader of a given VM stays the same, but the code must not
> rely on this fact for safe operations.

Thanks for clarifying. The current version passes muster on these
assumptions, but I'll have to take a closer look to convince myself of
the safety.

By moving to vCPU pids in the next version, things should further
improve in this regard.

>
> I also don't see why a process couldn't open multiple VMs; however
> messy that may be, it appears possible to me.

I imagine there is an implicit reliance on the VMM to handle any
resulting fallout if it chooses to do this.

>
> -Christoffer


end of thread, other threads:[~2017-01-23 11:21 UTC | newest]

Thread overview: 23+ messages
2017-01-10 11:38 [PATCH v3 0/9] Add support for monitoring guest TLB operations Punit Agrawal
2017-01-10 11:38 ` [PATCH v3 1/9] arm64/kvm: hyp: tlb: use __tlbi() helper Punit Agrawal
2017-01-10 11:38 ` [PATCH v3 2/9] KVM: Track the pid of the VM process Punit Agrawal
2017-01-10 11:38 ` [PATCH v3 3/9] KVM: Add event to trace tlb invalidations Punit Agrawal
2017-01-10 11:38 ` [PATCH v3 4/9] arm: KVM: Handle trappable TLB instructions Punit Agrawal
2017-01-10 11:38 ` [PATCH v3 5/9] arm64: " Punit Agrawal
2017-01-10 11:38 ` [PATCH v3 6/9] kvm: arm/arm64: Add host pmu to support VM introspection Punit Agrawal
2017-01-18 11:21   ` Marc Zyngier
2017-01-18 11:35     ` Mark Rutland
2017-01-18 13:01       ` Punit Agrawal
2017-01-18 13:18         ` Marc Zyngier
2017-01-18 14:51           ` Punit Agrawal
2017-01-18 15:17             ` Mark Rutland
2017-01-18 16:17               ` Punit Agrawal
2017-01-18 18:05                 ` Mark Rutland
2017-01-19 16:42                   ` Christoffer Dall
2017-01-23 11:21                     ` Punit Agrawal
2017-01-18 13:45         ` Will Deacon
2017-01-18 14:58           ` Punit Agrawal
2017-01-18 13:06     ` Punit Agrawal
2017-01-10 11:38 ` [PATCH v3 7/9] kvm: host_pmu: Add support for tracking guest TLB operations Punit Agrawal
2017-01-10 11:38 ` [PATCH v3 8/9] arm: KVM: Enable support for host pmu Punit Agrawal
2017-01-10 11:38 ` [PATCH v3 9/9] arm64: KVM: Enable support for the " Punit Agrawal
