From: Ingo Molnar
To: Avi Kivity
Cc: "Zhang, Yanmin", Peter Zijlstra, ming.m.lin@intel.com,
    sheng.yang@intel.com, Jes Sorensen, KVM General, Zachary Amsden,
    Gleb Natapov, Arnaldo Carvalho de Melo, Frédéric Weisbecker,
    Thomas Gleixner, "H. Peter Anvin", Arjan van de Ven
Subject: Re: Enhance perf to support KVM
Date: Fri, 26 Feb 2010 11:35:45 +0100
Message-ID: <20100226103545.GA7463@elte.hu>
In-Reply-To: <4B879A2F.50203@redhat.com>
References: <1267068445.1726.25.camel@localhost> <1267089644.12790.74.camel@laptop> <1267152599.1726.76.camel@localhost> <20100226090147.GH15885@elte.hu> <4B879A2F.50203@redhat.com>

* Avi Kivity wrote:

> On 02/26/2010 11:01 AM, Ingo Molnar wrote:
> > * Zhang, Yanmin wrote:
> >
> > > 2) We couldn't get guest os kernel/user stack data in an easy way, so
> > > we might not support the callchain feature of the perf tool. A
> > > workaround is for KVM to copy kernel stack data out, so we could at
> > > least support guest os kernel callchains.
> >
> > If the guest is Linux, KVM can get all the info we need.
> >
> > While the PMU event itself might trigger in an NMI (where we cannot
> > access most of KVM's data structures safely), for this specific case of
> > KVM instrumentation we can delay the processing to a more appropriate
> > time - in fact we can do it in the KVM thread itself.
>
> The nmi will be a synchronous event: it happens in guest context, and we
> program the hardware to intercept nmis, so we just get an exit telling us
> that an nmi has happened.
> (would also be interesting to allow the guest to process the nmi directly
> in some scenarios, though that would require that there be no nmi sources
> on the host).
>
> > We can do that because we just triggered a VM exit, so the VM state is
> > for all purposes frozen (as far as this virtual CPU goes).
>
> Yes.
>
> > Which gives us plenty of time and opportunity to piggy-back to the KVM
> > thread, look up the guest stack, process/fill the MMU cache as we walk
> > the guest page tables, etc. etc.
> >
> > It would need some minimal callback facility towards KVM, triggered by
> > a perf event PMI.
>
> Since the event is synchronous and kvm is aware of it we don't need a
> callback; kvm can call directly into perf with all the information.

Yes - it's still a "callback" in the abstract sense. Much of it already
exists.

> > One additional step needed is to get symbol information from the guest,
> > and to integrate it into the symbol cache on the host side in ~/.debug.
> > We already support cross-arch symbols and 'perf archive', so the basic
> > facilities are there for that. So you can profile on 32-bit PA-RISC and
> > type 'perf report' on 64-bit x86 and get all the right info.
> >
> > For this to work across a guest, a gateway is needed towards the guest.
> > There's several ways to achieve this. The most practical would be two
> > steps:
> >
> >  - a user-space facility to access guest images/libraries. (say via
> >    ssh, or just a plain TCP port) This would be useful for general
> >    'remote profiling' sessions as well, so it's not KVM specific - it
> >    would be useful for remote debugging too.
> >
> >  - The guest /proc/kallsyms (and vmlinux) could be accessed via that
> >    channel as well.
> >
> > (Note that this is purely for guest symbol space access - all the
> > profiling data itself comes via the host kernel.)
> > In theory we could build some sort of 'symbol server' facility into the
> > kernel, which could be enabled in guest kernels too - but i suspect
> > existing, user-space transports go most of the way already.
>
> There is also vmchannel aka virtio-serial, a guest-to-host communication
> channel.

Basically what is needed is plain filesystem access - properly privileged.
So doing this via a vmchannel would be nice, but for the symbol extraction
it would be a glorified NFS server in essence.

Do you have (or plan) any turn-key 'access to all files of the guest' kind
of guest-transparent facility that could be used for such purposes?

That would have various advantages over a traditional explicit file server
approach:

 - it would not contaminate the guest port space

 - no guest side configuration needed (the various oprofile remote daemons
   always sucked as they needed extra setup)

 - it might even be used with a guest that does no networking

 - if done fully in the kernel it could be done with a fully 'unaware'
   guest, etc.

Thanks,

	Ingo