From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: QEMU PIC indirection patch for in-kernel APIC work
Date: Thu, 05 Apr 2007 18:01:35 +0300
Message-ID: <46150F4F.4030505@qumranet.com>
References: <0E6FE5D295DE5B4B8D9070C26A227987F2444B@pdsmsx411.ccr.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
To: "Dong, Eddie"
In-Reply-To: <0E6FE5D295DE5B4B8D9070C26A227987F2444B-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
Sender: kvm-devel-bounces-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
Errors-To: kvm-devel-bounces-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
List-Id: kvm.vger.kernel.org

Dong, Eddie wrote:
> Avi Kivity wrote:
>
>>> With PIC in Xen, CPU2K gets 6.5% performance gain in old 1000HZ linux
>>> kernel, KB gets 14% gain. We also did a shared PIC model which share
>>> PIC state among Qemu & VMM with less LOC in VMM, it can get
>>> similar performance gain (5.8% in my test).
>>> BTW, at that time, PIT is in VMM already.
>>>
>> I expect that the gain in kvm will be smaller.  Xen has to schedule
>> dom0 to process the event channel (possibly on another cpu), dom0 has
>> to schedule qemu-dm (again, possibly on another cpu), qemu does its
>> thing, and then Xen has to schedule domU again.  With kvm, we are
>> always on the same cpu, and the only overhead is the system call,
>> which is a few hundred nanoseconds.  I expect with current hardware
>> that it will be negligible (as a vmexit is measured in microseconds),
>> but to become measurable as hardware improves.
>>
> Yes very possible.
> We can take a quick mesurement to see how many cycles are spent in a
> dummy I/O emulation in KVM/Qemu. In Xen, one of my old P4 3.8GHZ
> platform takes about 50-60K cycles. We can see how many is it in KVM.
> BTW, today Linux kernel is no longer 1000HZ :-)
> thx,eddie
>

There's some (old) data here:

  http://virt.kernelnewbies.org/KVM/Performance

showing pio latency of ~5600 cycles.  Note that this is on AMD, which
takes fewer cycles to switch than the P4, but on the other hand, we
still do a save/restore of the fpu state on every exit, so we can speed
it up even more.

-- 
error compiling committee.c: too many arguments to function