* slow guest performance with build load, looking for ideas
@ 2009-06-12 21:04 Erik Jacobson
  2009-06-14  9:33 ` Avi Kivity
  0 siblings, 1 reply; 25+ messages in thread
From: Erik Jacobson @ 2009-06-12 21:04 UTC (permalink / raw)
  To: kvm

We have been trying to test qemu-kvm virtual machines under an IO load.
The IO load is quite simple: A timed build of the linux kernel and modules.
I have found that virtual machines take more than twice as long to do this
build as the host.  It doesn't seem to matter whether I use virtio or not.  Using
the same device and the same filesystem, the host is more than twice as fast.

We're hoping that we can get some advice on how to address this issue.  If
there are any options I should add for our testing, we'd appreciate it.  I'm
also game to try development bits to see if they make a difference.  If it
turns out "that is just the way it is right now", we'd like to know that
too.

For these tests, I used Fedora 11 as the virtualization server.  I did this
because it has recent bits.  I experimented with SLES11 and Fedora11 guests.

In general, I used virt-manager to do the setup and launching.  So the
qemu-kvm command lines are based on that (and this explains why they are
a bit long).  I then modified the qemu-kvm command line to perform other
variations of the test.  Example command lines can be found at the end of
this message.

I performed tests on two different systems to be sure it isn't related to
specific hardware.

------------------
------------------
kernel/sw versions
------------------
------------------
virt host (always fedora 11): 2.6.29.4-167.fc11.x86_64
guest (same as above for fedora 11 guests, SLES 11 GA kernel for SLES guests)
qemu-kvm: qemu-kvm-0.10.4-4.fc11.x86_64
libvirt: libvirt-0.6.2-11.fc11.x86_64

----------------
----------------
Test description
----------------
----------------
The test I ran in different scenarios was always the same:
Running a build of the linux kernel and modules and timing the result.
I decided on this test because we tend to make build servers out of new
hardware and software releases to help put them through their paces.

In all cases, the work area was on a device separate from the root.
A disk device was always fed to qemu-kvm in its entirety.  The roots were
disk images, but the work area was always a fully imported device.  The one
exception was a couple of test runs using nfs from the host mounted on the guest.

The test build filesystem was always ext3 (except for the case of
nfs-from-host, where it was ext3 on the host and nfs on the guest).  The
filesystem was simply mounted by hand with the mount command and no special
options.
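
For example, it was nothing fancier than this (the device name and mount
point here are illustrative only; /dev/sdb is the work-area disk handed to
qemu-kvm in the command lines below, and a virtio guest would typically see
it as /dev/vdb instead):

 # mount -t ext3 /dev/sdb /work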

The run would look something like this... Setup:
 $ cd /work/erikj/linux-2.6.29.4
 $ cp arch/x86/configs/x86_64_defconfig .config
 $ make oldconfig
 $ make -j12  [ but not counted in the test results ]

The part of the test repeated for each run
 $ make -j12 clean
 $ time (make -j12 && make -j12 modules)   # represents posted results

The output of that timing command is what is pasted in the results below.

------------------
------------------
Testing on host 1:
------------------
------------------
Host distro: Fedora 11
Guest distro: Fedora 11 and SLES11
8 vcpus provided to guest, 2048 megabytes of memory

Virtualization host system information:
System type: SGI Altix XE 310, Supermicro X7DBT mainboard
Memory: 4 GB, DDR2, 667 MHz
CPUs: 8 core, Xeon 2.33GHz, 4096 KB cache size
disk 1 (root, 50gb part): HDS725050KLA360  (500gb, 7200 rpm, SATA, 8.5ms seek)
disk 2 (work area): HDT722525DLA380 (250GB, 7200 rpm, SATA, 8.5ms seek)

fedora11 host, no guest (baseline)
-----------------------
  -> real  10m38.116s  user  43m25.553s  sys   11m29.004s

fedora11 host, sles11 guest
---------------------------
 virtio, work area imported as a full device (not nfs)
  -> real  26m2.004s  user  99m29.177s  sys   30m31.586s

 virtio for root but workarea nfs-mounted from host
  -> real  68m37.306s  user  76m0.445s  sys   67m17.888s

fedora11 host, fedora11 guest
-----------------------------
 IDE emulation, no virtio, workarea device fully imported to the guest
  -> real  29m47.249s  user  59m1.583s  sys   41m34.281s

 Same as above, but with qemu cache=none parameter
  -> real  26m1.668s  user  66m14.812s  sys   46m21.366s

 virtio devices, device fully imported to guest for workarea, cache=none
  -> real  23m28.397s  user  68m27.730s  sys   47m50.256s

 Didn't do NFS testing in this scenario.


------------------
------------------
Testing on host 2:
------------------
------------------
Host distro: Fedora 11
Guest distro: Fedora 11
8 vcpus provided to guest, 4096 megabytes of memory

System type: SGI Altix XE 250, Supermicro X7DWN+ mainboard
Memory: 8 x 1 GB DDR2 667 MHz DIMMs
CPUs: 8 Intel Xeon X5460, 3.16 GHz, 6144 KB cache
disk1: LSI MegaRAID volume, 292gb, but root slice used is only 25gb
disk2: LSI MegaRAID volume, 100gb, full space used for build work area

fedora11 host, no guest (baseline)
-----------------------
 -> real  6m25.008s   user  30m54.697s   sys   8m17.359s

fedora11 host, fedora11 guest
-----------------------------
  virtio, no cache= parameter supplied to qemu:
  -> real  19m46.770s   user  52m33.523s   sys   42m55.202s

  virtio guest, qemu cache=none parameter supplied:
  -> real  18m17.690s   user  51m3.223s   sys   41m22.047s

  IDE emulation , no cache parameter:
  -> real  22m41.472s   user  44m48.190s   sys   38m3.750s

  IDE emulation, qemu cache=none parameter supplied:
  -> real  19m53.111s   user  48m48.342s   sys  40m19.469s

---------------------------------------------
---------------------------------------------
Example qemu-kvm command lines for the tests:
---------------------------------------------
---------------------------------------------
virtio, no cache= parameter supplied to qemu:
Note: This is also exactly the command that libvirt ran.

/usr/bin/qemu-kvm -S -M pc -m 4096 -smp 8 -name f11-test \
  -uuid b7b4b7e4-9c07-22aa-0c95-d5c8a24176c5 -monitor pty \
  -pidfile /var/run/libvirt/qemu//f11-test.pid -boot c \
  -drive file=,if=ide,media=cdrom,index=2 \
  -drive file=/var/lib/libvirt/images/f11-test.img,if=virtio,index=0,boot=on \
  -drive file=/dev/sdb,if=virtio,index=1 \
  -net nic,macaddr=54:52:00:46:48:0e,vlan=0,model=virtio \
  -net tap,fd=20,script=,vlan=0,ifname=vnet0 -serial pty -parallel none -usb \
  -usbdevice tablet -vnc 127.0.0.1:0 -soundhw es1370


virtio guest, qemu cache=none parameter supplied:
Note: The command was modified so that running qemu by hand worked, including
setting up a tun interface so the network bridge works correctly outside of
libvirt.  The same applies to the following command lines.

/usr/bin/qemu-kvm -M pc -m 4096 -smp 8 -name f11-test \
  -uuid b7b4b7e4-9c07-22aa-0c95-d5c8a24176c5 -monitor pty \
  -pidfile /var/run/libvirt/qemu//f11-test.pid -boot c \
  -drive file=,if=ide,media=cdrom,index=2 \
  -drive file=/var/lib/libvirt/images/f11-test.img,if=virtio,index=0,boot=on \
  -drive file=/dev/sdb,if=virtio,cache=none,index=1 \
  -net nic,macaddr=54:52:00:46:48:0e,vlan=0,model=virtio \
  -net tap,script=no,vlan=0,ifname=tap0 -serial pty -parallel none -usb \
  -usbdevice tablet -soundhw es1370

IDE emulation, no cache parameter:
/usr/bin/qemu-kvm -M pc -m 4096 -smp 8 -name f11-test \
  -uuid b7b4b7e4-9c07-22aa-0c95-d5c8a24176c5 -monitor pty \
  -pidfile /var/run/libvirt/qemu//f11-test.pid -boot c \
  -drive file=,if=ide,media=cdrom,index=2 \
  -drive file=/var/lib/libvirt/images/f11-test.img,if=virtio,index=0,boot=on \
  -drive file=/dev/sdb,if=ide,index=1 \
  -net nic,macaddr=54:52:00:46:48:0e,vlan=0,model=virtio \
  -net tap,script=no,vlan=0,ifname=tap0 -serial pty -parallel none -usb \
  -usbdevice tablet -soundhw es1370

IDE emulation, qemu cache=none parameter supplied:
/usr/bin/qemu-kvm -M pc -m 4096 -smp 8 -name f11-test \
  -uuid b7b4b7e4-9c07-22aa-0c95-d5c8a24176c5 -monitor pty \
  -pidfile /var/run/libvirt/qemu//f11-test.pid -boot c \
  -drive file=,if=ide,media=cdrom,index=2 \
  -drive file=/var/lib/libvirt/images/f11-test.img,if=virtio,index=0,boot=on \
  -drive file=/dev/sdb,if=ide,cache=none,index=1 \
  -net nic,macaddr=54:52:00:46:48:0e,vlan=0,model=virtio \
  -net tap,script=no,vlan=0,ifname=tap0 -serial pty -parallel none -usb \
  -usbdevice tablet -soundhw es1370


* Re: slow guest performance with build load, looking for ideas
  2009-06-12 21:04 slow guest performance with build load, looking for ideas Erik Jacobson
@ 2009-06-14  9:33 ` Avi Kivity
  2009-06-15 14:15   ` Erik Jacobson
                     ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Avi Kivity @ 2009-06-14  9:33 UTC (permalink / raw)
  To: Erik Jacobson; +Cc: kvm

Erik Jacobson wrote:
> We have been trying to test qemu-kvm virtual machines under an IO load.
> The IO load is quite simple: A timed build of the linux kernel and modules.
> I have found that virtual machines take more than twice as long to do this
> build as the host.  It doesn't seem to matter if I use virtio or not,  Using
> the same device and same filesystem, the host is more than twice as fast.
>
> We're hoping that we can get some advice on how to address this issue.  If
> there are any options I should add for our testing, we'd appreciate it.  I'm
> also game to try development bits to see if they make a difference.  If it
> turns out "that is just the way it is right now", we'd like to know that
> too.
>
> For these tests, I used Fedora 11 as the virtualization server.  I did this
> because it has recent bits.  I experimented with SLES11 and Fedora11 guests.
>
> In general, I used virt-manager to do the setup and launching.  So the
> qemu-kvm command lines are based on that (and this explains why they are
> a bit long).  I then modified the qemu-kvm command line to perform other
> variations of the test.  Example command lines can be found at the end of
> this message.
>
> I performed tests on two different systems to be sure it isn't related to
> specific hardware.
>   

What is the host cpu type?  On pre-Nehalem/Barcelona processors kvm has 
poor scalability in mmu intensive workloads like kernel builds.

-- 
error compiling committee.c: too many arguments to function



* Re: slow guest performance with build load, looking for ideas
  2009-06-14  9:33 ` Avi Kivity
@ 2009-06-15 14:15   ` Erik Jacobson
  2009-06-15 14:24     ` Avi Kivity
  2009-06-18 23:07   ` Erik Jacobson
  2009-07-03 10:41   ` Matty
  2 siblings, 1 reply; 25+ messages in thread
From: Erik Jacobson @ 2009-06-15 14:15 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Erik Jacobson, kvm

> What is the host cpu type?  On pre-Nehalem/Barcelona processors kvm has  
> poor scalability in mmu intensive workloads like kernel builds.

Thanks for getting back to me.

Today is pretty booked but I'm going to go find a Nehalem system and try to
run similar tests to compare.  I'll post my results to this thread.

So if I understand what you're saying: best not to use kvm guests for build
servers with pre-Nehalem processors.

Both systems I used were pre-Nehalem.  Here is a cpuinfo snip from both
systems I tested on:

processor	: 7
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Xeon(R) CPU           E5345  @ 2.33GHz
stepping	: 7
cpu MHz		: 2327.500
cache size	: 4096 KB
physical id	: 1
siblings	: 4
core id		: 3
cpu cores	: 4
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr dca lahf_lm
bogomips	: 4655.14
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:


and


processor	: 7
vendor_id	: GenuineIntel
cpu family	: 6
model		: 23
model name	: Intel(R) Xeon(R) CPU           X5460  @ 3.16GHz
stepping	: 6
cpu MHz		: 3158.307
cache size	: 6144 KB
physical id	: 1
siblings	: 4
core id		: 3
cpu cores	: 4
apicid		: 7
initial apicid	: 7
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm tpr_shadow vnmi flexpriority
bogomips	: 6317.51
clflush size	: 64
cache_alignment	: 64
address sizes	: 38 bits physical, 48 bits virtual
power management:


* Re: slow guest performance with build load, looking for ideas
  2009-06-15 14:15   ` Erik Jacobson
@ 2009-06-15 14:24     ` Avi Kivity
  2009-06-15 15:25       ` Michael Tokarev
  0 siblings, 1 reply; 25+ messages in thread
From: Avi Kivity @ 2009-06-15 14:24 UTC (permalink / raw)
  To: Erik Jacobson; +Cc: kvm

On 06/15/2009 05:15 PM, Erik Jacobson wrote:
>> What is the host cpu type?  On pre-Nehalem/Barcelona processors kvm has
>> poor scalability in mmu intensive workloads like kernel builds.
>>      
>
> Thanks for getting back to me.
>
> Today is pretty booked but I'm going to go find a Nehalem system and try to
> run similar tests to compare.  I'll post my results to this thread.
>
> So if I understand what you're saying: best not to use kvm guests for build
> servers with pre-Nehalem processors.
>    

pre-Nehalem / pre-Barcelona, > 4 vcpus, yes.

> Both systems I used were pre-Nehalem.  Here is a cpuinfo snip from both
> systems I tested on:
>    

Yes, so I expect you're seeing contention on kvm->mmu_lock.

-- 
error compiling committee.c: too many arguments to function



* Re: slow guest performance with build load, looking for ideas
  2009-06-15 14:24     ` Avi Kivity
@ 2009-06-15 15:25       ` Michael Tokarev
  2009-06-15 15:27         ` Avi Kivity
  0 siblings, 1 reply; 25+ messages in thread
From: Michael Tokarev @ 2009-06-15 15:25 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Erik Jacobson, kvm

Avi Kivity wrote:
> On 06/15/2009 05:15 PM, Erik Jacobson wrote:
[]
>> So if I understand what you're saying: best not to use kvm guests for  build
>> servers with pre-Nehalem processors.
> 
> pre-Nehalem / pre-Barcelona, > 4 vcpus, yes.

How about 2 vcpus, and how about AMD processors ?

Thanks

/mjt


* Re: slow guest performance with build load, looking for ideas
  2009-06-15 15:25       ` Michael Tokarev
@ 2009-06-15 15:27         ` Avi Kivity
  2009-06-16  7:03           ` Michael Tokarev
  0 siblings, 1 reply; 25+ messages in thread
From: Avi Kivity @ 2009-06-15 15:27 UTC (permalink / raw)
  To: Michael Tokarev; +Cc: Erik Jacobson, kvm

On 06/15/2009 06:25 PM, Michael Tokarev wrote:
> Avi Kivity wrote:
>> On 06/15/2009 05:15 PM, Erik Jacobson wrote:
> []
>>> So if I understand what you're saying: best not to use kvm guests 
>>> for  build
>>> servers with pre-Nehalem processors.
>>
>> pre-Nehalem / pre-Barcelona, > 4 vcpus, yes.
>
> How about 2 vcpus, and how about AMD processors ?

2 vcpus (or 4) should be fine.  AMD processors (Barcelona+) would be 
good for any number of vcpus.

-- 
error compiling committee.c: too many arguments to function



* Re: slow guest performance with build load, looking for ideas
  2009-06-15 15:27         ` Avi Kivity
@ 2009-06-16  7:03           ` Michael Tokarev
  2009-06-16  8:07             ` Avi Kivity
  0 siblings, 1 reply; 25+ messages in thread
From: Michael Tokarev @ 2009-06-16  7:03 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Erik Jacobson, kvm

Avi Kivity wrote:
> On 06/15/2009 06:25 PM, Michael Tokarev wrote:
>> Avi Kivity wrote:
>>> On 06/15/2009 05:15 PM, Erik Jacobson wrote:
>> []
>>>> So if I understand what you're saying: best not to use kvm guests for build
>>>> servers with pre-Nehalem processors.
>>>
>>> pre-Nehalem / pre-Barcelona, > 4 vcpus, yes.
>>
>> How about 2 vcpus, and how about AMD processors ?
> 
> 2 vcpus (or 4) should be fine.  AMD processors (Barcelona+) would be 
> good for any number of vcpus.

Hmm.. that's sorta good (not so good for owners of most
Intel CPUs -- Nehalem just started its life).  But still
confusing.  Namely, 2..4 vcpus per GUEST or HOST -- for
the pre-Nehalem ones? :)

Thanks!

/mjt


* Re: slow guest performance with build load, looking for ideas
  2009-06-16  7:03           ` Michael Tokarev
@ 2009-06-16  8:07             ` Avi Kivity
  0 siblings, 0 replies; 25+ messages in thread
From: Avi Kivity @ 2009-06-16  8:07 UTC (permalink / raw)
  To: Michael Tokarev; +Cc: Erik Jacobson, kvm

On 06/16/2009 10:03 AM, Michael Tokarev wrote:
>>>>> So if I understand what you're saying: best not to use kvm guests 
>>>>> for build
>>>>> servers with pre-Nehalem processors.
>>>>
>>>> pre-Nehalem / pre-Barcelona, > 4 vcpus, yes.
>>>
>>> How about 2 vcpus, and how about AMD processors ?
>>
>> 2 vcpus (or 4) should be fine.  AMD processors (Barcelona+) would be 
>> good for any number of vcpus.
> []
>
> Hmm.. that's sorta good (not so good for owners of most
> Intel CPUs -- Nehalem just started its life).  But still
> confusing.  Namely, 2..4 vcpus per GUEST or HOST -- for
> the ore-Nehalem ones? :)

4 vcpus per guest would be fine (even more should work, depending on 
workload).  Host will scale with any number of cpus.

-- 
error compiling committee.c: too many arguments to function



* Re: slow guest performance with build load, looking for ideas
  2009-06-14  9:33 ` Avi Kivity
  2009-06-15 14:15   ` Erik Jacobson
@ 2009-06-18 23:07   ` Erik Jacobson
  2009-06-28 14:17     ` Avi Kivity
  2009-07-03 10:41   ` Matty
  2 siblings, 1 reply; 25+ messages in thread
From: Erik Jacobson @ 2009-06-18 23:07 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Erik Jacobson, kvm

Hello.  I'll top-post since the quoted text is just for reference.

Sorry the follow-up testing took so long.  We're very low on 5500/Nehalem
resources at the moment and I had to track down lots of stuff before
getting to the test.

I ran some tests on a 2-socket, 8-core system.  I wasn't pleased with the
results for a couple reasons.  One, the issue of it being twice as slow
as the host with no guest was still present.

However, in trying to make use of this system using Fedora 11, I ran into
several issues not directly related to virtualization.  So these test runs
have that grain of salt.  Example issues...
 * Node ordering is not sequential (i.e. /sys/devices/system/node/node0 and
   node2 exist, but no node1; see the quick check after this list).  This
   caused tools based on libvirt and friends to be unhappy.  I worked around
   it by running qemu-kvm by hand directly.  We found an LKML posting
   addressing this issue; I didn't check whether it has been merged yet.
 * All cores show up as being associated with the first node (node0) even
   though half should be associated with the 2nd node (still researching that
   some).
 * In some of the timing runs on this system, the "real time" reported by
   the time command was off by 10 to 11 times.  Issues were found in
   the messages file that seemed to relate to this including HUGE time
   adjustments by NTP and kernel hrtimer 'interrupt too slow' messages.
   This specific problem seems to be intermittent.
 * None of the above problems were observed in 8-core/2-socket non-5500/
   Nehalem systems.  Of course, 2-socket non-Nehalem systems do not have
   multiple nodes listed under /sys.
 * I lose access to the resource today but can try to beg and plead again
   some time next week if folks have ideas to try.  Let me know.
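
For reference, the quick check mentioned in the first item above is just
something like this (the ls output shows the layout described there; numactl
is optional and only gives a second view of the same thing):

 # ls /sys/devices/system/node/
 node0  node2
 # numactl --hardware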

So those are the grains of salt.  I've found that, when doing the timing by
hand instead of using the time command, the build time seems to be around
10 to 12 minutes.  I'm not sure how trustworthy the output of the time
command is in these trials.  In any event, that's still more than double
the time for the host alone with no guests.

System:
SGI XE270, 8-core, Xeon X5570 (Nehalem), Hyperthreading turned off
Supermicro model: X8DTN
Disk1: root disk 147GB ST3146855SS 15K 16MB cache SAS
Disk2: work area disk 500GB HDS725050KLA360  7200rpm 16MB cache SATA
Distro: Everything Fedora11+released updates
Memory: 8 GB in 2048 MB DDR3 1066 MHz 18JSF25672PY-1G1D1 DIMMs

Only Fedora11 was used (host and guest where applicable).
The first runs with the timing weirdness were done on an F11 guest with no
updates applied.  I later applied the updates and the timings seemed to get
worse, although I no longer trust the values.

F11+released updates has these versions:
kernel-2.6.29.4-167.fc11.x86_64
qemu-kvm-0.10.5-2.fc11.x86_64


The test, as before, was simply a timed kernel build.  The .config file has
plenty of modules configured:
time (make -j12 && make -j12 modules)



host only, no guest, baseline
-----------------------------
trial 1:
real	5m44.823s
user	28m45.725s
sys	5m46.633s

trial 2:
real	5m34.438s
user	28m14.347s
sys	5m41.597s


guest, 8 vcpu, 4096 mem, virtio, no cache param, disk device supplied in full
-----------------------------------------------------------------------------
trial 1:
real	125m5.995s
user	31m23.790s
sys	9m17.602s


trial 2 (changed to 7168 mb memory for the guest):
real	120m48.431s
user	14m38.967s
sys	6m12.437s


That's real strange...  The 'time' command is showing whacked out results.

I then watched a run by hand and counted it at about 10 minutes.  However,
this third run had the proper time!  So whatever the weirdness is, it doesn't
happen every time:

real	9m49.802s
user	24m46.009s
sys	8m10.349s

I decided this could be related to ntp running as I saw this in messages:
Jun 18 16:34:23 localhost ntpd[1916]: time reset -0.229209 s
Jun 18 16:34:23 localhost ntpd[1916]: kernel time sync status change 0001
Jun 18 16:40:17 localhost ntpd[1916]: synchronized to 128.162.244.1, stratum 2

and earlier:

Jun 18 16:19:09 localhost ntpd[1916]: synchronized to 128.162.244.1, stratum 2
Jun 18 16:19:09 localhost ntpd[1916]: time reset +6609.851122 s
Jun 18 16:23:39 localhost ntpd[1916]: synchronized to 128.162.244.1, stratum 2
Jun 18 16:24:04 localhost kernel: hrtimer: interrupt too slow, forcing clock min delta to 62725995 ns


I then installed all F11 updates in the guest and tried again (host had
updates all along).  I got these strange results, strange because of the
timing difference.  I didn't "watch a non-computer clock" for these.

Timing from that was:
trial 1:
real	16m10.337s
user	28m27.604s
sys	9m12.772s

trial 2:
real	11m45.934s
user	25m4.432s
sys	8m2.189s


Here is the qemu-kvm command line used.  The -m value was 4096 for the
first run and 7168 for the other runs.

# /usr/bin/qemu-kvm -M pc -m 4096 -smp 8 -name f11-test \
    -uuid b7b4b7e4-9c07-22aa-0c95-d5c8a24176c5 -monitor pty \
    -pidfile /var/run/libvirt/qemu//f11-test.pid \
    -drive file=/foo/f11/Fedora-11-x86_64-DVD.iso,if=virtio,media=cdrom,index=2 \
    -drive file=/var/lib/libvirt/images/f11-test.img,if=virtio,index=0,boot=on \
    -drive file=/dev/sdb,if=virtio,index=1 \
    -net nic,macaddr=54:52:00:46:48:0e,model=virtio -net user \
    -serial pty -parallel none -usb -usbdevice tablet -vnc cct201:1 \
    -soundhw es1370 -redir tcp:5555::22


/proc/cpuinfo is pasted after the test results.




# cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 26
model name	: Intel(R) Xeon(R) CPU           X5570  @ 2.93GHz
stepping	: 5
cpu MHz		: 1600.000
cache size	: 8192 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 4
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 11
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips	: 5865.69
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 26
model name	: Intel(R) Xeon(R) CPU           X5570  @ 2.93GHz
stepping	: 5
cpu MHz		: 1600.000
cache size	: 8192 KB
physical id	: 0
siblings	: 4
core id		: 1
cpu cores	: 4
apicid		: 2
initial apicid	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 11
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips	: 5865.76
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

processor	: 2
vendor_id	: GenuineIntel
cpu family	: 6
model		: 26
model name	: Intel(R) Xeon(R) CPU           X5570  @ 2.93GHz
stepping	: 5
cpu MHz		: 1600.000
cache size	: 8192 KB
physical id	: 0
siblings	: 4
core id		: 2
cpu cores	: 4
apicid		: 4
initial apicid	: 4
fpu		: yes
fpu_exception	: yes
cpuid level	: 11
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips	: 5823.99
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

processor	: 3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 26
model name	: Intel(R) Xeon(R) CPU           X5570  @ 2.93GHz
stepping	: 5
cpu MHz		: 1600.000
cache size	: 8192 KB
physical id	: 0
siblings	: 4
core id		: 3
cpu cores	: 4
apicid		: 6
initial apicid	: 6
fpu		: yes
fpu_exception	: yes
cpuid level	: 11
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips	: 5865.76
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

processor	: 4
vendor_id	: GenuineIntel
cpu family	: 6
model		: 26
model name	: Intel(R) Xeon(R) CPU           X5570  @ 2.93GHz
stepping	: 5
cpu MHz		: 1600.000
cache size	: 8192 KB
physical id	: 1
siblings	: 4
core id		: 0
cpu cores	: 4
apicid		: 16
initial apicid	: 16
fpu		: yes
fpu_exception	: yes
cpuid level	: 11
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips	: 5865.80
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

processor	: 5
vendor_id	: GenuineIntel
cpu family	: 6
model		: 26
model name	: Intel(R) Xeon(R) CPU           X5570  @ 2.93GHz
stepping	: 5
cpu MHz		: 1600.000
cache size	: 8192 KB
physical id	: 1
siblings	: 4
core id		: 1
cpu cores	: 4
apicid		: 18
initial apicid	: 18
fpu		: yes
fpu_exception	: yes
cpuid level	: 11
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips	: 5865.80
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

processor	: 6
vendor_id	: GenuineIntel
cpu family	: 6
model		: 26
model name	: Intel(R) Xeon(R) CPU           X5570  @ 2.93GHz
stepping	: 5
cpu MHz		: 1600.000
cache size	: 8192 KB
physical id	: 1
siblings	: 4
core id		: 2
cpu cores	: 4
apicid		: 20
initial apicid	: 20
fpu		: yes
fpu_exception	: yes
cpuid level	: 11
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips	: 5865.80
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

processor	: 7
vendor_id	: GenuineIntel
cpu family	: 6
model		: 26
model name	: Intel(R) Xeon(R) CPU           X5570  @ 2.93GHz
stepping	: 5
cpu MHz		: 1600.000
cache size	: 8192 KB
physical id	: 1
siblings	: 4
core id		: 3
cpu cores	: 4
apicid		: 22
initial apicid	: 22
fpu		: yes
fpu_exception	: yes
cpuid level	: 11
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips	: 5865.79
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:



On Sun, Jun 14, 2009 at 12:33:06PM +0300, Avi Kivity wrote:
> Erik Jacobson wrote:
>> We have been trying to test qemu-kvm virtual machines under an IO load.
>> The IO load is quite simple: A timed build of the linux kernel and modules.
>> I have found that virtual machines take more than twice as long to do this
>> build as the host.  It doesn't seem to matter if I use virtio or not,  Using
>> the same device and same filesystem, the host is more than twice as fast.
>>
>> We're hoping that we can get some advice on how to address this issue.  If
>> there are any options I should add for our testing, we'd appreciate it.  I'm
>> also game to try development bits to see if they make a difference.  If it
>> turns out "that is just the way it is right now", we'd like to know that
>> too.
>>
>> For these tests, I used Fedora 11 as the virtualization server.  I did this
>> because it has recent bits.  I experimented with SLES11 and Fedora11 guests.
>>
>> In general, I used virt-manager to do the setup and launching.  So the
>> qemu-kvm command lines are based on that (and this explains why they are
>> a bit long).  I then modified the qemu-kvm command line to perform other
>> variations of the test.  Example command lines can be found at the end of
>> this message.
>>
>> I performed tests on two different systems to be sure it isn't related to
>> specific hardware.
>>   
>
> What is the host cpu type?  On pre-Nehalem/Barcelona processors kvm has  
> poor scalability in mmu intensive workloads like kernel builds.
>
> -- 
> error compiling committee.c: too many arguments to function
-- 
Erik Jacobson - Linux System Software - SGI - Eagan, Minnesota


* Re: slow guest performance with build load, looking for ideas
  2009-06-18 23:07   ` Erik Jacobson
@ 2009-06-28 14:17     ` Avi Kivity
  2009-06-28 19:05       ` Erik Jacobson
  0 siblings, 1 reply; 25+ messages in thread
From: Avi Kivity @ 2009-06-28 14:17 UTC (permalink / raw)
  To: Erik Jacobson; +Cc: kvm

On 06/19/2009 02:07 AM, Erik Jacobson wrote:
> Hello.  I'll top-post since the quoted text is just for reference.
>
> Sorry the follow-up testing took so long.  We're very low on 5500/Nehalem
> resources at the moment and I had to track down lots of stuff before
> getting to the test.
>
> I ran some tests on a 2-socket, 8-core system.  I wasn't pleased with the
> results for a couple reasons.  One, the issue of it being twice as slow
> as the host with no guest was still present.
>
> However, in trying to make use of this system using Fedora 11, I ran in to
> several issues not directly related to virtualization.  So these test runs
> have that grain of salt.  Example issues...
>
>    

<snip>

>   * In some of the timing runs on this system, the "real time" reported by
>     the time command was off by 10 to 11 times.  Issues were found in
>     the messages file that seemed to relate to this including HUGE time
>     adjustments by NTP and kernel hrtimer 'interrupt too slow' messages.
>     This specific problem seems to be intermittent.
>    

This is on the host? It can easily ruin your day.

> So those are the grains of salt.  I've found that, when doing the timing by
> hand instead of using the time command, the build time seems to be around
> 10 to 12 minutes.  I'm not sure how trustworthy the output from the time
> command are in these trials.  In any event, that's still more than double
> for host alone with no guests.
>
> System:
> SGI XE270, 8-core, Xeon X5570 (Nehalem), Hyperthreading turned off
>    

Shoot, was about to blame hyperthreading.

> Test, as before, was simply this for a kernel build.  The .config file has
> plenty of modules configured.
> time (make -j12&&  make -j12 modules)
>
>
>
> host only, no guest, baseline
> -----------------------------
> trial 1:
> real	5m44.823s
> user	28m45.725s
> sys	5m46.633s
>
> trial 2:
> real	5m34.438s
> user	28m14.347s
> sys	5m41.597s
>
>
> guest, 8 vcpu, 4096 mem, virtio, no cache param, disk device supplied in full
> -----------------------------------------------------------------------------
> trial 1:
> real	125m5.995s
> user	31m23.790s
> sys	9m17.602s
>
>
> trial 2 (changed to 7168 mb memory for the guest):
> real	120m48.431s
> user	14m38.967s
> sys	6m12.437s
>
>
> That's real strange...  The 'time' command is showing whacked out results.
>
> I then watched a run by hand and counted it at about 10 minutes.  However,
> this third run had the proper time!  So whatever the weirdness is, it doesn't
> happen every time:
>
> real	9m49.802s
> user	24m46.009s
> sys	8m10.349s
>
> I decided this could be related to ntp running as I saw this in messages:
> Jun 18 16:34:23 localhost ntpd[1916]: time reset -0.229209 s
> Jun 18 16:34:23 localhost ntpd[1916]: kernel time sync status change 0001
> Jun 18 16:40:17 localhost ntpd[1916]: synchronized to 128.162.244.1, stratum 2
>
> and earlier:
>
> Jun 18 16:19:09 localhost ntpd[1916]: synchronized to 128.162.244.1, stratum 2
> Jun 18 16:19:09 localhost ntpd[1916]: time reset +6609.851122 s
> Jun 18 16:23:39 localhost ntpd[1916]: synchronized to 128.162.244.1, stratum 2
> Jun 18 16:24:04 localhost kernel: hrtimer: interrupt too slow, forcing clock min delta to 62725995 ns
>
>
> I then installed all F11 updates in the guest and tried again (host had
> updates all along).  I got these strange results, strange because of the
> timing difference.  I didn't "watch a non-computer clock" for these.
>    

kvm guests should have an accurate clock without ntp in the guest 
(/sys/.../current_clocksource should say 'kvmclock').
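
For example (the same check and value show up later in this thread):

 $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
 kvm-clock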


Can you post kvm_stat output during the run?

-- 
error compiling committee.c: too many arguments to function



* Re: slow guest performance with build load, looking for ideas
  2009-06-28 14:17     ` Avi Kivity
@ 2009-06-28 19:05       ` Erik Jacobson
  2009-06-28 21:28         ` Avi Kivity
  0 siblings, 1 reply; 25+ messages in thread
From: Erik Jacobson @ 2009-06-28 19:05 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Erik Jacobson, kvm

>>   * In some of the timing runs on this system, the "real time" reported by
>>     the time command was off by 10 to 11 times.  Issues were found in
>>     the messages file that seemed to relate to this including HUGE time
>>     adjustments by NTP and kernel hrtimer 'interrupt too slow' messages.
>>     This specific problem seems to be intermittent.
> This is on the host? It can easily ruin your day.

This was in the guest.

>> System:
>> SGI XE270, 8-core, Xeon X5570 (Nehalem), Hyperthreading turned off
> Shoot, was about to blame hyperthreading.

I'll keep it off for the next attempted run.

> kvm guests should have an accurate clock without ntp in the guest  
> (/sys/.../current_clocksource should say 'kvmclock').

OK thanks.

> Can you post kvm_stat output during the run?

Sure, I'll try to get time on the system again next week and post in
to the thread again.  We'll still have the issue with the non-sequential
nodes and incorrect representation of memory for this two-socket
Nehalem system.  I don't think that patch has made it into the kernel.

Thanks for replying back.

If you have any other things you'd suggest trying, I'm game to give it a
whirl.  Someone suggested trying to export a whole PCI device to the guest.
I won't be able to do that on this machine, maybe later when I have physical
access to the system.  Besides, that exercise might not poke at what I'm
interested in anyway.

Others suggested some potentially relevant settings, EPT (Extended Page
Tables) and VPID (Virtual Processor ID), but I don't see where these settings
are exposed (they aren't, for example, in this system's BIOS).

More to come then.  Thanks.

Erik


* Re: slow guest performance with build load, looking for ideas
  2009-06-28 19:05       ` Erik Jacobson
@ 2009-06-28 21:28         ` Avi Kivity
  2009-07-01 21:41           ` Erik Jacobson
  0 siblings, 1 reply; 25+ messages in thread
From: Avi Kivity @ 2009-06-28 21:28 UTC (permalink / raw)
  To: Erik Jacobson; +Cc: kvm

On 06/28/2009 10:05 PM, Erik Jacobson wrote:
>>>    * In some of the timing runs on this system, the "real time" reported by
>>>      the time command was off by 10 to 11 times.  Issues were found in
>>>      the messages file that seemed to relate to this including HUGE time
>>>      adjustments by NTP and kernel hrtimer 'interrupt too slow' messages.
>>>      This specific problem seems to be intermittent.
>>>        
>> This is on the host? It can easily ruin your day.
>>      
>
> This was in the guest.
>    

Ok.  Please keep ntp off in the guest and verify the guest says it uses 
kvmclock.


>> Can you post kvm_stat output during the run?
>>      
>
> Sure, I'll try to get time on the system again next week and post in
> to the thread again.  We'll still have the issue with the non-sequential
> nodes and incorrect representation of memory for this two-socket
> Nehalem system.  I don't think that patch has made it in to the kernel.
>
> Thanks for replying back.
>
> If you have any other things you'd suggest trying, I'm game to give it a
> whirl.  Someone suggested trying to export a whole PCI device to the guest.
> I won't be able to do that on this machine, maybe later when I have physical
> access to the system.  Besides, that exercise might not poke at what I'm
> interested in anyway.
>    


Shouldn't be needed.  It's sufficient to export an LVM volume with 
cache=none:

    -drive file=/dev/vg/lv,cache=none,if=virtio

> Others suggested some potential settings EPT (Extended Page Table) and
> VPID (Virtual Path Identifier?) but I don't see where these settings are
> exposed (they aren't, for example, in this system's BIOS).
>    

Look in /sys/module/kvm_intel/parameters; ept and vpid should default to
enabled.
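
For example, on a kernel that exposes them the check is just this (both
should read Y on EPT/VPID-capable hardware left at the default module
options):

 $ cat /sys/module/kvm_intel/parameters/ept
 Y
 $ cat /sys/module/kvm_intel/parameters/vpid
 Y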

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



* Re: slow guest performance with build load, looking for ideas
  2009-06-28 21:28         ` Avi Kivity
@ 2009-07-01 21:41           ` Erik Jacobson
  2009-07-02  5:48             ` Avi Kivity
  0 siblings, 1 reply; 25+ messages in thread
From: Erik Jacobson @ 2009-07-01 21:41 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Erik Jacobson, kvm

I wanted to post the latest test run to the thread.

Avi Kivity provided some ideas to try.  I had mixed luck.  I'd like to try
this again if we have any thoughts on the vpid/ept issue, or any other
ideas for drilling down on this.  Avi Kivity mentioned LVM in the thread.
I continued to just export the whole /dev/sdb to the guest. I'm happy to
try LVM in some form if we think it would help?

As indicated, I still had trouble locating information about ept and vpid
(see below).  Several Fedora11 packages were updated in both host and guest 
since the last run, so we're at current F11+updates.  I don't know enough
about some of these kvm settings to do much beyond what I'm told to try.

System hardware:
 * Same machines as used before, extensive system detail posted earlier in
   the thread.
 * Same Nehalem based XE270 system as before
 * Hyperthreading disabled
 * System was the same as before.  Host has 8 cores, 2 sockets, and is
   Nehalem.  (Intel(R) Xeon(R) CPU X5570  @ 2.93GHz)
 * root and workarea disks are nothing special; no LVM used.
 * 8gb host memory

System Settings:
 * chkconfig ntpd off
 * service ntpd stop
 * $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
kvm-clock
 * ensured kvm_stat was available on the host
 * I could NOT find vpid and ept parameters on the host.  They weren't here:
   /sys/module/kvm_intel/parameters
   nor here
   /sys/module/kvm/parameters
   So the check for those parameters resulted in no information.
   Didn't see them elsewhere either:
   # pwd
   /sys
   # find . -name vpid -print
   # find . -name ept -print

 * Version information:
   kernel host and guest: 2.6.29.5-191.fc11.x86_64
   kvm: qemu-kvm-0.10.5-3.fc11.x86_64, qemu-system-x86-0.10.5-3.fc11.x86_64

 * Build area disk is the whole /dev/sdb drive exported to the guest.  I did 
   not use LVM.

 * Root is a raw disk image, pre-allocated

 * Host and guest are fedora11 with all current updates applied.

 * 8 cpu, 4gb memory exported to guest.

 * All disks exported virtio


I had done some stuff to set up the test including a build I didn't count.

GUEST time (make -j12 && make -j12 modules), work area disk, no cache param
--------------------------------------------------------------------------
kvm_stat output BEFORE running this test:

kvm statistics

 efer_reload                 13       0
 exits                 27145076    1142
 fpu_reload             1298729       0
 halt_exits             2152011     189
 halt_wakeup             494689     123
 host_state_reload	4998646     837
 hypercalls                   0       0
 insn_emulation        10165593     302
 insn_emulation_fail          0       0
 invlpg                       0       0
 io_exits               2096834     643
 irq_exits              6469071       8
 irq_injections         4765189     190
 irq_window              279385       0
 largepages                   0       0
 mmio_exits                   0       0
 mmu_cache_miss           18670       0
 mmu_flooded                  0       0
 mmu_pde_zapped               0       0
 mmu_pte_updated              0       0
 mmu_pte_write            10440       0
 mmu_recycled                 0       0


qemu-kvm command:
/usr/bin/qemu-kvm -M pc -m 4096 -smp 8 -name f11-test \
  -uuid b7b4b7e4-9c07-22aa-0c95-d5c8a24176c5 -monitor pty \
  -pidfile /var/run/libvirt/qemu//f11-test.pid \
  -drive file=/var/lib/libvirt/images/f11-test.img,if=virtio,index=0,boot=on \
  -drive file=/dev/sdb,if=virtio,index=1 \
  -net nic,macaddr=54:52:00:46:48:0e,model=virtio -net user \
  -serial pty -parallel none -usb -usbdevice tablet -vnc cct201:1 \
  -soundhw es1370 -redir tcp:5555::22

test run timing:
real	12m36.165s
user	27m28.976s
sys	8m32.245s


kvm_stat output after this test run
kvm statistics

 efer_reload                 13       0
 exits                 47097981    2003
 fpu_reload             2168308       0
 halt_exits             3378761     301
 halt_wakeup             707171     241
 host_state_reload	7545990    1538
 hypercalls                   0       0
 insn_emulation        17809066     462
 insn_emulation_fail          0       0
 invlpg                       0       0
 io_exits               2801221    1232
 irq_exits             11959063       7
 irq_injections         8395980     304
 irq_window              531641       3
 largepages                   0       0
 mmio_exits                   0       0
 mmu_cache_miss           28419       0
 mmu_flooded                  0       0
 mmu_pde_zapped               0       0
 mmu_pte_updated              0       0
 mmu_pte_write            10440       0
 mmu_recycled              7193       0





GUEST time (make -j12 && make -j12 modules), work area disk, cache=none
-----------------------------------------------------------------------
qemu-kvm command:
/usr/bin/qemu-kvm -M pc -m 4096 -smp 8 -name f11-test \
  -uuid b7b4b7e4-9c07-22aa-0c95-d5c8a24176c5 -monitor pty \
  -pidfile /var/run/libvirt/qemu//f11-test.pid \
  -drive file=/var/lib/libvirt/images/f11-test.img,if=virtio,index=0,boot=on \
  -drive file=/dev/sdb,if=virtio,index=1,cache=none \
  -net nic,macaddr=54:52:00:46:48:0e,model=virtio -net user \
  -serial pty -parallel none -usb -usbdevice tablet -vnc cct201:1 \
  -soundhw es1370 -redir tcp:5555::22

kvm_stat output BEFORE running this test:
kvm statistics

 efer_reload                 13       0
 exits                  1384042    1097
 fpu_reload              168399       0
 halt_exits              142790     179
 halt_wakeup              60526     123
 host_state_reload	 561970     827
 hypercalls                   0       0
 insn_emulation          553390     269
 insn_emulation_fail          0       0
 invlpg                       0       0
 io_exits                390616     643
 irq_exits                42529       5
 irq_injections          165009     180
 irq_window                8618       1
 largepages                   0       0
 mmio_exits                   0       0
 mmu_cache_miss            1664       0
 mmu_flooded                  0       0
 mmu_pde_zapped               0       0
 mmu_pte_updated              0       0
 mmu_pte_write             5000       0
 mmu_recycled                 0       0


test run timing:

real	11m25.250s
user	29m43.453s
sys	8m49.591s


kvm_stat output after this test run

kvm statistics

 efer_reload                 13       0
 exits                 23320965    1095
 fpu_reload             1354712       0
 halt_exits             1510807     180
 halt_wakeup             385195     123
 host_state_reload	3541625     823
 hypercalls                   0       0
 insn_emulation         9056031     272
 insn_emulation_fail          0       0
 invlpg                       0       0
 io_exits               1318308     643
 irq_exits              5553479       0
 irq_injections         4108969     180
 irq_window              279961       0
 largepages                   0       0
 mmio_exits                   0       0
 mmu_cache_miss           14225       0
 mmu_flooded                  0       0
 mmu_pde_zapped               0       0
 mmu_pte_updated              0       0
 mmu_pte_write             5000       0
 mmu_recycled                 0       0



HOST time (make -j12 && make -j12 modules) with no guest running
----------------------------------------------------------------
real	6m50.936s
user	29m12.051s
sys	5m50.867s



* Re: slow guest performance with build load, looking for ideas
  2009-07-01 21:41           ` Erik Jacobson
@ 2009-07-02  5:48             ` Avi Kivity
  2009-07-02  9:41               ` Avi Kivity
  0 siblings, 1 reply; 25+ messages in thread
From: Avi Kivity @ 2009-07-02  5:48 UTC (permalink / raw)
  To: Erik Jacobson; +Cc: kvm

On 07/02/2009 12:41 AM, Erik Jacobson wrote:
> I wanted to post in to the thread the lastest test run.
>
> Avi Kivity provided some ideas to try.  I had mixed luck.  I'd like to try
> this again if we have any thoughts on the vpid/ept issue, or any other
> ideas for drilling down on this.  Avi Kivity mentioned LVM in the thread.
> I continued to just export the whole /dev/sdb to the guest. I'm happy to
> try LVM in some form if we think it would help?
>    

Exporting an entire drive is even better than LVM (in terms of 
performance; flexibility obviously suffers).  Just make sure to use 
cache=none (which I see in your command line below).

>   * I could NOT find vpid and ept parameters on the host.  They weren't here:
>     /sys/module/kvm_intel/parameters
>     nor here
>     /sys/module/kvm/parameters
>     So the check for those parameters resulted in no information.
>     Didn't see them elsewhere either:
>     # pwd
>     /sys
>     # find . -name vpid -print
>     # find . -name ept -print
>
>    

Apparently the parameters are only exposed in 2.6.30 and later.  Previously
they could only be set at modprobe time.  Since you're using Nehalem, let's
assume they're set correctly (since that's the default).
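
If they ever did need to be forced by hand, it would be at module load time,
along these lines (an illustration only; the defaults should already be right
on this hardware, and unloading kvm_intel requires all guests to be shut down):

 # modprobe -r kvm_intel
 # modprobe kvm_intel ept=1 vpid=1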

>
> I had done some stuff to set up the test including a build I didn't count.
>
> GUEST time (make -j12&&   make -j12 modules), work area disk no cache param
> --------------------------------------------------------------------------
> kvm_stat output BEFORE running this test:
>
> kvm statistics
>
>   efer_reload                 13       0
>   exits                 27145076    1142
>   fpu_reload             1298729       0
>   halt_exits             2152011     189
>   halt_wakeup             494689     123
>   host_state_reload	4998646     837
>   hypercalls                   0       0
>   insn_emulation        10165593     302
>   insn_emulation_fail          0       0
>   invlpg                       0       0
>   io_exits               2096834     643
>   irq_exits              6469071       8
>   irq_injections         4765189     190
>   irq_window              279385       0
>   largepages                   0       0
>   mmio_exits                   0       0
>   mmu_cache_miss           18670       0
>   mmu_flooded                  0       0
>   mmu_pde_zapped               0       0
>   mmu_pte_updated              0       0
>   mmu_pte_write            10440       0
>   mmu_recycled                 0       0
>    

Nice and quiet.

> qemu-kvm command:
> /usr/bin/qemu-kvm -M pc -m 4096 -smp 8 -name f11-test -uuid b7b4b7e4-9c07-22aa-0c95-d5c8a24176c5 -monitor pty -pidfile /var/run/libvirt/qemu//f11-test.pid -drive file=/var/lib/libvirt/images/f11-test.img,if=virtio,index=0,boot=on -drive file=/dev/sdb,if=virtio,index=1 -net nic,macaddr=54:52:00:46:48:0e,model=virtio -net user -serial pty -parallel none -usb -usbdevice tablet -vnc cct201:1 -soundhw es1370 -redir tcp:5555::22
>    

-usbdevice tablet is known to cause large interrupt loads.  I suggest 
dropping it.  If it helps your vnc session, drop your vnc client and use 
vinagre instead.

> test run timing:
> real	12m36.165s
> user	27m28.976s
> sys	8m32.245s
>    

12 minutes real vs 35 cpu minutes -> scaling only 3:1 on smp 8.

>
> kvm_stat output after this test run
> kvm statistics
>
>   efer_reload                 13       0
>   exits                 47097981    2003
>   fpu_reload             2168308       0
>   halt_exits             3378761     301
>   halt_wakeup             707171     241
>   host_state_reload	7545990    1538
>   hypercalls                   0       0
>   insn_emulation        17809066     462
>   insn_emulation_fail          0       0
>   invlpg                       0       0
>   io_exits               2801221    1232
>   irq_exits             11959063       7
>   irq_injections         8395980     304
>   irq_window              531641       3
>   largepages                   0       0
>   mmio_exits                   0       0
>   mmu_cache_miss           28419       0
>   mmu_flooded                  0       0
>   mmu_pde_zapped               0       0
>   mmu_pte_updated              0       0
>   mmu_pte_write            10440       0
>   mmu_recycled              7193       0
>
>    

Nice and quiet too, but what's needed is kvm_stat (or kvm_stat -1)
during the run.  Many of the 47M exits are unaccounted for; there's a
gap in the stats gathering code.

vmstat 1 on host and guest during the run would also help.
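
Something like this in a spare terminal while the build is running (kvm_stat
on the host; vmstat in both host and guest):

 # kvm_stat -1        # one-shot batch of counters, as suggested above
 $ vmstat 1           # per-second cpu/memory/io summary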

> HOST time (make -j12&&   make -j12 modules) with no guest running
> ----------------------------------------------------------------
> real	6m50.936s
> user	29m12.051s
> sys	5m50.867s
>
>    

35 minutes cpu run on 7 minutes real time, so scaling 1:7.  User time 
almost the same, system time different but not enough to account for the 
large difference in run time.

I'm due to get my own Nehalem next week; I'll try to reproduce your
results here.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



* Re: slow guest performance with build load, looking for ideas
  2009-07-02  5:48             ` Avi Kivity
@ 2009-07-02  9:41               ` Avi Kivity
  2009-07-03 15:43                 ` Mark McLoughlin
  0 siblings, 1 reply; 25+ messages in thread
From: Avi Kivity @ 2009-07-02  9:41 UTC (permalink / raw)
  To: Erik Jacobson; +Cc: kvm

On 07/02/2009 08:48 AM, Avi Kivity wrote:
>> HOST time (make -j12&&   make -j12 modules) with no guest running
>> ----------------------------------------------------------------
>> real    6m50.936s
>> user    29m12.051s
>> sys    5m50.867s
>>
>
> 35 minutes cpu run on 7 minutes real time, so scaling 1:7.  User time 
> almost the same, system time different but not enough to account for 
> the large difference in run time.
>
> I'm due to get my own Nehalem next week, I'll try to reproduce your 
> results here.
>

I reproduced this on a 2x4 Barcelona; I get 6.6x scaling in the guest
compared to your 7.2x on the host.  This is with a kvm.git host kernel.
One thing that changed is an improvement in cfq with multiple threads;
try setting the host io scheduler for /dev/sdb to deadline (together
with dropping -usbdevice tablet).
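
For example, on the host, with sdb being the exported work-area disk (the
bracketed entry in the first command's output is the current choice; cfq is
the distro default):

 # cat /sys/block/sdb/queue/scheduler
 noop anticipatory deadline [cfq]
 # echo deadline > /sys/block/sdb/queue/scheduler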


-- 
error compiling committee.c: too many arguments to function



* Re: slow guest performance with build load, looking for ideas
  2009-06-14  9:33 ` Avi Kivity
  2009-06-15 14:15   ` Erik Jacobson
  2009-06-18 23:07   ` Erik Jacobson
@ 2009-07-03 10:41   ` Matty
  2009-07-05  8:07     ` Avi Kivity
  2 siblings, 1 reply; 25+ messages in thread
From: Matty @ 2009-07-03 10:41 UTC (permalink / raw)
  To: kvm

On Sun, Jun 14, 2009 at 5:33 AM, Avi Kivity<avi@redhat.com> wrote:
>> I performed tests on two different systems to be sure it isn't related to
>> specific hardware.
>>
>
> What is the host cpu type?  On pre-Nehalem/Barcelona processors kvm has poor
> scalability in mmu intensive workloads like kernel builds.

Hey Avi,

Are there plans to address these scalability issues for
pre-Nehalem/Barcelona processors? I poked around the KVM website, and
I don't see anything related to this.

Thanks,
- Ryan
--
http://prefetch.net


* Re: slow guest performance with build load, looking for ideas
  2009-07-02  9:41               ` Avi Kivity
@ 2009-07-03 15:43                 ` Mark McLoughlin
  2009-07-03 16:28                   ` Erik Jacobson
  2009-07-09  2:36                   ` Erik Jacobson
  0 siblings, 2 replies; 25+ messages in thread
From: Mark McLoughlin @ 2009-07-03 15:43 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Erik Jacobson, kvm

On Thu, 2009-07-02 at 12:41 +0300, Avi Kivity wrote:
> On 07/02/2009 08:48 AM, Avi Kivity wrote:
> >> HOST time (make -j12&&   make -j12 modules) with no guest running
> >> ----------------------------------------------------------------
> >> real    6m50.936s
> >> user    29m12.051s
> >> sys    5m50.867s
> >>
> >
> > 35 minutes cpu run on 7 minutes real time, so scaling 1:7.  User time 
> > almost the same, system time different but not enough to account for 
> > the large difference in run time.
> >
> > I'm due to get my own Nehalem next week, I'll try to reproduce your 
> > results here.
> >
> 
> I reproduced this on a 2x4 barcelona, I get 6.6x scaling on the guest 
> compared to your 7.2x on the host.  This is with a kvm.git host kernel.  
> Once thing that changed is an improvement in cfq with multiple threads; 
> try setting the host io scheduler for /dev/sdb to deadline (together 
> with dropping -usbdevice tablet).

Haven't followed the thread in great detail, but has anyone tried
putting the virtio disk back into rotational mode?
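
That would be something like this inside the guest (vdb here is only a guess
at the name the virtio work-area disk gets):

 # echo 1 > /sys/block/vdb/queue/rotational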

See also:

  https://bugzilla.redhat.com/509383

and:

  http://lkml.org/lkml/2008/10/27/84

Cheers,
Mark.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: slow guest performance with build load, looking for ideas
  2009-07-03 15:43                 ` Mark McLoughlin
@ 2009-07-03 16:28                   ` Erik Jacobson
  2009-07-09  2:36                   ` Erik Jacobson
  1 sibling, 0 replies; 25+ messages in thread
From: Erik Jacobson @ 2009-07-03 16:28 UTC (permalink / raw)
  To: Mark McLoughlin; +Cc: Avi Kivity, Erik Jacobson, kvm

> Haven't followed the thread in great detail, but has anyone tried
> putting the virtio disk back into rotational mode?

Thanks Mark.

I have not tried this yet.

To be honest, I didn't fully understand some of Avi's last comments and
was waiting for one of my co-workers to be available to help me parse them.

I plan to perform Avi's suggestions, plus this rotational idea, early next
week.  I'll update the BZs too.

By the way, a request to work-order a system for a few weeks to play with this
stuff was approved.  This means I'll have easier access to a nicely configured 
multi-socket Nehalem system for a while, starting perhaps a week or two from
now.  At this moment, I have to beg for time slices.

Even before I get temporary access to that box, I'm happy to run tests
for people who don't have access to this hardware right now.  I'll have to be
a go-between though as my idea to put it outside the firewall didn't quite
work out.  Really, I'm looking to encourage scalability here that will
help big NUMA systems do virtualization well.  I'm not a big kernel hacker
like Jes (I just dabble) but hopefully I can find my own way to contribute.

Thanks again.

> 
> See also:
> 
>   https://bugzilla.redhat.com/509383
> 
> and:
> 
>   http://lkml.org/lkml/2008/10/27/84
> 
> Cheers,
> Mark.
> 
-- 
Erik Jacobson - Linux System Software - SGI - Eagan, Minnesota

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: slow guest performance with build load, looking for ideas
  2009-07-03 10:41   ` Matty
@ 2009-07-05  8:07     ` Avi Kivity
  0 siblings, 0 replies; 25+ messages in thread
From: Avi Kivity @ 2009-07-05  8:07 UTC (permalink / raw)
  To: Matty; +Cc: kvm

On 07/03/2009 01:41 PM, Matty wrote:
>> What is the host cpu type?  On pre-Nehalem/Barcelona processors kvm has poor
>> scalability in mmu intensive workloads like kernel builds.
>>      
>
> Hey Avi,
>
> Are there plans to address these scalability issues for
> pre-Nehalem/Barcelona processors? I poked around the KVM website, and
> I don't see anything related to this.
>    

Not really.  While it is possible to scale the kvm mmu for these older 
processors, I don't think it's really worthwhile in terms of effort and 
complexity vs. return.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: slow guest performance with build load, looking for ideas
  2009-07-03 15:43                 ` Mark McLoughlin
  2009-07-03 16:28                   ` Erik Jacobson
@ 2009-07-09  2:36                   ` Erik Jacobson
  2009-07-09  7:48                     ` Avi Kivity
  1 sibling, 1 reply; 25+ messages in thread
From: Erik Jacobson @ 2009-07-09  2:36 UTC (permalink / raw)
  To: Mark McLoughlin; +Cc: Avi Kivity, Erik Jacobson, kvm

> Haven't followed the thread in great detail, but has anyone tried
> putting the virtio disk back into rotational mode?

Hello.  I haven't had a chance to try all the suggestions in the thread
so far.  However, I did just run some tests with block queue rotation
settings tonight.

For the problem where mkfs.ext3 was run on a virtio disk image (raw image,
not pre-allocated), the timing went from about 27 minutes for a 10gb fs down
to just over 2 minutes.  So that was a huge difference.
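
For reference, that measurement was just a timed mkfs of the image-backed
virtio disk from inside the guest, along the lines of the following (the
/dev/vdc device name here is only illustrative):

  time mkfs.ext3 /dev/vdc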

For the linux kernel build test, there was a difference, but less dramatic.
I'd say the performance is still below what I expected.  I haven't given
up yet, there is more to try in the thread.  I just wanted to post these
results.

The HW in use is the same as in the rest of the thread, including the
virtio disk type.  Since my last post, Fedora 11 updates have been
applied to the system.

Timing with the rotational stuff set to 1...

real    14m13.015s
user    29m42.162s
sys     8m37.416s

To confirm this was really better, I halted the virtual machine and restarted
it without setting the rotational values to 1.  I got this timing:

real    16m50.829s
user    29m33.933s
sys     9m4.905s


And finally, to confirm the numbers on the host with no guest running...
The same disk/filesystem, now mounted on the host instead of the guest, gave
this timing:

real    6m13.398s
user    26m56.061s
sys     5m34.477s


For the guest runs, the qemu command was as follows.  For later tests, I will
combine the queue rotation setting with the other suggestions.

/usr/bin/qemu-kvm -M pc -m 4096 -smp 8 -name f11-test -uuid b7b4b7e4-9c07-22aa-0c95-d5c8a24176c5 -monitor pty -pidfile /var/run/libvirt/qemu//f11-test.pid -drive file=/var/lib/libvirt/images/f11-test.img,if=virtio,index=0,boot=on -drive file=/dev/sdb,if=virtio,index=1 -drive file=/var/lib/libvirt/images/test.img,if=virtio,index=2 -net nic,macaddr=54:52:00:46:48:0e,model=virtio -net user -serial pty -parallel none -usb -usbdevice tablet -vnc cct201:1 -soundhw es1370 -redir tcp:5555::22


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: slow guest performance with build load, looking for ideas
  2009-07-09  2:36                   ` Erik Jacobson
@ 2009-07-09  7:48                     ` Avi Kivity
  2009-07-09 18:01                       ` Erik Jacobson
  0 siblings, 1 reply; 25+ messages in thread
From: Avi Kivity @ 2009-07-09  7:48 UTC (permalink / raw)
  To: Erik Jacobson; +Cc: Mark McLoughlin, kvm

On 07/09/2009 05:36 AM, Erik Jacobson wrote:
>> Haven't followed the thread in great detail, but has anyone tried
>> putting the virtio disk back into rotational mode?
>>      
>
> Hello.  I haven't had a chance to try all the suggestions in the thread
> so far.  However, I did just run some tests with block queue rotation
> settings tonight.
>
> For the problem where mkfs.ext3 was run on a virtio disk image (raw image,
> not pre-allocated), the timing went from about 27 minutes for a 10gb fs down
> to just over 2 minutes.  So that was a huge difference.
>
> For the linux kernel build test, there was a difference, but less dramatic.
> I'd say the performance is still below what I expected.  I haven't given
> up yet, there is more to try in the thread.  I just wanted to post these
> results.
>
> The HW in use is the same as in the rest of the thread, including the
> virtio disk type.  Since my last post, Fedora 11 updates have been
> applied to the system.
>
> Timing with the rotational stuff set to 1...
>
> real    14m13.015s
> user    29m42.162s
> sys     8m37.416s
>    

(user + sys) / real = 2.7

> And finally, to confirm the numbers on the host with no guest running...
> The same disk/filesystem, now mounted on the host instead of the guest, gave
> this timing:
>
> real    6m13.398s
> user    26m56.061s
> sys     5m34.477s
>    

(user + sys) / real = 5.2

I got 6.something in a guest!
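
Spelling out the arithmetic behind those two ratios, using the whole-second
values of the timings quoted above:

  guest, rotational=1:  (29m42s + 8m37s) / 14m13s  =  2299s / 853s  ~= 2.7
  host, no guest:       (26m56s + 5m34s) /  6m13s  =  1950s / 373s  ~= 5.2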

> For the guest runs, the qemu command was as follows.  For later tests, I will
> combine the queue rotation setting with the other suggestions.
>
> /usr/bin/qemu-kvm -M pc -m 4096 -smp 8 -name f11-test -uuid b7b4b7e4-9c07-22aa-0c95-d5c8a24176c5 -monitor pty -pidfile /var/run/libvirt/qemu//f11-test.pid -drive file=/var/lib/libvirt/images/f11-test.img,if=virtio,index=0,boot=on -drive file=/dev/sdb,if=virtio,index=1 -drive file=/var/lib/libvirt/images/test.img,if=virtio,index=2 -net nic,macaddr=54:52:00:46:48:0e,model=virtio -net user -serial pty -parallel none -usb -usbdevice tablet -vnc cct201:1 -soundhw es1370 -redir tcp:5555::22
>    

Please drop -usbdevice tablet and set the host I/O scheduler to 
deadline.  Add cache=none to the -drive options.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: slow guest performance with build load, looking for ideas
  2009-07-09  7:48                     ` Avi Kivity
@ 2009-07-09 18:01                       ` Erik Jacobson
  2009-07-12  8:38                         ` Avi Kivity
  0 siblings, 1 reply; 25+ messages in thread
From: Erik Jacobson @ 2009-07-09 18:01 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Erik Jacobson, Mark McLoughlin, kvm, Jes Sorensen

>> Timing with the rotational stuff set to 1...
>>
>> real    14m13.015s
>> user    29m42.162s
>> sys     8m37.416s
>
> (user + sys) / real = 2.7
>
>> And finally, to confirm the numbers on the host with no guest running...
>> The same disk/filesystem, now mounted on the host instead of the guest, gave
>> this timing:
>>
>> real    6m13.398s
>> user    26m56.061s
>> sys     5m34.477s
>>    
>
> (user + sys) / real = 5.2
>
> I got 6.something in a guest!

> Please drop -usbdevice tablet and set the host I/O scheduler to  
> deadline.  Add cache=none to the -drive options.

yes, these changes make a difference.


Before starting qemu-kvm, I did this to change the IO scheduler:
BEFORE:
# for f in /sys/block/sd*/queue/scheduler; do cat $f; done
noop anticipatory deadline [cfq] 
noop anticipatory deadline [cfq] 

SET:
# for f in /sys/block/sd*/queue/scheduler; do echo "deadline" > $f; done

CONFIRM:
# for f in /sys/block/sd*/queue/scheduler; do cat $f; done
noop anticipatory [deadline] cfq 
noop anticipatory [deadline] cfq 


qemu command line.  Note that usbtablet is off and cache=none is used in
drive options:

/usr/bin/qemu-kvm -M pc -m 4096 -smp 8 -name f11-test -uuid b7b4b7e4-9c07-22aa-0c95-d5c8a24176c5 -monitor pty -pidfile /var/run/libvirt/qemu//f11-test.pid -drive file=/var/lib/libvirt/images/f11-test.img,if=virtio,index=0,boot=on,cache=none -drive file=/dev/sdb,if=virtio,index=1,cache=none -drive file=/var/lib/libvirt/images/test.img,if=virtio,index=2,cache=none -net nic,macaddr=54:52:00:46:48:0e,model=virtio -net user -serial pty -parallel none -usb -vnc cct201:1 -soundhw es1370 -redir tcp:5555::22


# rotation enabled this way in the guest, once the guest was started:
for f in /sys/block/vd*/queue/rotational; do echo 1 > $f; done

Test runs after make clean...
time (make -j12 && make -j12 modules)

real	10m25.585s
user	26m36.450s
sys	8m14.776s

2nd trial (make clean followed by the same test again):
real	9m21.626s
user	26m42.144s
sys	8m14.532s


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: slow guest performance with build load, looking for ideas
  2009-07-09 18:01                       ` Erik Jacobson
@ 2009-07-12  8:38                         ` Avi Kivity
  2009-07-16 12:20                           ` Jes Sorensen
  0 siblings, 1 reply; 25+ messages in thread
From: Avi Kivity @ 2009-07-12  8:38 UTC (permalink / raw)
  To: Erik Jacobson; +Cc: Mark McLoughlin, kvm, Jes Sorensen

On 07/09/2009 09:01 PM, Erik Jacobson wrote:
>
>> Please drop -usbdevice tablet and set the host I/O scheduler to
>> deadline.  Add cache=none to the -drive options.
>>      
>
> yes, these changes make a difference.
>
>
> Before starting qemu-kvm, I did this to change the IO scheduler:
> BEFORE:
> # for f in /sys/block/sd*/queue/scheduler; do cat $f; done
> noop anticipatory deadline [cfq]
> noop anticipatory deadline [cfq]
>
> SET:
> # for f in /sys/block/sd*/queue/scheduler; do echo "deadline" > $f; done
>
> CONFIRM:
> # for f in /sys/block/sd*/queue/scheduler; do cat $f; done
> noop anticipatory [deadline] cfq
> noop anticipatory [deadline] cfq
>
>
> qemu command line.  Note that usbtablet is off and cache=none is used in
> drive options:
>
> /usr/bin/qemu-kvm -M pc -m 4096 -smp 8 -name f11-test -uuid b7b4b7e4-9c07-22aa-0c95-d5c8a24176c5 -monitor pty -pidfile /var/run/libvirt/qemu//f11-test.pid -drive file=/var/lib/libvirt/images/f11-test.img,if=virtio,index=0,boot=on,cache=none -drive file=/dev/sdb,if=virtio,index=1,cache=none -drive file=/var/lib/libvirt/images/test.img,if=virtio,index=2,cache=none -net nic,macaddr=54:52:00:46:48:0e,model=virtio -net user -serial pty -parallel none -usb -vnc cct201:1 -soundhw es1370 -redir tcp:5555::22
>
>
> # rotation enabled this way in the guest, once the guest was started:
> for f in /sys/block/vd*/queue/rotational; do echo 1 > $f; done
>
> Test runs after make clean...
> time (make -j12 && make -j12 modules)
>
> real	10m25.585s
> user	26m36.450s
> sys	8m14.776s
>
> 2nd trial (make clean followed by the same test again):
> real	9m21.626s
> user	26m42.144s
> sys	8m14.532s
>
>    

That's a scaling of 3.7, still pretty far from the host and even farther 
than my results.

Is the numa factor of this machine larger than usual?
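
One quick way to check, in case it helps (numactl is assumed to be
installed on the host; the sysfs files are there regardless):

  numactl --hardware                           # node sizes and distance table
  cat /sys/devices/system/node/node*/distance  # raw SLIT distances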

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: slow guest performance with build load, looking for ideas
  2009-07-12  8:38                         ` Avi Kivity
@ 2009-07-16 12:20                           ` Jes Sorensen
  2009-07-25 14:33                             ` Avi Kivity
  0 siblings, 1 reply; 25+ messages in thread
From: Jes Sorensen @ 2009-07-16 12:20 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Erik Jacobson, Mark McLoughlin, kvm

On 07/12/2009 10:38 AM, Avi Kivity wrote:
> On 07/09/2009 09:01 PM, Erik Jacobson wrote:
>> Test runs after make clean...
>> time (make -j12 && make -j12 modules)
>>
>> real 10m25.585s
>> user 26m36.450s
>> sys 8m14.776s
>>
>> 2nd trial (make clean followed by the same test again):
>> real 9m21.626s
>> user 26m42.144s
>> sys 8m14.532s
>
> That's a scaling of 3.7, still pretty far from the host and even farther
> than my results.
>
> Is the numa factor of this machine larger than usual?

I didn't see a reply to this one, so I will just add what I know.

I believe Erik ran the tests on what we sell as an XE270 system. It's
really just a standard Intel or Supermicro motherboard in a box that
has been painted purple (or blue/green now I guess), so it really
shouldn't have extra numa factors compared to other Nehalem systems.

Cheers,
Jes


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: slow guest performance with build load, looking for ideas
  2009-07-16 12:20                           ` Jes Sorensen
@ 2009-07-25 14:33                             ` Avi Kivity
  0 siblings, 0 replies; 25+ messages in thread
From: Avi Kivity @ 2009-07-25 14:33 UTC (permalink / raw)
  To: Jes Sorensen; +Cc: Erik Jacobson, Mark McLoughlin, kvm

On 07/16/2009 03:20 PM, Jes Sorensen wrote:
> It's really just a standard Intel or Supermicro motherboard in a box that
> has been painted purple (or blue/green now I guess), so it really
> shouldn't have extra numa factors compared to other Nehalem systems.

Have you been transferred to marketing?

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2009-07-25 14:34 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-06-12 21:04 slow guest performance with build load, looking for ideas Erik Jacobson
2009-06-14  9:33 ` Avi Kivity
2009-06-15 14:15   ` Erik Jacobson
2009-06-15 14:24     ` Avi Kivity
2009-06-15 15:25       ` Michael Tokarev
2009-06-15 15:27         ` Avi Kivity
2009-06-16  7:03           ` Michael Tokarev
2009-06-16  8:07             ` Avi Kivity
2009-06-18 23:07   ` Erik Jacobson
2009-06-28 14:17     ` Avi Kivity
2009-06-28 19:05       ` Erik Jacobson
2009-06-28 21:28         ` Avi Kivity
2009-07-01 21:41           ` Erik Jacobson
2009-07-02  5:48             ` Avi Kivity
2009-07-02  9:41               ` Avi Kivity
2009-07-03 15:43                 ` Mark McLoughlin
2009-07-03 16:28                   ` Erik Jacobson
2009-07-09  2:36                   ` Erik Jacobson
2009-07-09  7:48                     ` Avi Kivity
2009-07-09 18:01                       ` Erik Jacobson
2009-07-12  8:38                         ` Avi Kivity
2009-07-16 12:20                           ` Jes Sorensen
2009-07-25 14:33                             ` Avi Kivity
2009-07-03 10:41   ` Matty
2009-07-05  8:07     ` Avi Kivity
