* KVM performance vs. Xen
@ 2009-04-29 14:41 Andrew Theurer
  2009-04-29 15:20 ` Nakajima, Jun
  2009-04-30  8:56 ` Avi Kivity
  0 siblings, 2 replies; 21+ messages in thread
From: Andrew Theurer @ 2009-04-29 14:41 UTC (permalink / raw)
  To: kvm-devel

I wanted to share some performance data for KVM and Xen.  I thought it
would be interesting to share some performance results especially
compared to Xen, using a more complex situation like heterogeneous
server consolidation.

The Workload:
The workload is one that simulates a consolidation of servers on to a
single host.  There are 3 server types: web, imap, and app (j2ee).  In
addition, there are other "helper" servers which are also consolidated:
a db server, which helps out with the app server, and an nfs server,
which helps out with the web server (a portion of the docroot is nfs
mounted).  There is also one other server that is simply idle.  All 6
servers make up one set.  The first 3 server types are sent requests,
which in turn may send requests to the db and nfs helper servers.  The
request rate is throttled to produce a fixed amount of work.  In order
to increase utilization on the host, more sets of these servers are
used.  The clients which send requests also have a response time
requirement which is monitored.  The following results have passed the
response time requirements.

The host hardware:
A 2 socket, 8 core Nehalem with SMT, and EPT enabled, lots of disks, 4 x
1 Gb Ethernet

The host software:
Both Xen and KVM use the same host Linux OS, SLES11.  KVM uses the
2.6.27.19-5-default kernel and Xen uses the 2.6.27.19-5-xen kernel.  I
have tried 2.6.29 for KVM, but results are actually worse.  KVM modules
are rebuilt with kvm-85.  Qemu is also from kvm-85.  Xen version is
"3.3.1_18546_12-3.1".

The guest software:
All guests are RedHat 5.3.  The same disk images are used but different
kernels. Xen uses the RedHat Xen kernel and KVM uses 2.6.29 with all
paravirt build options enabled.  Both use PV I/O drivers.  Software
used: Apache, PHP, Java, Glassfish, Postgresql, and Dovecot.

Hypervisor configurations:
Xen guests use "phy:" for disks
KVM guests use "-drive" for disks with cache=none
KVM guests are backed with large pages
Memory and CPU sizings are different for each guest type, but a
particular guest's sizings are the same for Xen and KVM
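
For reference, a KVM guest invocation consistent with the settings above
would look roughly like this.  This is only a sketch -- the device path,
memory/CPU sizes, and hugetlbfs mount point are placeholders, not the
actual per-guest values used:

 # qemu-system-x86_64 -m 2048 -smp 2 \
       -drive file=/dev/vg0/web1,if=virtio,cache=none \
       -net nic,model=virtio -net tap \
       -mem-path /hugepages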

The test run configuration:
There are 4 sets of servers used, so that's 24 guests total (4 idle
ones, 20 active ones).

Test Results:
The throughput is equal in these tests, as the clients throttle the work
(this is assuming you don't run out of a resource on the host).  What's
telling is the CPU used to do the same amount of work:

Xen:  52.85%
KVM:  66.93%

So, KVM requires 66.93/52.85 = 26.6% more CPU to do the same amount of
work. Here's the breakdown:

total    user    nice  system     irq softirq   guest
66.90    7.20    0.00   12.94    0.35    3.39   43.02

Comparing guest time to all other busy time, that's a 23.88/43.02 = 55%
overhead for virtualization.  I certainly don't expect it to be 0, but
55% seems a bit high.  So, what's the reason for this overhead?  At the
bottom is oprofile output of top functions for KVM.  Some observations:

1) I'm seeing about 2.3% in scheduler functions [that I recognize].
Does that seem a bit excessive?
2) cpu_physical_memory_rw due to not using preadv/pwritev?
3) vmx_[save|load]_host_state: I take it this is from guest switches?
We have 180,000 context switches a second.  Is this more than expected?
I wonder if schedstats can show why we context switch (need to let
someone else run, yielded, waiting on io, etc).
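
On the schedstats question in 3): one way to get at the "why" without
full schedstats is the per-task counters in /proc/<pid>/sched (needs
CONFIG_SCHED_DEBUG), which at least split voluntary (blocked/yielded)
from involuntary (preempted) switches.  Something like the following --
the process match is illustrative, and pidstat assumes sysstat is
installed:

 # for p in $(pgrep qemu); do grep -EH 'nr_(voluntary|involuntary)_switches' /proc/$p/sched; done
 # pidstat -w 5        (per-task cswch/s vs. nvcswch/s)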


samples %       name                            app
385914891       61.3122 kvm-intel.ko            vmx_vcpu_run
11413793        1.8134  libc-2.9.so             /lib64/libc-2.9.so
8943054 1.4208  qemu-system-x86_64              cpu_physical_memory_rw
6877593 1.0927  kvm.ko                          kvm_arch_vcpu_ioctl_run
6469799 1.0279  qemu-system-x86_64              phys_page_find_alloc
5080474 0.8072  vmlinux-2.6.27.19-5-default     copy_user_generic_string
4154467 0.6600  kvm-intel.ko                    __vmx_load_host_state
3991060 0.6341  vmlinux-2.6.27.19-5-default     schedule
3455331 0.5490  kvm-intel.ko                    vmx_save_host_state
2582344 0.4103  vmlinux-2.6.27.19-5-default     find_busiest_group
2509543 0.3987  qemu-system-x86_64              main_loop_wait
2457476 0.3904  vmlinux-2.6.27.19-5-default     kfree
2395296 0.3806  kvm.ko                          kvm_set_irq
2385298 0.3790  vmlinux-2.6.27.19-5-default     fget_light
2229755 0.3543  vmlinux-2.6.27.19-5-default     __switch_to
2178739 0.3461  bnx2.ko                         bnx2_rx_int
2156418 0.3426  vmlinux-2.6.27.19-5-default     complete_signal
1854497 0.2946  qemu-system-x86_64              virtqueue_get_head
1833823 0.2913  vmlinux-2.6.27.19-5-default     try_to_wake_up
1816954 0.2887  qemu-system-x86_64              cpu_physical_memory_map
1776548 0.2822  oprofiled                       find_kernel_image
1737294 0.2760  vmlinux-2.6.27.19-5-default     kmem_cache_alloc
1662346 0.2641  qemu-system-x86_64              virtqueue_avail_bytes
1651070 0.2623  vmlinux-2.6.27.19-5-default     do_select
1643139 0.2611  vmlinux-2.6.27.19-5-default     update_curr
1640495 0.2606  vmlinux-2.6.27.19-5-default     kmem_cache_free
1606493 0.2552  libpthread-2.9.so               pthread_mutex_lock
1549536 0.2462  qemu-system-x86_64              lduw_phys
1535539 0.2440  vmlinux-2.6.27.19-5-default     tg_shares_up
1438468 0.2285  vmlinux-2.6.27.19-5-default     mwait_idle
1316461 0.2092  vmlinux-2.6.27.19-5-default     __down_read
1282486 0.2038  vmlinux-2.6.27.19-5-default     native_read_tsc
1226069 0.1948  oprofiled                       odb_update_node
1224551 0.1946  vmlinux-2.6.27.19-5-default     sched_clock_cpu
1222684 0.1943  tun.ko                          tun_chr_aio_read
1194034 0.1897  vmlinux-2.6.27.19-5-default     task_rq_lock
1186129 0.1884  kvm.ko                          x86_decode_insn
1131644 0.1798  bnx2.ko                         bnx2_start_xmit
1115575 0.1772  vmlinux-2.6.27.19-5-default     enqueue_hrtimer
1044329 0.1659  vmlinux-2.6.27.19-5-default     native_sched_clock
988546  0.1571  vmlinux-2.6.27.19-5-default     fput
980615  0.1558  vmlinux-2.6.27.19-5-default     __up_read
942270  0.1497  qemu-system-x86_64              kvm_run
925076  0.1470  kvm-intel.ko                    vmcs_writel
889220  0.1413  vmlinux-2.6.27.19-5-default     dev_queue_xmit
884786  0.1406  kvm.ko                          kvm_apic_has_interrupt
880421  0.1399  librt-2.9.so                    /lib64/librt-2.9.so
880306  0.1399  vmlinux-2.6.27.19-5-default     nf_iterate
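
(In case anyone wants to reproduce this: a typical oprofile session for a
symbol-level breakdown like the one above looks roughly like the
following.  The options are from memory and the vmlinux path is a
placeholder, so adjust for your setup.)

 # opcontrol --vmlinux=/path/to/vmlinux-2.6.27.19-5-default --start
   ... run the workload ...
 # opcontrol --dump
 # opreport -l | head -50
 # opcontrol --shutdown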


-Andrew Theurer






* RE: KVM performance vs. Xen
  2009-04-29 14:41 KVM performance vs. Xen Andrew Theurer
@ 2009-04-29 15:20 ` Nakajima, Jun
  2009-04-29 15:33   ` Andrew Theurer
  2009-04-30  8:56 ` Avi Kivity
  1 sibling, 1 reply; 21+ messages in thread
From: Nakajima, Jun @ 2009-04-29 15:20 UTC (permalink / raw)
  To: Andrew Theurer, kvm-devel

On 4/29/2009 7:41:50 AM, Andrew Theurer wrote:
> I wanted to share some performance data for KVM and Xen.  I thought it
> would be interesting to share some performance results especially
> compared to Xen, using a more complex situation like heterogeneous
> server consolidation.
>
> The Workload:
> The workload is one that simulates a consolidation of servers on to a
> single host.  There are 3 server types: web, imap, and app (j2ee).  In
> addition, there are other "helper" servers which are also
> consolidated: a db server, which helps out with the app server, and an
> nfs server, which helps out with the web server (a portion of the docroot is nfs mounted).
> There is also one other server that is simply idle.  All 6 servers
> make up one set.  The first 3 server types are sent requests, which in
> turn may send requests to the db and nfs helper servers.  The request
> rate is throttled to produce a fixed amount of work.  In order to
> increase utilization on the host, more sets of these servers are used.
> The clients which send requests also have a response time requirement
> which is monitored.  The following results have passed the response
> time requirements.
>
> The host hardware:
> A 2 socket, 8 core Nehalem with SMT, and EPT enabled, lots of disks, 4
> x
> 1 Gb Ethernet
>
> The host software:
> Both Xen and KVM use the same host Linux OS, SLES11.  KVM uses the
> 2.6.27.19-5-default kernel and Xen uses the 2.6.27.19-5-xen kernel.  I
> have tried 2.6.29 for KVM, but results are actually worse.  KVM
> modules are rebuilt with kvm-85.  Qemu is also from kvm-85.  Xen
> version is "3.3.1_18546_12-3.1".
>
> The guest software:
> All guests are RedHat 5.3.  The same disk images are used but
> different kernels. Xen uses the RedHat Xen kernel and KVM uses 2.6.29
> with all paravirt build options enabled.  Both use PV I/O drivers.  Software used:
> Apache, PHP, Java, Glassfish, Postgresql, and Dovecot.
>

Just for clarification. So are you using PV (Xen) Linux on Xen, not HVM? Is that 32-bit or 64-bit?

             .
Jun Nakajima | Intel Open Source Technology Center


* Re: KVM performance vs. Xen
  2009-04-29 15:20 ` Nakajima, Jun
@ 2009-04-29 15:33   ` Andrew Theurer
  0 siblings, 0 replies; 21+ messages in thread
From: Andrew Theurer @ 2009-04-29 15:33 UTC (permalink / raw)
  To: Nakajima, Jun; +Cc: kvm-devel

Nakajima, Jun wrote:
> On 4/29/2009 7:41:50 AM, Andrew Theurer wrote:
>   
>> I wanted to share some performance data for KVM and Xen.  I thought it
>> would be interesting to share some performance results especially
>> compared to Xen, using a more complex situation like heterogeneous
>> server consolidation.
>>
>> The Workload:
>> The workload is one that simulates a consolidation of servers on to a
>> single host.  There are 3 server types: web, imap, and app (j2ee).  In
>> addition, there are other "helper" servers which are also
>> consolidated: a db server, which helps out with the app server, and an
>> nfs server, which helps out with the web server (a portion of the docroot is nfs mounted).
>> There is also one other server that is simply idle.  All 6 servers
>> make up one set.  The first 3 server types are sent requests, which in
>> turn may send requests to the db and nfs helper servers.  The request
>> rate is throttled to produce a fixed amount of work.  In order to
>> increase utilization on the host, more sets of these servers are used.
>> The clients which send requests also have a response time requirement
>> which is monitored.  The following results have passed the response
>> time requirements.
>>
>> The host hardware:
>> A 2 socket, 8 core Nehalem with SMT, and EPT enabled, lots of disks, 4
>> x
>> 1 Gb Ethernet
>>
>> The host software:
>> Both Xen and KVM use the same host Linux OS, SLES11.  KVM uses the
>> 2.6.27.19-5-default kernel and Xen uses the 2.6.27.19-5-xen kernel.  I
>> have tried 2.6.29 for KVM, but results are actually worse.  KVM
>> modules are rebuilt with kvm-85.  Qemu is also from kvm-85.  Xen
>> version is "3.3.1_18546_12-3.1".
>>
>> The guest software:
>> All guests are RedHat 5.3.  The same disk images are used but
>> different kernels. Xen uses the RedHat Xen kernel and KVM uses 2.6.29
>> with all paravirt build options enabled.  Both use PV I/O drivers.  Software used:
>> Apache, PHP, Java, Glassfish, Postgresql, and Dovecot.
>>
>>     
>
> Just for clarification. So are you using PV (Xen) Linux on Xen, not HVM? Is that 32-bit or 64-bit?
>   
PV, 64-bit.

-Andrew



* Re: KVM performance vs. Xen
  2009-04-29 14:41 KVM performance vs. Xen Andrew Theurer
  2009-04-29 15:20 ` Nakajima, Jun
@ 2009-04-30  8:56 ` Avi Kivity
  2009-04-30 12:49   ` Andrew Theurer
                     ` (2 more replies)
  1 sibling, 3 replies; 21+ messages in thread
From: Avi Kivity @ 2009-04-30  8:56 UTC (permalink / raw)
  To: Andrew Theurer; +Cc: kvm-devel

Andrew Theurer wrote:
> I wanted to share some performance data for KVM and Xen.  I thought it
> would be interesting to share some performance results especially
> compared to Xen, using a more complex situation like heterogeneous
> server consolidation.
>
> The Workload:
> The workload is one that simulates a consolidation of servers on to a
> single host.  There are 3 server types: web, imap, and app (j2ee).  In
> addition, there are other "helper" servers which are also consolidated:
> a db server, which helps out with the app server, and an nfs server,
> which helps out with the web server (a portion of the docroot is nfs
> mounted).  There is also one other server that is simply idle.  All 6
> servers make up one set.  The first 3 server types are sent requests,
> which in turn may send requests to the db and nfs helper servers.  The
> request rate is throttled to produce a fixed amount of work.  In order
> to increase utilization on the host, more sets of these servers are
> used.  The clients which send requests also have a response time
> requirement which is monitored.  The following results have passed the
> response time requirements.
>

What's the typical I/O load (disk and network bandwidth) while the tests 
are running?

> The host hardware:
> A 2 socket, 8 core Nehalem with SMT, and EPT enabled, lots of disks, 4 x
> 1 Gb Ethernet

CPU time measurements with SMT can vary wildly if the system is not 
fully loaded.  If the scheduler happens to schedule two threads on a 
single core, both of these threads will generate less work compared to 
if they were scheduled on different cores.


> Test Results:
> The throughput is equal in these tests, as the clients throttle the work
> (this is assuming you don't run out of a resource on the host).  What's
> telling is the CPU used to do the same amount of work:
>
> Xen:  52.85%
> KVM:  66.93%
>
> So, KVM requires 66.93/52.85 = 26.6% more CPU to do the same amount of
> work. Here's the breakdown:
>
> total    user    nice  system     irq softirq   guest
> 66.90    7.20    0.00   12.94    0.35    3.39   43.02
>
> Comparing guest time to all other busy time, that's a 23.88/43.02 = 55%
> overhead for virtualization.  I certainly don't expect it to be 0, but
> 55% seems a bit high.  So, what's the reason for this overhead?  At the
> bottom is oprofile output of top functions for KVM.  Some observations:
>
> 1) I'm seeing about 2.3% in scheduler functions [that I recognize].
> Does that seem a bit excessive?

Yes, it is.  If there is a lot of I/O, this might be due to the thread 
pool used for I/O.

> 2) cpu_physical_memory_rw due to not using preadv/pwritev?

I think both virtio-net and virtio-blk use memcpy().

> 3) vmx_[save|load]_host_state: I take it this is from guest switches?

These are called when you context-switch from a guest, and, much more 
frequently, when you enter qemu.

> We have 180,000 context switches a second.  Is this more than expected?


Way more.  Across 16 logical cpus, this is >10,000 cs/sec/cpu.

> I wonder if schedstats can show why we context switch (need to let
> someone else run, yielded, waiting on io, etc).
>

Yes, there is a scheduler tracer, though I have no idea how to operate it.

Do you have kvm_stat logs?

-- 
error compiling committee.c: too many arguments to function



* Re: KVM performance vs. Xen
  2009-04-30  8:56 ` Avi Kivity
@ 2009-04-30 12:49   ` Andrew Theurer
  2009-04-30 13:02     ` Avi Kivity
  2009-04-30 13:45   ` Anthony Liguori
  2009-04-30 16:41   ` Marcelo Tosatti
  2 siblings, 1 reply; 21+ messages in thread
From: Andrew Theurer @ 2009-04-30 12:49 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel

Avi Kivity wrote:
> Andrew Theurer wrote:
>> I wanted to share some performance data for KVM and Xen.  I thought it
>> would be interesting to share some performance results especially
>> compared to Xen, using a more complex situation like heterogeneous
>> server consolidation.
>>
>> The Workload:
>> The workload is one that simulates a consolidation of servers on to a
>> single host.  There are 3 server types: web, imap, and app (j2ee).  In
>> addition, there are other "helper" servers which are also consolidated:
>> a db server, which helps out with the app server, and an nfs server,
>> which helps out with the web server (a portion of the docroot is nfs
>> mounted).  There is also one other server that is simply idle.  All 6
>> servers make up one set.  The first 3 server types are sent requests,
>> which in turn may send requests to the db and nfs helper servers.  The
>> request rate is throttled to produce a fixed amount of work.  In order
>> to increase utilization on the host, more sets of these servers are
>> used.  The clients which send requests also have a response time
>> requirement which is monitored.  The following results have passed the
>> response time requirements.
>>
>
> What's the typical I/O load (disk and network bandwidth) while the 
> tests are running?
This is average throughput:
network:    Tx: 79 MB/sec  Rx: 5 MB/sec
disk:    read: 17 MB/sec  write: 40 MB/sec
>
>> The host hardware:
>> A 2 socket, 8 core Nehalem with SMT, and EPT enabled, lots of disks, 4 x
>> 1 Gb Ethernet
>
> CPU time measurements with SMT can vary wildly if the system is not 
> fully loaded.  If the scheduler happens to schedule two threads on a 
> single core, both of these threads will generate less work compared to 
> if they were scheduled on different cores.
Understood.  Even if at low loads, the scheduler does the right thing 
and spreads out to all the cores first, once it goes beyond 50% util, 
the CPU util can climb at a much higher rate (compared to a linear 
increase in work) because it then starts scheduling 2 threads per core, 
and each thread can do less work.  I have always wanted something which 
could more accurately show the utilization of a processor core, but I 
guess we have to use what we have today.  I will run again with SMT off 
to see what we get.
>
>
>> Test Results:
>> The throughput is equal in these tests, as the clients throttle the work
>> (this is assuming you don't run out of a resource on the host).  What's
>> telling is the CPU used to do the same amount of work:
>>
>> Xen:  52.85%
>> KVM:  66.93%
>>
>> So, KVM requires 66.93/52.85 = 26.6% more CPU to do the same amount of
>> work. Here's the breakdown:
>>
>> total    user    nice  system     irq softirq   guest
>> 66.90    7.20    0.00   12.94    0.35    3.39   43.02
>>
>> Comparing guest time to all other busy time, that's a 23.88/43.02 = 55%
>> overhead for virtualization.  I certainly don't expect it to be 0, but
>> 55% seems a bit high.  So, what's the reason for this overhead?  At the
>> bottom is oprofile output of top functions for KVM.  Some observations:
>>
>> 1) I'm seeing about 2.3% in scheduler functions [that I recognize].
>> Does that seem a bit excessive?
>
> Yes, it is.  If there is a lot of I/O, this might be due to the thread 
> pool used for I/O.
I have an older patch which makes a small change to posix_aio_thread.c by 
trying to keep the thread pool size a bit lower than it is today.  I 
will dust that off and see if it helps.
>
>> 2) cpu_physical_memory_rw due to not using preadv/pwritev?
>
> I think both virtio-net and virtio-blk use memcpy().
>
>> 3) vmx_[save|load]_host_state: I take it this is from guest switches?
>
> These are called when you context-switch from a guest, and, much more 
> frequently, when you enter qemu.
>
>> We have 180,000 context switches a second.  Is this more than expected?
>
>
> Way more.  Across 16 logical cpus, this is >10,000 cs/sec/cpu.
>
>> I wonder if schedstats can show why we context switch (need to let
>> someone else run, yielded, waiting on io, etc).
>>
>
> Yes, there is a scheduler tracer, though I have no idea how to operate 
> it.
>
> Do you have kvm_stat logs?
Sorry, I don't, but I'll run that next time.  BTW, I did not notice a 
batch/log mode the last time I ran kvm_stat.  Or maybe it was not 
obvious to me.  Is there an ideal way to run kvm_stat without a 
curses-like output?

-Andrew




* Re: KVM performance vs. Xen
  2009-04-30 12:49   ` Andrew Theurer
@ 2009-04-30 13:02     ` Avi Kivity
  2009-04-30 13:44       ` Andrew Theurer
  0 siblings, 1 reply; 21+ messages in thread
From: Avi Kivity @ 2009-04-30 13:02 UTC (permalink / raw)
  To: Andrew Theurer; +Cc: kvm-devel

Andrew Theurer wrote:
> Avi Kivity wrote:
>>>
>>
>> What's the typical I/O load (disk and network bandwidth) while the 
>> tests are running?
> This is average throughput:
> network:    Tx: 79 MB/sec  Rx: 5 MB/sec

MB as in Byte or Mb as in bit?

> disk:    read: 17 MB/sec  write: 40 MB/sec

This could definitely cause the extra load, especially if it's many 
small requests (compared to a few large ones).

>>> The host hardware:
>>> A 2 socket, 8 core Nehalem with SMT, and EPT enabled, lots of disks, 
>>> 4 x
>>> 1 Gb Ethernet
>>
>> CPU time measurements with SMT can vary wildly if the system is not 
>> fully loaded.  If the scheduler happens to schedule two threads on a 
>> single core, both of these threads will generate less work compared 
>> to if they were scheduled on different cores.
> Understood.  Even if at low loads, the scheduler does the right thing 
> and spreads out to all the cores first, once it goes beyond 50% util, 
> the CPU util can climb at a much higher rate (compared to a linear 
> increase in work) because it then starts scheduling 2 threads per 
> core, and each thread can do less work.  I have always wanted 
> something which could more accurately show the utilization of a 
> processor core, but I guess we have to use what we have today.  I will 
> run again with SMT off to see what we get.

On the other hand, without SMT you will get to overcommit much faster, 
so you'll have scheduling artifacts.  Unfortunately there's no good 
answer here (except to improve the SMT scheduler).

>> Yes, it is.  If there is a lot of I/O, this might be due to the 
>> thread pool used for I/O.
> I have an older patch which makes a small change to posix_aio_thread.c 
> by trying to keep the thread pool size a bit lower than it is today.  
> I will dust that off and see if it helps.

Really, I think linux-aio support can help here.

>>
>> Yes, there is a scheduler tracer, though I have no idea how to 
>> operate it.
>>
>> Do you have kvm_stat logs?
> Sorry, I don't, but I'll run that next time.  BTW, I did not notice a 
> batch/log mode the last time I ran kvm_stat.  Or maybe it was not 
> obvious to me.  Is there an ideal way to run kvm_stat without a 
> curses-like output?

You're probably using an ancient version:

$ kvm_stat --help
Usage: kvm_stat [options]

Options:
  -h, --help            show this help message and exit
  -1, --once, --batch   run in batch mode for one second
  -l, --log             run in logging mode (like vmstat)
  -f FIELDS, --fields=FIELDS
                        fields to display (regex)
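
So for a non-interactive run, something along these lines should do (the
field regex is only an example):

 $ kvm_stat -1                      # one-second batch snapshot
 $ kvm_stat -l > kvm_stat.log       # vmstat-style logging
 $ kvm_stat -l -f 'exits|halt|io'   # logging, limited to fields matching the regex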


-- 
error compiling committee.c: too many arguments to function



* Re: KVM performance vs. Xen
  2009-04-30 13:02     ` Avi Kivity
@ 2009-04-30 13:44       ` Andrew Theurer
  2009-04-30 13:47         ` Anthony Liguori
  2009-04-30 13:52         ` Avi Kivity
  0 siblings, 2 replies; 21+ messages in thread
From: Andrew Theurer @ 2009-04-30 13:44 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel

Avi Kivity wrote:
> Andrew Theurer wrote:
>> Avi Kivity wrote:
>>>>
>>>
>>> What's the typical I/O load (disk and network bandwidth) while the 
>>> tests are running?
>> This is average throughput:
>> network:    Tx: 79 MB/sec  Rx: 5 MB/sec
>
> MB as in Byte or Mb as in bit?
Byte.  There are 4 x 1 Gb adapters, each handling about 20 MB/sec or 160 
Mbit/sec.
>
>> disk:    read: 17 MB/sec  write: 40 MB/sec
>
> This could definitely cause the extra load, especially if it's many 
> small requests (compared to a few large ones).
I don't have the request sizes at my fingertips, but we have to use a 
lot of disks to support this I/O, so I think it's safe to assume there 
are a lot more requests than a simple large sequential read/write.
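
(Next run I can capture the actual request sizes on the host with
something like iostat from sysstat -- the avgrq-sz column is the average
request size in 512-byte sectors:)

 # iostat -dx 5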
>
>>>> The host hardware:
>>>> A 2 socket, 8 core Nehalem with SMT, and EPT enabled, lots of 
>>>> disks, 4 x
>>>> 1 Gb Ethernet
>>>
>>> CPU time measurements with SMT can vary wildly if the system is not 
>>> fully loaded.  If the scheduler happens to schedule two threads on a 
>>> single core, both of these threads will generate less work compared 
>>> to if they were scheduled on different cores.
>> Understood.  Even if at low loads, the scheduler does the right thing 
>> and spreads out to all the cores first, once it goes beyond 50% util, 
>> the CPU util can climb at a much higher rate (compared to a linear 
>> increase in work) because it then starts scheduling 2 threads per 
>> core, and each thread can do less work.  I have always wanted 
>> something which could more accurately show the utilization of a 
>> processor core, but I guess we have to use what we have today.  I 
>> will run again with SMT off to see what we get.
>
> On the other hand, without SMT you will get to overcommit much faster, 
> so you'll have scheduling artifacts.  Unfortunately there's no good 
> answer here (except to improve the SMT scheduler).
>
>>> Yes, it is.  If there is a lot of I/O, this might be due to the 
>>> thread pool used for I/O.
>> I have an older patch which makes a small change to posix_aio_thread.c 
>> by trying to keep the thread pool size a bit lower than it is today.  
>> I will dust that off and see if it helps.
>
> Really, I think linux-aio support can help here.
Yes, I think that would work for real block devices, but would that help 
for files?  I am using real block devices right now, but it would be 
nice to also see a benefit for files in a file-system.  Or maybe I am 
mis-understanding this, and linux-aio can be used on files?

-Andrew

>
>>>
>>> Yes, there is a scheduler tracer, though I have no idea how to 
>>> operate it.
>>>
>>> Do you have kvm_stat logs?
>> Sorry, I don't, but I'll run that next time.  BTW, I did not notice a 
>> batch/log mode the last time I ran kvm_stat.  Or maybe it was not 
>> obvious to me.  Is there an ideal way to run kvm_stat without a 
>> curses-like output?
>
> You're probably using an ancient version:
>
> $ kvm_stat --help
> Usage: kvm_stat [options]
>
> Options:
>  -h, --help            show this help message and exit
>  -1, --once, --batch   run in batch mode for one second
>  -l, --log             run in logging mode (like vmstat)
>  -f FIELDS, --fields=FIELDS
>                        fields to display (regex)
>
>




* Re: KVM performance vs. Xen
  2009-04-30  8:56 ` Avi Kivity
  2009-04-30 12:49   ` Andrew Theurer
@ 2009-04-30 13:45   ` Anthony Liguori
  2009-04-30 13:53     ` Avi Kivity
  2009-04-30 13:59     ` Avi Kivity
  2009-04-30 16:41   ` Marcelo Tosatti
  2 siblings, 2 replies; 21+ messages in thread
From: Anthony Liguori @ 2009-04-30 13:45 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Andrew Theurer, kvm-devel

Avi Kivity wrote:
>>
>> 1) I'm seeing about 2.3% in scheduler functions [that I recognize].
>> Does that seem a bit excessive?
>
> Yes, it is.  If there is a lot of I/O, this might be due to the thread 
> pool used for I/O.

This is why I wrote the linux-aio patch.  It only reduced CPU 
consumption by about 2% although I'm not sure if that's absolute or 
relative.  Andrew?

>> 2) cpu_physical_memory_rw due to not using preadv/pwritev?
>
> I think both virtio-net and virtio-blk use memcpy().

With latest linux-2.6, and a development snapshot of glibc, virtio-blk 
will not use memcpy() anymore but virtio-net still does on the receive 
path (but not transmit).

Regards,

Anthony Liguori


* Re: KVM performance vs. Xen
  2009-04-30 13:44       ` Andrew Theurer
@ 2009-04-30 13:47         ` Anthony Liguori
  2009-04-30 13:52         ` Avi Kivity
  1 sibling, 0 replies; 21+ messages in thread
From: Anthony Liguori @ 2009-04-30 13:47 UTC (permalink / raw)
  To: Andrew Theurer; +Cc: Avi Kivity, kvm-devel

Andrew Theurer wrote:
>>
>> Really, I think linux-aio support can help here.
> Yes, I think that would work for real block devices, but would that 
> help for files?  I am using real block devices right now, but it would 
> be nice to also see a benefit for files in a file-system.  Or maybe I 
> am mis-understanding this, and linux-aio can be used on files?

For cache=off, with some file systems, yes.  But not for 
cache=writethrough/writeback.

Regards,

Anthony Liguori


* Re: KVM performance vs. Xen
  2009-04-30 13:44       ` Andrew Theurer
  2009-04-30 13:47         ` Anthony Liguori
@ 2009-04-30 13:52         ` Avi Kivity
  1 sibling, 0 replies; 21+ messages in thread
From: Avi Kivity @ 2009-04-30 13:52 UTC (permalink / raw)
  To: Andrew Theurer; +Cc: kvm-devel

Andrew Theurer wrote:
>
>>
>>> disk:    read: 17 MB/sec  write: 40 MB/sec
>>
>> This could definitely cause the extra load, especially if it's many 
>> small requests (compared to a few large ones).
> I don't have the request sizes at my fingertips, but we have to use a 
> lot of disks to support this I/O, so I think it's safe to assume there 
> are a lot more requests than a simple large sequential read/write.

Yes.  Well the high context switch rate is the scheduler's way of 
telling us to use linux-aio.  If "lots of disks" == 100, with a 3ms 
seek time, that's already 60,000 cs/sec.

>> Really, I think linux-aio support can help here.
> Yes, I think that would work for real block devices, but would that 
> help for files?  I am using real block devices right now, but it would 
> be nice to also see a benefit for files in a file-system.  Or maybe I 
> am mis-understanding this, and linux-aio can be used on files?

It could work with files with cache=none (though not qcow2 as now written).

-- 
error compiling committee.c: too many arguments to function



* Re: KVM performance vs. Xen
  2009-04-30 13:45   ` Anthony Liguori
@ 2009-04-30 13:53     ` Avi Kivity
  2009-04-30 15:08       ` Anthony Liguori
  2009-04-30 13:59     ` Avi Kivity
  1 sibling, 1 reply; 21+ messages in thread
From: Avi Kivity @ 2009-04-30 13:53 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Andrew Theurer, kvm-devel

Anthony Liguori wrote:
>
>>> 2) cpu_physical_memory_rw due to not using preadv/pwritev?
>>
>> I think both virtio-net and virtio-blk use memcpy().
>
> With latest linux-2.6, and a development snapshot of glibc, virtio-blk 
> will not use memcpy() anymore but virtio-net still does on the receive 
> path (but not transmit).

There's still the kernel/user copy, so we have two copies on rx, one on tx.

-- 
error compiling committee.c: too many arguments to function



* Re: KVM performance vs. Xen
  2009-04-30 13:45   ` Anthony Liguori
  2009-04-30 13:53     ` Avi Kivity
@ 2009-04-30 13:59     ` Avi Kivity
  2009-04-30 14:04       ` Andrew Theurer
  2009-04-30 15:09       ` Anthony Liguori
  1 sibling, 2 replies; 21+ messages in thread
From: Avi Kivity @ 2009-04-30 13:59 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Andrew Theurer, kvm-devel

Anthony Liguori wrote:
> Avi Kivity wrote:
>>>
>>> 1) I'm seeing about 2.3% in scheduler functions [that I recognize].
>>> Does that seem a bit excessive?
>>
>> Yes, it is.  If there is a lot of I/O, this might be due to the 
>> thread pool used for I/O.
>
> This is why I wrote the linux-aio patch.  It only reduced CPU 
> consumption by about 2% although I'm not sure if that's absolute or 
> relative.  Andrew?

Was that before or after the entire path was made copyless?

-- 
error compiling committee.c: too many arguments to function



* Re: KVM performance vs. Xen
  2009-04-30 13:59     ` Avi Kivity
@ 2009-04-30 14:04       ` Andrew Theurer
  2009-04-30 15:11         ` Anthony Liguori
  2009-04-30 15:09       ` Anthony Liguori
  1 sibling, 1 reply; 21+ messages in thread
From: Andrew Theurer @ 2009-04-30 14:04 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Anthony Liguori, kvm-devel

Avi Kivity wrote:
> Anthony Liguori wrote:
>> Avi Kivity wrote:
>>>>
>>>> 1) I'm seeing about 2.3% in scheduler functions [that I recognize].
>>>> Does that seem a bit excessive?
>>>
>>> Yes, it is.  If there is a lot of I/O, this might be due to the 
>>> thread pool used for I/O.
>>
>> This is why I wrote the linux-aio patch.  It only reduced CPU 
>> consumption by about 2% although I'm not sure if that's absolute or 
>> relative.  Andrew?
If  I recall correctly, it was 2.4% and relative.  But with 2.3% in 
scheduler functions, that's what I expected.
>
> Was that before or after the entire path was made copyless?
If this is referring to the preadv/pwritev support, no, I have not tested 
with that.

-Andrew




* Re: KVM performance vs. Xen
  2009-04-30 13:53     ` Avi Kivity
@ 2009-04-30 15:08       ` Anthony Liguori
  0 siblings, 0 replies; 21+ messages in thread
From: Anthony Liguori @ 2009-04-30 15:08 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Andrew Theurer, kvm-devel

Avi Kivity wrote:
> Anthony Liguori wrote:
>>
>>>> 2) cpu_physical_memory_rw due to not using preadv/pwritev?
>>>
>>> I think both virtio-net and virtio-blk use memcpy().
>>
>> With latest linux-2.6, and a development snapshot of glibc, 
>> virtio-blk will not use memcpy() anymore but virtio-net still does on 
>> the receive path (but not transmit).
>
> There's still the kernel/user copy, so we have two copies on rx, one 
> on tx.

That won't show up as cpu_physical_memory_rw.  stl_phys/ldl_phys are 
suspect though as they degrade to cpu_physical_memory_rw.

Regards,

Anthony Liguori



* Re: KVM performance vs. Xen
  2009-04-30 13:59     ` Avi Kivity
  2009-04-30 14:04       ` Andrew Theurer
@ 2009-04-30 15:09       ` Anthony Liguori
  1 sibling, 0 replies; 21+ messages in thread
From: Anthony Liguori @ 2009-04-30 15:09 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Andrew Theurer, kvm-devel

Avi Kivity wrote:
> Anthony Liguori wrote:
>> Avi Kivity wrote:
>>>>
>>>> 1) I'm seeing about 2.3% in scheduler functions [that I recognize].
>>>> Does that seem a bit excessive?
>>>
>>> Yes, it is.  If there is a lot of I/O, this might be due to the 
>>> thread pool used for I/O.
>>
>> This is why I wrote the linux-aio patch.  It only reduced CPU 
>> consumption by about 2% although I'm not sure if that's absolute or 
>> relative.  Andrew?
>
> Was that before or after the entire path was made copyless?

Before so it's worth updating and trying again.

Regards,

Anthony Liguori


* Re: KVM performance vs. Xen
  2009-04-30 14:04       ` Andrew Theurer
@ 2009-04-30 15:11         ` Anthony Liguori
  2009-04-30 15:19           ` Avi Kivity
  0 siblings, 1 reply; 21+ messages in thread
From: Anthony Liguori @ 2009-04-30 15:11 UTC (permalink / raw)
  To: Andrew Theurer; +Cc: Avi Kivity, kvm-devel

Andrew Theurer wrote:
> Avi Kivity wrote:
>> Anthony Liguori wrote:
>>> Avi Kivity wrote:
>>>>>
>>>>> 1) I'm seeing about 2.3% in scheduler functions [that I recognize].
>>>>> Does that seem a bit excessive?
>>>>
>>>> Yes, it is.  If there is a lot of I/O, this might be due to the 
>>>> thread pool used for I/O.
>>>
>>> This is why I wrote the linux-aio patch.  It only reduced CPU 
>>> consumption by about 2% although I'm not sure if that's absolute or 
>>> relative.  Andrew?
> If  I recall correctly, it was 2.4% and relative.  But with 2.3% in 
> scheduler functions, that's what I expected.
>>
>> Was that before or after the entire path was made copyless?
> If this is referring to the preadv/pwritev support, no, I have not 
> tested with that.

Previously, the block API only exposed non-vector interfaces and bounced 
vectored operations to a linear buffer.  That's been eliminated now 
though so we need to update the linux-aio patch to implement a vectored 
backend interface.

However, it is an apples to apples comparison in terms of copying since 
the same is true with the thread pool.  My take away was that the thread 
pool overhead isn't the major source of issues.

Regards,

Anthony Liguori


* Re: KVM performance vs. Xen
  2009-04-30 15:11         ` Anthony Liguori
@ 2009-04-30 15:19           ` Avi Kivity
  2009-04-30 15:59             ` Anthony Liguori
  2009-05-01  0:40             ` Andrew Theurer
  0 siblings, 2 replies; 21+ messages in thread
From: Avi Kivity @ 2009-04-30 15:19 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Andrew Theurer, kvm-devel

Anthony Liguori wrote:
>
> Previously, the block API only exposed non-vector interfaces and 
> bounced vectored operations to a linear buffer.  That's been 
> eliminated now though so we need to update the linux-aio patch to 
> implement a vectored backend interface.
>
> However, it is an apples to apples comparison in terms of copying 
> since the same is true with the thread pool.  My take away was that 
> the thread pool overhead isn't the major source of issues.

If the overhead is dominated by copying, then you won't see the 
difference.  Once the copying is eliminated, the comparison may yield 
different results.  We should certainly see a difference in context 
switches.

One cause of context switches won't be eliminated - the non-saturating 
workload causes us to switch to the idle thread, which incurs a 
heavyweight exit.  This doesn't matter since we're idle anyway, but when 
we switch back, we incur a heavyweight entry.

-- 
error compiling committee.c: too many arguments to function



* Re: KVM performance vs. Xen
  2009-04-30 15:19           ` Avi Kivity
@ 2009-04-30 15:59             ` Anthony Liguori
  2009-05-01  0:40             ` Andrew Theurer
  1 sibling, 0 replies; 21+ messages in thread
From: Anthony Liguori @ 2009-04-30 15:59 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Andrew Theurer, kvm-devel

Avi Kivity wrote:
> Anthony Liguori wrote:
>>
>> Previously, the block API only exposed non-vector interfaces and 
>> bounced vectored operations to a linear buffer.  That's been 
>> eliminated now though so we need to update the linux-aio patch to 
>> implement a vectored backend interface.
>>
>> However, it is an apples to apples comparison in terms of copying 
>> since the same is true with the thread pool.  My take away was that 
>> the thread pool overhead isn't the major source of issues.
>
> If the overhead is dominated by copying, then you won't see the 
> difference.  Once the copying is eliminated, the comparison may yield 
> different results.  We should certainly see a difference in context 
> switches.

Yes, I agree with this.  The absence of copying (in both the thread pool 
and linux-aio) could yield significantly different results.

Regards,

Anthony Liguori

> One cause of context switches won't be eliminated - the non-saturating 
> workload causes us to switch to the idle thread, which incurs a 
> heavyweight exit.  This doesn't matter since we're idle anyway, but 
> when we switch back, we incur a heavyweight entry.
>



* Re: KVM performance vs. Xen
  2009-04-30  8:56 ` Avi Kivity
  2009-04-30 12:49   ` Andrew Theurer
  2009-04-30 13:45   ` Anthony Liguori
@ 2009-04-30 16:41   ` Marcelo Tosatti
  2 siblings, 0 replies; 21+ messages in thread
From: Marcelo Tosatti @ 2009-04-30 16:41 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Andrew Theurer, kvm-devel

On Thu, Apr 30, 2009 at 11:56:14AM +0300, Avi Kivity wrote:
> Andrew Theurer wrote:
>> Comparing guest time to all other busy time, that's a 23.88/43.02 = 55%
>> overhead for virtualization.  I certainly don't expect it to be 0, but
>> 55% seems a bit high.  So, what's the reason for this overhead?  At the
>> bottom is oprofile output of top functions for KVM.  Some observations:
>>
>> 1) I'm seeing about 2.3% in scheduler functions [that I recognize].
>> Does that seem a bit excessive?
>
> Yes, it is.  If there is a lot of I/O, this might be due to the thread  
> pool used for I/O.
>
>> 2) cpu_physical_memory_rw due to not using preadv/pwritev?
>
> I think both virtio-net and virtio-blk use memcpy().
>
>> 3) vmx_[save|load]_host_state: I take it this is from guest switches?
>
> These are called when you context-switch from a guest, and, much more  
> frequently, when you enter qemu.
>
>> We have 180,000 context switches a second.  Is this more than expected?
>
>
> Way more.  Across 16 logical cpus, this is >10,000 cs/sec/cpu.
>
>> I wonder if schedstats can show why we context switch (need to let
>> someone else run, yielded, waiting on io, etc).
>>
>
> Yes, there is a scheduler tracer, though I have no idea how to operate it.
>
> Do you have kvm_stat logs?

In case the kvm_stat logs don't shed enough light, this should help.

Documentation/trace/ftrace.txt:

sched_switch
------------

This tracer simply records schedule switches. Here is an example
of how to use it.

 # echo sched_switch > /debug/tracing/current_tracer
 # echo 1 > /debug/tracing/tracing_enabled
 # sleep 1
 # echo 0 > /debug/tracing/tracing_enabled
 # cat /debug/tracing/trace
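
A rough way to see which tasks account for most of the switches in that
trace (a sketch only -- field positions depend on the kernel's trace
format, but on these kernels the last field of a "==>" line should be the
pid:prio:state of the task being switched to):

 # grep '==>' /debug/tracing/trace | awk '{print $NF}' | sort | uniq -c | sort -rn | head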





* Re: KVM performance vs. Xen
  2009-04-30 15:19           ` Avi Kivity
  2009-04-30 15:59             ` Anthony Liguori
@ 2009-05-01  0:40             ` Andrew Theurer
  2009-05-03 16:20               ` Avi Kivity
  1 sibling, 1 reply; 21+ messages in thread
From: Andrew Theurer @ 2009-05-01  0:40 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Anthony Liguori, kvm-devel

Here are the SMT-off results.  This workload is designed not to 
over-saturate the CPU, so you have to pick a number of server sets to 
ensure that.  With SMT on, 4 sets was enough for KVM, but 5 was too 
much (we start seeing response time errors).  For SMT off, I tried to 
size the load as high as we can go without running into these errors: 
for KVM that's 3 sets (18 guests) and for Xen that's 4 sets (24 
guests).  Throughput has a fairly linear relationship to the number of 
server sets used, with a bit of wiggle room (mostly affected by 
response times getting longer and longer, but not exceeding the 
requirement set forth).  Anyway, the relative throughputs are "1.0" 
for KVM and "1.34" for Xen, with CPU utilization at 78.71% for KVM 
and 87.83% for Xen.

If we normalize to CPU utilization ((1.34 / 87.83) / (1.0 / 78.71) ≈ 1.20), Xen is doing about 20% more throughput per unit of CPU.

Avi Kivity wrote:
> Anthony Liguori wrote:
>>
>> Previously, the block API only exposed non-vector interfaces and 
>> bounced vectored operations to a linear buffer.  That's been 
>> eliminated now though so we need to update the linux-aio patch to 
>> implement a vectored backend interface.
>>
>> However, it is an apples to apples comparison in terms of copying 
>> since the same is true with the thread pool.  My take away was that 
>> the thread pool overhead isn't the major source of issues.
>
> If the overhead is dominated by copying, then you won't see the 
> difference.  Once the copying is eliminated, the comparison may yield 
> different results.  We should certainly see a difference in context 
> switches.
I would like to test this the proper way.  What do I need to do to 
ensure these copies are eliminated?  I am on a 2.6.27 kernel, am I 
missing anything there?  Anthony, would you be willing to provide a 
patch to support the changes in the block API?
>
> One cause of context switches won't be eliminated - the non-saturating 
> workload causes us to switch to the idle thread, which incurs a 
> heavyweight exit.  This doesn't matter since we're idle anyway, but 
> when we switch back, we incur a heavyweight entry.
I have not looked at the schedstat or ftrace yet, but will soon.  Maybe 
it will tell us a little more about the context switches.

Here's a sample of the kvm_stat:

 efer_relo      exits  fpu_reloa  halt_exit  halt_wake  host_stat  hypercall  insn_emul  insn_emul     invlpg   io_exits  irq_exits  irq_injec  irq_windo  kvm_reque  largepage  mmio_exit  mmu_cache  mmu_flood  mmu_pde_z  mmu_pte_u  mmu_pte_w  mmu_recyc  mmu_shado  mmu_unsyn  mmu_unsyn  nmi_injec  nmi_windo   pf_fixed   pf_guest  remote_tl  request_n  signal_ex  tlb_flush
         0     233866      53994      20353      16209     119812          0      48879          0          0      75666      44917      34772       3984          0        187          0         10          0          0          0          0          0          0          0          0          0          0        202          0          0          0          0      17698
         0     244556      67321      15570      12364     116226          0      49865          0          0      69357      56131      32860       4449          0      -1895          0         19          0          0          0          0         21         21          0          0          0          0       1117          0          0          0          0      21586
         0     230788      71382      10619       7920     109151          0      44354          0          0      62561      60074      28322       4841          0        103          0         13          0          0          0          0          0          0          0          0          0          0        122          0          0          0          0      22702
         0     275259      82605      14326      11148     127293          0      53738          0          0      73438      70707      34724       5373          0        859          0         15          0          0          0          0         21         21          0          0          0          0        874          0          0          0          0      26723
         0     250576      58760      20368      16476     128296          0      50936          0          0      80439      51219      36329       4621          0      -1170          0          8          0          0          0          0         22         22          0          0          0          0       1333          0          0          0          0      18508
         0     244746      59650      19480      15657     122721          0      49882          0          0      76011      50453      35352       4523          0        201          0         11          0          0          0          0         21         21          0          0          0          0        212          0          0          0          0      19163
         0     251724      71715      14049      10920     117255          0      49924          0          0      70173      58040      32328       5058          0         94          0          7          0          0          0          0          0          0          0          0          0          0        105          0          0          0          0      25405
         0     247873      75212      12397       9465     117299          0      47402          0          0      68435      62901      30999       5289          0         36          0          9          0          0          0          0          0          0          0          0          0          0         47          0          0          0          0      24400
         0     259105      79515      14060      10713     121489          0      50106          0          0      71847      62392      33165       4802          0        358          0         17          0          0          0          0          0          0          0          0          0          0        375          0          0          0          0      26420
         0     255283      74818      13847      10642     120632          0      48832          0          0      70851      65453      32520       5032          0        752          0          6          0          0          0          0          0          0          0          0          0          0        759          0          0          0          0      23764
         0     268411      78048      15231      11707     123642          0      52845          0          0      74031      64919      34404       4765          0        639          0         15          0          0          0          0          0          0          0          0          0          0        653          0          0          0          0      25992
         0     247064      73794      12554       9522     115026          0      47878          0          0      66357      64359      30727       4884          0         97          0          8          0          0          0          0          0          0          0          0          0          0        107          0          0          0          0      23545
         0     259641      79179      11953       9247     117090          0      49836          0          0      68561      67053      31171       5435          0      -2759          0         11          0          0          0          0         21         21          0          0          0          0        245          0          0          0          0      26858
         0     258109      77455      13997      10732     121578          0      50559          0          0      71833      63841      33404       4980          0        484          0         14          0          0          0          0         21         21          0          0          0          0        495          0          0          0          0      24509
         0     250245      74357      13611      10459     117791          0      49733          0          0      68471      65089      31943       4797          0        581          0         13          0          0          0          0          0          0          0          0          0          0        596          0          0          0          0      22517
         0     262114      77257      13614      10499     121082          0      50683          0          0      71242      67844      33234       5031          0       1125          0          8          0          0          0          0          0          0          0          0          0          0       1133          0          0          0          0      24370
         0     254914      75937      12784       9809     116020          0      50562          0          0      67452      62673      31249       4903          0        786          0         19          0          0          0          0          0          0          0          0          0          0        806          0          0          0          0      25931
         0     249421      75642      12704       9805     116039          0      48426          0          0      66972      62276      31068       4999          0      -1817          0          6          0          0          0          0         21         21          0          0          0          0        187          0          0          0          0      25169
         0     274205      79561      13992      10844     126452          0      53165          0          0      74522      68844      34131       5529          0        123          0         20          0          0          0          0         42         42          0          0          0          0        152          0          0          0          0      26633
         0     267310      77262      15092      11705     125139          0      52891          0          0      74651      64647      34938       5018          0        195          0         18          0          0          0          0          0          0          0          0          0          0        213          0          0          0          0      25161
 

-Andrew



* Re: KVM performance vs. Xen
  2009-05-01  0:40             ` Andrew Theurer
@ 2009-05-03 16:20               ` Avi Kivity
  0 siblings, 0 replies; 21+ messages in thread
From: Avi Kivity @ 2009-05-03 16:20 UTC (permalink / raw)
  To: Andrew Theurer; +Cc: Anthony Liguori, kvm-devel

Andrew Theurer wrote:
>>
>> If the overhead is dominated by copying, then you won't see the 
>> difference.  Once the copying is eliminated, the comparison may yield 
>> different results.  We should certainly see a difference in context 
>> switches.
> I would like to test this the proper way.  What do I need to do to 
> ensure these copies are eliminated?  I am on a 2.6.27 kernel, am I 
> missing anything there?  Anthony, would you be willing to provide a 
> patch to support the changes in the block API?

You need a 2.6.30 host kernel plus a libc patch.  Or the linux-aio qemu 
patch.

>>
>> One cause of context switches won't be eliminated - the 
>> non-saturating workload causes us to switch to the idle thread, which 
>> incurs a heavyweight exit.  This doesn't matter since we're idle 
>> anyway, but when we switch back, we incur a heavyweight entry.
> I have not looked at the schedstat or ftrace yet, but will soon.  
> Maybe it will tell us a little more about the context switches.
>
> Here's a sample of the kvm_stat:

We have about 120K host_state_reloads/sec, 70K pio/sec, and 35K 
interrupts/sec.

That corresponds to 35K virtio notifications/sec (reasonable for 8 
cores), and 85K excess context switches/sec.  These can probably be 
eliminated by using linux-aio, except those due to idling.


-- 
error compiling committee.c: too many arguments to function



Thread overview: 21+ messages
2009-04-29 14:41 KVM performance vs. Xen Andrew Theurer
2009-04-29 15:20 ` Nakajima, Jun
2009-04-29 15:33   ` Andrew Theurer
2009-04-30  8:56 ` Avi Kivity
2009-04-30 12:49   ` Andrew Theurer
2009-04-30 13:02     ` Avi Kivity
2009-04-30 13:44       ` Andrew Theurer
2009-04-30 13:47         ` Anthony Liguori
2009-04-30 13:52         ` Avi Kivity
2009-04-30 13:45   ` Anthony Liguori
2009-04-30 13:53     ` Avi Kivity
2009-04-30 15:08       ` Anthony Liguori
2009-04-30 13:59     ` Avi Kivity
2009-04-30 14:04       ` Andrew Theurer
2009-04-30 15:11         ` Anthony Liguori
2009-04-30 15:19           ` Avi Kivity
2009-04-30 15:59             ` Anthony Liguori
2009-05-01  0:40             ` Andrew Theurer
2009-05-03 16:20               ` Avi Kivity
2009-04-30 15:09       ` Anthony Liguori
2009-04-30 16:41   ` Marcelo Tosatti
