* virtio-blk performance regression and qemu-kvm
@ 2012-02-10 14:36 ` Dongsu Park
  0 siblings, 0 replies; 52+ messages in thread
From: Dongsu Park @ 2012-02-10 14:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm

Hi,

Recently I observed a performance regression in virtio-blk, specifically
different I/O bandwidths between qemu-kvm 0.14.1 and 1.0.
I want to share the benchmark results and ask what the reason might be.

1. Test condition

 - On host, ramdisk-backed block device (/dev/ram0)
 - qemu-kvm is configured with virtio-blk driver for /dev/ram0,
   which is detected as /dev/vdb inside the guest VM.
 - Host System: Ubuntu 11.10 / Kernel 3.2
 - Guest System: Debian 6.0 / Kernel 3.0.6
 - Host I/O scheduler : deadline
 - testing tool : fio

2. Raw performance on the host

 I tested I/O with fio on /dev/ram0 on the host:

 - Sequential read (on the host)
  # fio -name iops -rw=read -size=1G -iodepth 1 \
   -filename /dev/ram0 -ioengine libaio -direct=1 -bs=4096

 - Sequential write (on the host)
  # fio -name iops -rw=write -size=1G -iodepth 1 \
   -filename /dev/ram0 -ioengine libaio -direct=1 -bs=4096

 Result:

  read   1691,6 MByte/s
  write   898,9 MByte/s

 Unsurprisingly, it's extremely fast.

3. Comparison with different qemu-kvm versions

 Now I'm running benchmarks with both qemu-kvm 0.14.1 and 1.0.

 - Sequential read (Running inside guest)
   # fio -name iops -rw=read -size=1G -iodepth 1 \
    -filename /dev/vdb -ioengine libaio -direct=1 -bs=4096

 - Sequential write (Running inside guest)
   # fio -name iops -rw=write -size=1G -iodepth 1 \
    -filename /dev/vdb -ioengine libaio -direct=1 -bs=4096

 I ran each test 3 times and took the average.
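
 For reference, a run can be scripted roughly like below (just a sketch;
 the bw= values reported by fio were averaged by hand afterwards):

  for i in 1 2 3; do
      fio -name iops -rw=read -size=1G -iodepth 1 \
       -filename /dev/vdb -ioengine libaio -direct=1 -bs=4096
  done
  # (similarly for -rw=write)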

 Result:

  seqread with qemu-kvm 0.14.1   67,0 MByte/s
  seqread with qemu-kvm 1.0      30,9 MByte/s

  seqwrite with qemu-kvm 0.14.1  65,8 MByte/s
  seqwrite with qemu-kvm 1.0     30,5 MByte/s

 So the newest stable version of qemu-kvm achieves only about half the
 bandwidth of the older version 0.14.1.

The question is: why is it so much slower?
How can we improve the performance, other than downgrading to 0.14.1?

I know there have already been several discussions on this issue,
for example the benchmark and trace work on virtio-blk latency [1],
or the in-kernel accelerator "vhost-blk" [2].
I'm going to continue testing with those as well.
But does anyone have a better idea or know of recent updates?

Regards,
Dongsu

[1] http://www.linux-kvm.org/page/Virtio/Block/Latency
[2] http://thread.gmane.org/gmane.comp.emulators.kvm.devel/76893


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-02-10 14:36 ` [Qemu-devel] " Dongsu Park
@ 2012-02-12 23:55   ` Rusty Russell
  -1 siblings, 0 replies; 52+ messages in thread
From: Rusty Russell @ 2012-02-12 23:55 UTC (permalink / raw)
  To: Dongsu Park, qemu-devel; +Cc: kvm

On Fri, 10 Feb 2012 15:36:39 +0100, Dongsu Park <dongsu.park@profitbricks.com> wrote:
> Hi,
> 
> Recently I observed performance regression regarding virtio-blk,
> especially different IO bandwidths between qemu-kvm 0.14.1 and 1.0.
> So I want to share the benchmark results, and ask you what the reason
> would be.

Interesting.  There are two obvious possibilities here.  One is that
qemu has regressed, the other is that virtio_blk has regressed; the new
qemu may negotiate new features.  Please do the following in the guest
with old and new qemus:

cat /sys/class/block/vdb/device/features

(eg, here that gives: 0010101101100000000000000000100e0).
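
A quick way to list which bits are set (a bash sketch; it assumes each
character position in that sysfs file corresponds to one feature bit):

  f=$(cat /sys/class/block/vdb/device/features)
  for i in $(seq 0 $((${#f} - 1))); do
      [ "${f:$i:1}" = "1" ] && echo "feature bit $i is set"
  done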

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-02-10 14:36 ` [Qemu-devel] " Dongsu Park
@ 2012-02-13 11:57   ` Stefan Hajnoczi
  -1 siblings, 0 replies; 52+ messages in thread
From: Stefan Hajnoczi @ 2012-02-13 11:57 UTC (permalink / raw)
  To: Dongsu Park; +Cc: qemu-devel, kvm

On Fri, Feb 10, 2012 at 2:36 PM, Dongsu Park
<dongsu.park@profitbricks.com> wrote:
>  Now I'm running benchmarks with both qemu-kvm 0.14.1 and 1.0.
>
>  - Sequential read (Running inside guest)
>   # fio -name iops -rw=read -size=1G -iodepth 1 \
>    -filename /dev/vdb -ioengine libaio -direct=1 -bs=4096
>
>  - Sequential write (Running inside guest)
>   # fio -name iops -rw=write -size=1G -iodepth 1 \
>    -filename /dev/vdb -ioengine libaio -direct=1 -bs=4096
>
>  For each one, I tested 3 times to get the average.
>
>  Result:
>
>  seqread with qemu-kvm 0.14.1   67,0 MByte/s
>  seqread with qemu-kvm 1.0      30,9 MByte/s
>
>  seqwrite with qemu-kvm 0.14.1  65,8 MByte/s
>  seqwrite with qemu-kvm 1.0     30,5 MByte/s

Please retry with the following commit or simply qemu-kvm.git/master.
Avi discovered a performance regression which was introduced when the
block layer was converted to use coroutines:

$ git describe 39a7a362e16bb27e98738d63f24d1ab5811e26a8
v1.0-327-g39a7a36

(This commit is not in 1.0!)

Please post your qemu-kvm command-line.

67 MB/s sequential 4 KB read means 67 * 1024 / 4 = 17152 requests per
second, so 58 microseconds per request.
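
A quick sanity check of that arithmetic (shell sketch):

  echo $((67 * 1024 / 4))     # 4 KB requests per second at 67 MB/s -> 17152
  echo $((1000000 / 17152))   # approximate microseconds per request -> 58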

Please post the fio output so we can double-check what is reported.

Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-02-13 11:57   ` [Qemu-devel] " Stefan Hajnoczi
@ 2012-02-21 15:57     ` Dongsu Park
  -1 siblings, 0 replies; 52+ messages in thread
From: Dongsu Park @ 2012-02-21 15:57 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, kvm

Hi Stefan,
see below.

On 13.02.2012 11:57, Stefan Hajnoczi wrote:
> On Fri, Feb 10, 2012 at 2:36 PM, Dongsu Park
> <dongsu.park@profitbricks.com> wrote:
> >  Now I'm running benchmarks with both qemu-kvm 0.14.1 and 1.0.
> >
> >  - Sequential read (Running inside guest)
> >   # fio -name iops -rw=read -size=1G -iodepth 1 \
> >    -filename /dev/vdb -ioengine libaio -direct=1 -bs=4096
> >
> >  - Sequential write (Running inside guest)
> >   # fio -name iops -rw=write -size=1G -iodepth 1 \
> >    -filename /dev/vdb -ioengine libaio -direct=1 -bs=4096
> >
> >  For each one, I tested 3 times to get the average.
> >
> >  Result:
> >
> >  seqread with qemu-kvm 0.14.1   67,0 MByte/s
> >  seqread with qemu-kvm 1.0      30,9 MByte/s
> >
> >  seqwrite with qemu-kvm 0.14.1  65,8 MByte/s
> >  seqwrite with qemu-kvm 1.0     30,5 MByte/s
> 
> Please retry with the following commit or simply qemu-kvm.git/master.
> Avi discovered a performance regression which was introduced when the
> block layer was converted to use coroutines:
> 
> $ git describe 39a7a362e16bb27e98738d63f24d1ab5811e26a8
> v1.0-327-g39a7a36
> 
> (This commit is not in 1.0!)
> 
> Please post your qemu-kvm command-line.
> 
> 67 MB/s sequential 4 KB read means 67 * 1024 / 4 = 17152 requests per
> second, so 58 microseconds per request.
> 
> Please post the fio output so we can double-check what is reported.

As you mentioned above, I tested it again with the revision
v1.0-327-g39a7a36, which includes the commit 39a7a36.

The result, though, is still not good enough.

seqread   : 20.3 MByte/s
seqwrite  : 20.1 MByte/s
randread  : 20.5 MByte/s
randwrite : 20.0 MByte/s

My qemu-kvm command line is as follows:

=======================================================================
/usr/bin/kvm -S -M pc-0.14 -enable-kvm -m 1024 \
-smp 1,sockets=1,cores=1,threads=1 -name mydebian3_8gb \
-uuid d99ad012-2fcc-6f7e-fbb9-bc48b424a258 -nodefconfig -nodefaults \
-chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/mydebian3_8gb.monitor,server,nowait \
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown \
-drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw \
-device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 \
-drive file=/var/lib/libvirt/images/mydebian3_8gb.img,if=none,id=drive-virtio-disk0,format=raw,cache=none,aio=native \
-device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-drive file=/dev/ram0,if=none,id=drive-virtio-disk1,format=raw,cache=none,aio=native \
-device virtio-blk-pci,bus=pci.0,addr=0x7,drive=drive-virtio-disk1,id=virtio-disk1 \
-netdev tap,fd=19,id=hostnet0 \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:68:9f:d0,bus=pci.0,addr=0x3 \
-chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 \
-usb -device usb-tablet,id=input0 -vnc 127.0.0.1:0 -vga cirrus \
-device AC97,id=sound0,bus=pci.0,addr=0x4 \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
=======================================================================

As you see, /dev/ram0 is being mapped to /dev/vdb on the guest side,
which is used for fio tests.
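
To double-check that mapping, the device size inside the guest can be
compared against the host ramdisk (sketch):

  blockdev --getsize64 /dev/vdb   # should match the size of /dev/ram0 on the host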

Here is a sample of fio output:

=======================================================================
# fio -name iops -rw=read -size=1G -iodepth 1 -filename /dev/vdb \
-ioengine libaio -direct=1 -bs=4096
iops: (g=0): rw=read, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=1
Starting 1 process
Jobs: 1 (f=1): [R] [100.0% done] [21056K/0K /s] [5140/0 iops] [eta
00m:00s]
iops: (groupid=0, jobs=1): err= 0: pid=1588
  read : io=1024MB, bw=20101KB/s, iops=5025, runt= 52166msec
    slat (usec): min=4, max=6461, avg=24.00, stdev=19.75
    clat (usec): min=0, max=11934, avg=169.49, stdev=113.91
    bw (KB/s) : min=18200, max=23048, per=100.03%, avg=20106.31, stdev=934.42
  cpu          : usr=5.43%, sys=23.25%, ctx=262363, majf=0, minf=28
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w: total=262144/0, short=0/0
     lat (usec): 2=0.01%, 4=0.16%, 10=0.03%, 20=0.01%, 50=0.27%
     lat (usec): 100=4.07%, 250=89.12%, 500=5.76%, 750=0.30%, 1000=0.13%
     lat (msec): 2=0.12%, 4=0.02%, 10=0.01%, 20=0.01%

Run status group 0 (all jobs):
   READ: io=1024MB, aggrb=20100KB/s, minb=20583KB/s, maxb=20583KB/s,
mint=52166msec, maxt=52166msec

Disk stats (read/write):
  vdb: ios=261308/0, merge=0/0, ticks=40210/0, in_queue=40110, util=77.14%
=======================================================================


So I think the coroutine-ucontext patch doesn't address the bottleneck
I'm looking for.

Regards,
Dongsu

p.s. Sorry for the late reply. Last week I was on vacation.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-02-12 23:55   ` [Qemu-devel] " Rusty Russell
@ 2012-02-21 16:45     ` Dongsu Park
  -1 siblings, 0 replies; 52+ messages in thread
From: Dongsu Park @ 2012-02-21 16:45 UTC (permalink / raw)
  To: Rusty Russell; +Cc: qemu-devel, kvm

Hi Rusty,

On 13.02.2012 10:25, Rusty Russell wrote:
> On Fri, 10 Feb 2012 15:36:39 +0100, Dongsu Park <dongsu.park@profitbricks.com> wrote:
> > Hi,
> > 
> > Recently I observed performance regression regarding virtio-blk,
> > especially different IO bandwidths between qemu-kvm 0.14.1 and 1.0.
> > So I want to share the benchmark results, and ask you what the reason
> > would be.
> 
> Interesting.  There are two obvious possibilities here.  One is that
> qemu has regressed, the other is that virtio_blk has regressed; the new
> qemu may negotiate new features.  Please do the following in the guest
> with old and new qemus:
> 
> cat /sys/class/block/vdb/device/features
> 
> (eg, here that gives: 0010101101100000000000000000100e0).

I did that on guest VM, using both qemu-kvm 0.14.1 and 1.0.
(cat /sys/class/block/vdb/device/features)

using qemu-kvm 0.14.1:

0010101101100000000000000000100000000000000000000000000000000000

using qemu-kvm 1.0:

0010101101100000000000000000110000000000000000000000000000000000

From my understanding, both of them have the same virtio features.
Please correct me if I'm wrong.

Regards,
Dongsu

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-02-21 15:57     ` [Qemu-devel] " Dongsu Park
@ 2012-02-21 17:27       ` Stefan Hajnoczi
  -1 siblings, 0 replies; 52+ messages in thread
From: Stefan Hajnoczi @ 2012-02-21 17:27 UTC (permalink / raw)
  To: Dongsu Park; +Cc: qemu-devel, kvm

On Tue, Feb 21, 2012 at 3:57 PM, Dongsu Park
<dongsu.park@profitbricks.com> wrote:
> On 13.02.2012 11:57, Stefan Hajnoczi wrote:
>> On Fri, Feb 10, 2012 at 2:36 PM, Dongsu Park
>> <dongsu.park@profitbricks.com> wrote:
>> >  Now I'm running benchmarks with both qemu-kvm 0.14.1 and 1.0.
>> >
>> >  - Sequential read (Running inside guest)
>> >   # fio -name iops -rw=read -size=1G -iodepth 1 \
>> >    -filename /dev/vdb -ioengine libaio -direct=1 -bs=4096
>> >
>> >  - Sequential write (Running inside guest)
>> >   # fio -name iops -rw=write -size=1G -iodepth 1 \
>> >    -filename /dev/vdb -ioengine libaio -direct=1 -bs=4096
>> >
>> >  For each one, I tested 3 times to get the average.
>> >
>> >  Result:
>> >
>> >  seqread with qemu-kvm 0.14.1   67,0 MByte/s
>> >  seqread with qemu-kvm 1.0      30,9 MByte/s
>> >
>> >  seqwrite with qemu-kvm 0.14.1  65,8 MByte/s
>> >  seqwrite with qemu-kvm 1.0     30,5 MByte/s
>>
>> Please retry with the following commit or simply qemu-kvm.git/master.
>> Avi discovered a performance regression which was introduced when the
>> block layer was converted to use coroutines:
>>
>> $ git describe 39a7a362e16bb27e98738d63f24d1ab5811e26a8
>> v1.0-327-g39a7a36
>>
>> (This commit is not in 1.0!)
>>
>> Please post your qemu-kvm command-line.
>>
>> 67 MB/s sequential 4 KB read means 67 * 1024 / 4 = 17152 requests per
>> second, so 58 microseconds per request.
>>
>> Please post the fio output so we can double-check what is reported.
>
> As you mentioned above, I tested it again with the revision
> v1.0-327-g39a7a36, which includes the commit 39a7a36.
>
> Result is though still not good enough.
>
> seqread   : 20.3 MByte/s
> seqwrite  : 20.1 MByte/s
> randread  : 20.5 MByte/s
> randwrite : 20.0 MByte/s
>
> My qemu-kvm commandline is like below:
>
> =======================================================================
> /usr/bin/kvm -S -M pc-0.14 -enable-kvm -m 1024 \
> -smp 1,sockets=1,cores=1,threads=1 -name mydebian3_8gb \
> -uuid d99ad012-2fcc-6f7e-fbb9-bc48b424a258 -nodefconfig -nodefaults \
> -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/mydebian3_8gb.monitor,server,nowait \
> -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown \
> -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw \
> -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 \
> -drive file=/var/lib/libvirt/images/mydebian3_8gb.img,if=none,id=drive-virtio-disk0,format=raw,cache=none,aio=native \
> -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
> -drive file=/dev/ram0,if=none,id=drive-virtio-disk1,format=raw,cache=none,aio=native \

I'm not sure if O_DIRECT and Linux AIO to /dev/ram0 is a good idea.
At least with tmpfs O_DIRECT does not even work - which kind of makes
sense there because tmpfs lives in the page cache.  My point here is
that ramdisk does not follow the same rules or have the same
performance characteristics as real disks do.  It's something to be
careful about.  Did you run this test because you noticed a real-world
regression?
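
For what it's worth, the tmpfs case can be demonstrated with a one-liner
(sketch, assuming /dev/shm is tmpfs; this is expected to fail with EINVAL):

  dd if=/dev/zero of=/dev/shm/o_direct_test bs=4096 count=1 oflag=direct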

> Here is a sample of fio output:
>
> =======================================================================
> # fio -name iops -rw=read -size=1G -iodepth 1 -filename /dev/vdb \
> -ioengine libaio -direct=1 -bs=4096
> iops: (g=0): rw=read, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=1
> Starting 1 process
> Jobs: 1 (f=1): [R] [100.0% done] [21056K/0K /s] [5140/0 iops] [eta
> 00m:00s]
> iops: (groupid=0, jobs=1): err= 0: pid=1588
>  read : io=1024MB, bw=20101KB/s, iops=5025, runt= 52166msec
>    slat (usec): min=4, max=6461, avg=24.00, stdev=19.75
>    clat (usec): min=0, max=11934, avg=169.49, stdev=113.91
>    bw (KB/s) : min=18200, max=23048, per=100.03%, avg=20106.31, stdev=934.42
>  cpu          : usr=5.43%, sys=23.25%, ctx=262363, majf=0, minf=28
>  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>     issued r/w: total=262144/0, short=0/0
>     lat (usec): 2=0.01%, 4=0.16%, 10=0.03%, 20=0.01%, 50=0.27%
>     lat (usec): 100=4.07%, 250=89.12%, 500=5.76%, 750=0.30%, 1000=0.13%
>     lat (msec): 2=0.12%, 4=0.02%, 10=0.01%, 20=0.01%
>
> Run status group 0 (all jobs):
>   READ: io=1024MB, aggrb=20100KB/s, minb=20583KB/s, maxb=20583KB/s,
> mint=52166msec, maxt=52166msec
>
> Disk stats (read/write):
>  vdb: ios=261308/0, merge=0/0, ticks=40210/0, in_queue=40110, util=77.14%
> =======================================================================
>
>
> So I think, the patch for coroutine-ucontext isn't about the bottleneck
> I'm looking for.

Try turning ioeventfd off for the virtio-blk device:

-device virtio-blk-pci,ioeventfd=off,...

You might see better performance since ramdisk I/O should be very
low-latency.  The overhead of using ioeventfd might not make it
worthwhile.  The ioeventfd feature was added post-0.14 IIRC.  Normally
it helps avoid stealing vcpu time and also causing lock contention
inside the guest - but if host I/O latency is extremely low it might
be faster to issue I/O from the vcpu thread.
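
Applied to the command line you posted, that would look roughly like
this (sketch):

  -device virtio-blk-pci,ioeventfd=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk1,id=virtio-disk1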

Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-02-21 16:45     ` [Qemu-devel] " Dongsu Park
@ 2012-02-21 22:16       ` Rusty Russell
  -1 siblings, 0 replies; 52+ messages in thread
From: Rusty Russell @ 2012-02-21 22:16 UTC (permalink / raw)
  To: Dongsu Park; +Cc: qemu-devel, kvm

On Tue, 21 Feb 2012 17:45:08 +0100, Dongsu Park <dongsu.park@profitbricks.com> wrote:
> Hi Rusty,
> 
> On 13.02.2012 10:25, Rusty Russell wrote:
> > On Fri, 10 Feb 2012 15:36:39 +0100, Dongsu Park <dongsu.park@profitbricks.com> wrote:
> > > Hi,
> > > 
> > > Recently I observed performance regression regarding virtio-blk,
> > > especially different IO bandwidths between qemu-kvm 0.14.1 and 1.0.
> > > So I want to share the benchmark results, and ask you what the reason
> > > would be.
> > 
> > Interesting.  There are two obvious possibilities here.  One is that
> > qemu has regressed, the other is that virtio_blk has regressed; the new
> > qemu may negotiate new features.  Please do the following in the guest
> > with old and new qemus:
> > 
> > cat /sys/class/block/vdb/device/features
> > 
> > (eg, here that gives: 0010101101100000000000000000100e0).
> 
> I did that on guest VM, using both qemu-kvm 0.14.1 and 1.0.
> (cat /sys/class/block/vdb/device/features)
> 
> using qemu-kvm 0.14.1:
> 
> 0010101101100000000000000000100000000000000000000000000000000000
> 
> using qemu-kvm 1.0:
> 
> 0010101101100000000000000000110000000000000000000000000000000000
> 
> From my understanding, both of them have the same virtio features.
> Please correct me if I'm wrong.

Well, 1.0 supports event index (feature 29), but that's the only
difference.

This seems very much like a qemu regression.
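
If you want to rule the event index in or out directly, I believe qemu's
virtio PCI devices expose an event_idx property (property name from
memory, so treat this as a sketch):

  -device virtio-blk-pci,event_idx=off,...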

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-02-21 17:27       ` [Qemu-devel] " Stefan Hajnoczi
@ 2012-02-22 16:48         ` Dongsu Park
  -1 siblings, 0 replies; 52+ messages in thread
From: Dongsu Park @ 2012-02-22 16:48 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, kvm

Hi Stefan,

see below.

On 21.02.2012 17:27, Stefan Hajnoczi wrote:
> On Tue, Feb 21, 2012 at 3:57 PM, Dongsu Park
> <dongsu.park@profitbricks.com> wrote:
...<snip>...
> I'm not sure if O_DIRECT and Linux AIO to /dev/ram0 is a good idea.
> At least with tmpfs O_DIRECT does not even work - which kind of makes
> sense there because tmpfs lives in the page cache.  My point here is
> that ramdisk does not follow the same rules or have the same
> performance characteristics as real disks do.  It's something to be
> careful about.  Did you run this test because you noticed a real-world
> regression?

That's a good point.
I agree with you. /dev/ram0 isn't a good choice in this case.
Of course I noticed real-world regressions, but not with /dev/ram0.

Therefore I tested again with a block device backed by a raw file image.
The result, however, was nearly the same: a regression since 0.15.
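
Such a raw-file setup could look roughly like this (sketch, with an
illustrative path and size):

qemu-img create -f raw /var/lib/libvirt/images/testdisk.raw 8G
... -drive file=/var/lib/libvirt/images/testdisk.raw,if=none,id=drive-virtio-disk1,format=raw,cache=none,aio=native \
    -device virtio-blk-pci,bus=pci.0,addr=0x7,drive=drive-virtio-disk1,id=virtio-disk1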

...<snip>...
> Try turning ioeventfd off for the virtio-blk device:
> 
> -device virtio-blk-pci,ioeventfd=off,...
> 
> You might see better performance since ramdisk I/O should be very
> low-latency.  The overhead of using ioeventfd might not make it
> worthwhile.  The ioeventfd feature was added post-0.14 IIRC.  Normally
> it helps avoid stealing vcpu time and also causing lock contention
> inside the guest - but if host I/O latency is extremely low it might
> be faster to issue I/O from the vcpu thread.

Thanks for the tip. I tried that too, but no success.

However, today I observed an interesting phenomenon.
On the qemu-kvm command line, if I set -smp maxcpus to 32,
R/W bandwidth gets boosted up to 100 MBps.

# /usr/bin/kvm ...
 -smp 2,cores=1,maxcpus=32,threads=1 -numa mynode,mem=32G,nodeid=mynodeid

That looks weird, because my test machine has only 4 physical CPUs.
But setting maxcpus=4 brings only poor performance (< 30 MBps).

Additionally, performance seems to decrease if more vCPUs are pinned.
In libvirt xml, for example, "<vcpu cpuset='0-1'>2</vcpu>" causes
performance degradation, but "<vcpu cpuset='1'>2</vcpu>" is ok.
That doesn't look reasonable either.

Cheers,
Dongsu

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-02-22 16:48         ` [Qemu-devel] " Dongsu Park
@ 2012-02-22 19:53           ` Stefan Hajnoczi
  -1 siblings, 0 replies; 52+ messages in thread
From: Stefan Hajnoczi @ 2012-02-22 19:53 UTC (permalink / raw)
  To: Dongsu Park; +Cc: qemu-devel, kvm

On Wed, Feb 22, 2012 at 4:48 PM, Dongsu Park
<dongsu.park@profitbricks.com> wrote:
>> Try turning ioeventfd off for the virtio-blk device:
>>
>> -device virtio-blk-pci,ioeventfd=off,...
>>
>> You might see better performance since ramdisk I/O should be very
>> low-latency.  The overhead of using ioeventfd might not make it
>> worthwhile.  The ioeventfd feature was added post-0.14 IIRC.  Normally
>> it helps avoid stealing vcpu time and also causing lock contention
>> inside the guest - but if host I/O latency is extremely low it might
>> be faster to issue I/O from the vcpu thread.
>
> Thanks for the tip. I tried that too, but no success.

My guesses have all been wrong.  Maybe it's time to git bisect this instead :).
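
A bisect between the two releases could look roughly like this (sketch;
build and benchmark steps abbreviated):

  git bisect start
  git bisect bad  <commit or tag of the slow 1.0 build>
  git bisect good <commit or tag of the fast 0.14.1 build>
  # at each step: build, boot the guest, rerun the fio job, then mark the
  # revision with "git bisect good" or "git bisect bad" until git reports
  # the first bad commit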

Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-02-22 19:53           ` [Qemu-devel] " Stefan Hajnoczi
@ 2012-02-28 16:39             ` Martin Mailand
  -1 siblings, 0 replies; 52+ messages in thread
From: Martin Mailand @ 2012-02-28 16:39 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Dongsu Park, qemu-devel, kvm

Hi,
I could reproduce it and I bisected it down to this commit.

12d4536f7d911b6d87a766ad7300482ea663cea2 is the first bad commit
commit 12d4536f7d911b6d87a766ad7300482ea663cea2
Author: Anthony Liguori <aliguori@us.ibm.com>
Date:   Mon Aug 22 08:24:58 2011 -0500

    main: force enabling of I/O thread

-martin


On 22.02.2012 20:53, Stefan Hajnoczi wrote:
> On Wed, Feb 22, 2012 at 4:48 PM, Dongsu Park
> <dongsu.park@profitbricks.com>  wrote:
>>> Try turning ioeventfd off for the virtio-blk device:
>>>
>>> -device virtio-blk-pci,ioeventfd=off,...
>>>
>>> You might see better performance since ramdisk I/O should be very
>>> low-latency.  The overhead of using ioeventfd might not make it
>>> worthwhile.  The ioeventfd feature was added post-0.14 IIRC.  Normally
>>> it helps avoid stealing vcpu time and also causing lock contention
>>> inside the guest - but if host I/O latency is extremely low it might
>>> be faster to issue I/O from the vcpu thread.
>> Thanks for the tip. I tried that too, but no success.
> My guesses have all been wrong.  Maybe it's time to git bisect this instead :).
>
> Stefan
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-02-28 16:39             ` [Qemu-devel] " Martin Mailand
@ 2012-02-28 17:05               ` Stefan Hajnoczi
  -1 siblings, 0 replies; 52+ messages in thread
From: Stefan Hajnoczi @ 2012-02-28 17:05 UTC (permalink / raw)
  To: martin; +Cc: Dongsu Park, qemu-devel, kvm

On Tue, Feb 28, 2012 at 4:39 PM, Martin Mailand <martin@tuxadero.com> wrote:
> I could reproduce it and I bisected it down to this commit.
>
> 12d4536f7d911b6d87a766ad7300482ea663cea2 is the first bad commit
> commit 12d4536f7d911b6d87a766ad7300482ea663cea2
> Author: Anthony Liguori <aliguori@us.ibm.com>
> Date:   Mon Aug 22 08:24:58 2011 -0500

This seems strange to me.

What commit 12d4536f7 did was to switch to a threading model in
*qemu.git* that is similar to what *qemu-kvm.git* has been doing all
along.

That means the qemu-kvm binaries already use the iothread model.  The
only explanation I have is that your bisect went down a qemu.git path
and you therefore tripped over this - but in practice it should not
account for a difference between qemu-kvm 0.14.1 and 1.0.

Can you please confirm that you are bisecting qemu-kvm.git and not qemu.git?

Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-02-28 17:05               ` [Qemu-devel] " Stefan Hajnoczi
@ 2012-02-28 17:15                 ` Martin Mailand
  -1 siblings, 0 replies; 52+ messages in thread
From: Martin Mailand @ 2012-02-28 17:15 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Dongsu Park, qemu-devel, kvm

Hi Stefan,
I was bisecting qemu-kvm.git.

  git remote show origin
* remote origin
   Fetch URL: git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git
   Push  URL: git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git

The bisect log is:

git bisect start
# good: [b8095f24f24e50a7d4be33d8a79474aff3324295] Bump version to 
reflect v0.15.0-rc0
git bisect good b8095f24f24e50a7d4be33d8a79474aff3324295
# bad: [e072ea2fd8fdceef64159b9596d3c15ce01bea91] Bump version to 1.0-rc0
git bisect bad e072ea2fd8fdceef64159b9596d3c15ce01bea91
# bad: [7d4b4ba5c2bae99d44f265884b567ae63947bb4a] block: New 
change_media_cb() parameter load
git bisect bad 7d4b4ba5c2bae99d44f265884b567ae63947bb4a
# good: [baaa86d9f5d516d423d34af92e0c15b56e06ac4b] hw/9pfs: Update 
v9fs_create to use coroutines
git bisect good baaa86d9f5d516d423d34af92e0c15b56e06ac4b
# bad: [9aed1e036dc0de49d08d713f9e5c4655e94acb56] Rename qemu -> 
qemu-system-i386
git bisect bad 9aed1e036dc0de49d08d713f9e5c4655e94acb56
# good: [8ef9ea85a2cc1007eaefa53e6871f1f83bcef22d] Merge remote-tracking 
branch 'qemu-kvm/memory/batch' into staging
git bisect good 8ef9ea85a2cc1007eaefa53e6871f1f83bcef22d
# good: [9f4bd6baf64b8139cf2d7f8f53a98b27531da13c] Merge remote-tracking 
branch 'kwolf/for-anthony' into staging
git bisect good 9f4bd6baf64b8139cf2d7f8f53a98b27531da13c
# good: [09001ee7b27b9b5f049362efc427d03e2186a431] trace: [make] replace 
'ifeq' with values in CONFIG_TRACE_*
git bisect good 09001ee7b27b9b5f049362efc427d03e2186a431
# good: [d8e8ef4ee05bfee0df84e2665d9196c4a954c095] simpletrace: fix 
process() argument count
git bisect good d8e8ef4ee05bfee0df84e2665d9196c4a954c095
# good: [a952c570c865d5eae6c148716f2cb585a0d3a2ee] Merge remote-tracking 
branch 'qemu-kvm-tmp/memory/core' into staging
git bisect good a952c570c865d5eae6c148716f2cb585a0d3a2ee
# good: [625f9e1f54cd78ee98ac22030da527c9a1cc9d2b] Merge remote-tracking 
branch 'stefanha/trivial-patches' into staging
git bisect good 625f9e1f54cd78ee98ac22030da527c9a1cc9d2b
# good: [d9cd446b4f6ff464f9520898116534de988d9bc1] trace: fix 
out-of-tree builds
git bisect good d9cd446b4f6ff464f9520898116534de988d9bc1
# bad: [12d4536f7d911b6d87a766ad7300482ea663cea2] main: force enabling 
of I/O thread
git bisect bad 12d4536f7d911b6d87a766ad7300482ea663cea2

-martin

On 28.02.2012 18:05, Stefan Hajnoczi wrote:
> On Tue, Feb 28, 2012 at 4:39 PM, Martin Mailand<martin@tuxadero.com>  wrote:
>> I could reproduce it and I bisected it down to this commit.
>>
>> 12d4536f7d911b6d87a766ad7300482ea663cea2 is the first bad commit
>> commit 12d4536f7d911b6d87a766ad7300482ea663cea2
>> Author: Anthony Liguori<aliguori@us.ibm.com>
>> Date:   Mon Aug 22 08:24:58 2011 -0500
> This seems strange to me.
>
> What commit 12d4536f7 did was to switch to a threading model in
> *qemu.git* that is similar to what *qemu-kvm.git* has been doing all
> along.
>
> That means the qemu-kvm binaries already use the iothread model.  The
> only explanation I have is that your bisect went down a qemu.git path
> and you therefore tripped over this - but in practice it should not
> account for a difference between qemu-kvm 0.14.1 and 1.0.
>
> Can you please confirm that you are bisecting qemu-kvm.git and not qemu.git?
>
> Stefan


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-02-28 17:15                 ` [Qemu-devel] " Martin Mailand
@ 2012-02-29  8:38                   ` Stefan Hajnoczi
  -1 siblings, 0 replies; 52+ messages in thread
From: Stefan Hajnoczi @ 2012-02-29  8:38 UTC (permalink / raw)
  To: martin; +Cc: Dongsu Park, qemu-devel, kvm

On Tue, Feb 28, 2012 at 5:15 PM, Martin Mailand <martin@tuxadero.com> wrote:
> Hi Stefan,
> I was bisecting qemu-kvm.git.

qemu-kvm.git regularly merges from qemu.git.  The history of the
qemu-kvm.git repository is not linear because of these periodic merges
from the qemu.git tree.  I think what happened is that git bisect went
down a qemu.git merge, which resulted in you effectively testing
qemu.git instead of qemu-kvm.git!

You can see this by using gitk(1) or git log --graph and searching for
the bad commit (12d4536f7d).  This will show a merge commit
(0b9b128530b999e36f0629dddcbafeda114fb4fb) where these qemu.git
commits were brought into qemu-kvm.git's history.

qemu-kvm.git
|
...
|
* merge commit (0b9b128530b999e36f0629dddcbafeda114fb4fb)
|\
| * "bad" commit (12d4536f7d911b6d87a766ad7300482ea663cea2)
| |
| ... more qemu.git commits
|
* previous commit, also a merge commit
(4fefc55ab04dd77002750f771e96477b5d2a473f)

Bisect seems to have gone down the qemu.git side of the merge at
0b9b128530b.  Instead we need to go down the qemu-kvm.git side of the
history.

The key thing is that the moment git-bisect steps from qemu-kvm.git
into qemu.git history, the source tree will be fairly different and
performance between the two is not easy to compare.

I suggest testing both of the qemu-kvm.git merge commits, 0b9b128530b
and 4fefc55ab04d.  My guess is you will find they perform the same,
i.e. the qemu.git commits which were merged did not affect performance
in qemu-kvm.git.  That would be evidence that git-bisect has not
located the real culprit.
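
A minimal sketch of how such a check could look (the commit ids are the
ones quoted above; everything else is illustrative and not part of the
original mail):

  # graph of what the merge 0b9b128530b brought in on top of 4fefc55ab04d
  git log --graph --oneline 4fefc55ab04d..0b9b128530b

  # build and benchmark the two merge commits themselves
  git checkout 0b9b128530b999e36f0629dddcbafeda114fb4fb && ./configure && make
  git checkout 4fefc55ab04dd77002750f771e96477b5d2a473f && ./configure && make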

Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-02-29  8:38                   ` [Qemu-devel] " Stefan Hajnoczi
@ 2012-02-29 13:12                     ` Martin Mailand
  -1 siblings, 0 replies; 52+ messages in thread
From: Martin Mailand @ 2012-02-29 13:12 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Dongsu Park, qemu-devel, kvm

Hi Stefan,
you are right, the performance for the commits 0b9b128530b and
4fefc55ab04d is good in both cases.
What is the best approach to stay in the qemu-kvm.git history?

-martin

On 29.02.2012 09:38, Stefan Hajnoczi wrote:
> I suggest testing both of the qemu-kvm.git merge commits, 0b9b128530b
> and 4fefc55ab04d.  My guess is you will find they perform the same,
> i.e. the qemu.git commits which were merged did not affect performance
> in qemu-kvm.git.  That would be evidence that git-bisect has not
> located the real culprit.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-02-29 13:12                     ` [Qemu-devel] " Martin Mailand
@ 2012-02-29 13:44                       ` Stefan Hajnoczi
  -1 siblings, 0 replies; 52+ messages in thread
From: Stefan Hajnoczi @ 2012-02-29 13:44 UTC (permalink / raw)
  To: martin; +Cc: Dongsu Park, qemu-devel, kvm

On Wed, Feb 29, 2012 at 1:12 PM, Martin Mailand <martin@tuxadero.com> wrote:
> Hi Stefan,
> you are right, the performance for the commits 0b9b128530b and 4fefc55ab04d
> are both good.
> What is the best approach to stay in the qemu-kvm.git history?

I didn't know the answer so I asked on #git on freenode:

< charon> stefanha: so just tell it that the upstream tip is good
< charon> git-bisect assumes you are looking for a single commit C
such that any commit that contains C (in its history) is bad, and any
other commit is good. if you declare up front that upstream does not
contain C, then it won't go looking there
of course if that declaration was wrong, it will give wrong results.

I think there are two approaches:

1. Side-step this issue by bisecting qemu.git instead of qemu-kvm.git.

2. First test qemu-kvm.git origin/upstream-merge and if there is no
performance issue, do: git bisect good origin/upstream-merge.  If
we're lucky, this will avoid going down all the qemu.git merge trees
and instead stay focused on qemu-kvm.git commits.
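
A rough sketch of the second approach (the tag names are an assumption
about how qemu-kvm.git tags its releases, not taken from the thread):

  git bisect start
  git bisect bad qemu-kvm-1.0               # the slow version
  git bisect good qemu-kvm-0.14.1           # the fast version
  git bisect good origin/upstream-merge     # declare upstream history good up front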

Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-02-29 13:44                       ` [Qemu-devel] " Stefan Hajnoczi
@ 2012-02-29 13:52                         ` Stefan Hajnoczi
  -1 siblings, 0 replies; 52+ messages in thread
From: Stefan Hajnoczi @ 2012-02-29 13:52 UTC (permalink / raw)
  To: martin; +Cc: Dongsu Park, qemu-devel, kvm

On Wed, Feb 29, 2012 at 1:44 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Wed, Feb 29, 2012 at 1:12 PM, Martin Mailand <martin@tuxadero.com> wrote:
>> Hi Stefan,
>> you are right, the performance for the commits 0b9b128530b and 4fefc55ab04d
>> are both good.
>> What is the best approach to stay in the qemu-kvm.git history?
>
> I didn't know the answer so I asked on #git on freenode:

And additional information from Avi:

http://permalink.gmane.org/gmane.comp.emulators.kvm.devel/82851

Hope this helps.

Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-02-10 14:36 ` [Qemu-devel] " Dongsu Park
@ 2012-03-05 16:13   ` Martin Mailand
  -1 siblings, 0 replies; 52+ messages in thread
From: Martin Mailand @ 2012-03-05 16:13 UTC (permalink / raw)
  To: Dongsu Park; +Cc: qemu-devel, kvm, stefanha

Am 10.02.2012 15:36, schrieb Dongsu Park:
> Recently I observed performance regression regarding virtio-blk,
> especially different IO bandwidths between qemu-kvm 0.14.1 and 1.0.
> So I want to share the benchmark results, and ask you what the reason
> would be.


Hi,
I think I found the problem; there is no regression in the code.
I think the problem is that qemu-kvm with the I/O thread enabled doesn't
produce enough cpu load to get the core to a higher cpu frequency,
because the load is distributed across two threads.
If I change the cpu governor to "performance", the result from the master
branch is better than from the v0.14.1 branch.
I get the same results on a server system without power management activated.
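
For reference, the governor can be checked and switched through the
standard cpufreq sysfs interface, roughly like this (not part of the
original mail; paths can differ per distribution/kernel config):

  # show the current governor of each core
  cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

  # switch all cores to "performance" for the benchmark run
  for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
      echo performance > $g
  done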

@Dongsu Could you confirm those findings?


1. Test on i7 Laptop with Cpu governor "ondemand".

v0.14.1
bw=63492KB/s iops=15873
bw=63221KB/s iops=15805

v1.0
bw=36696KB/s iops=9173
bw=37404KB/s iops=9350

master
bw=36396KB/s iops=9099
bw=34182KB/s iops=8545

Change the Cpu governor to "performance"
master
bw=81756KB/s iops=20393
bw=81453KB/s iops=20257


2. Test on AMD Istanbul without powermanagement activated.

v0.14.1
bw=53167KB/s iops=13291
bw=61386KB/s iops=15346

v1.0
bw=43599KB/s iops=10899
bw=46288KB/s iops=11572

master
bw=60678KB/s iops=15169
bw=62733KB/s iops=15683

-martin

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-03-05 16:13   ` [Qemu-devel] " Martin Mailand
@ 2012-03-05 16:35     ` Stefan Hajnoczi
  -1 siblings, 0 replies; 52+ messages in thread
From: Stefan Hajnoczi @ 2012-03-05 16:35 UTC (permalink / raw)
  To: Martin Mailand; +Cc: Dongsu Park, qemu-devel, kvm

On Mon, Mar 5, 2012 at 4:13 PM, Martin Mailand <martin@tuxadero.com> wrote:
> Am 10.02.2012 15:36, schrieb Dongsu Park:
>
>> Recently I observed performance regression regarding virtio-blk,
>> especially different IO bandwidths between qemu-kvm 0.14.1 and 1.0.
>> So I want to share the benchmark results, and ask you what the reason
>> would be.
>
>
>
> Hi,
> I think I found the problem; there is no regression in the code.
> I think the problem is that qemu-kvm with the I/O thread enabled doesn't
> produce enough cpu load to get the core to a higher cpu frequency, because
> the load is distributed across two threads.
> If I change the cpu governor to "performance", the result from the master
> branch is better than from the v0.14.1 branch.
> I get the same results on a server system without power management activated.
>
> @Dongsu Could you confirm those findings?
>
>
> 1. Test on i7 Laptop with Cpu governor "ondemand".
>
> v0.14.1
> bw=63492KB/s iops=15873
> bw=63221KB/s iops=15805
>
> v1.0
> bw=36696KB/s iops=9173
> bw=37404KB/s iops=9350
>
> master
> bw=36396KB/s iops=9099
> bw=34182KB/s iops=8545
>
> Change the Cpu governor to "performance"
> master
> bw=81756KB/s iops=20393
> bw=81453KB/s iops=20257

Interesting finding.  Did you show the 0.14.1 results with
"performance" governor?

Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-03-05 16:35     ` [Qemu-devel] " Stefan Hajnoczi
@ 2012-03-05 16:44       ` Martin Mailand
  -1 siblings, 0 replies; 52+ messages in thread
From: Martin Mailand @ 2012-03-05 16:44 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Dongsu Park, qemu-devel, kvm

Am 05.03.2012 17:35, schrieb Stefan Hajnoczi:
>> 1. Test on i7 Laptop with Cpu governor "ondemand".
>> >
>> >  v0.14.1
>> >  bw=63492KB/s iops=15873
>> >  bw=63221KB/s iops=15805
>> >
>> >  v1.0
>> >  bw=36696KB/s iops=9173
>> >  bw=37404KB/s iops=9350
>> >
>> >  master
>> >  bw=36396KB/s iops=9099
>> >  bw=34182KB/s iops=8545
>> >
>> >  Change the Cpu governor to "performance"
>> >  master
>> >  bw=81756KB/s iops=20393
>> >  bw=81453KB/s iops=20257
> Interesting finding.  Did you show the 0.14.1 results with
> "performance" governor?


Hi Stefan,
all results are with "ondemand" except the one where I changed it to 
"performance"

Do you want a v0.14.1 test with the governor on "performance"?

-martin


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-03-05 16:44       ` [Qemu-devel] " Martin Mailand
@ 2012-03-06 12:59         ` Stefan Hajnoczi
  -1 siblings, 0 replies; 52+ messages in thread
From: Stefan Hajnoczi @ 2012-03-06 12:59 UTC (permalink / raw)
  To: Martin Mailand; +Cc: Dongsu Park, qemu-devel, kvm

On Mon, Mar 5, 2012 at 4:44 PM, Martin Mailand <martin@tuxadero.com> wrote:
> Am 05.03.2012 17:35, schrieb Stefan Hajnoczi:
>
>>> 1. Test on i7 Laptop with Cpu governor "ondemand".
>>> >
>>> >  v0.14.1
>>> >  bw=63492KB/s iops=15873
>>> >  bw=63221KB/s iops=15805
>>> >
>>> >  v1.0
>>> >  bw=36696KB/s iops=9173
>>> >  bw=37404KB/s iops=9350
>>> >
>>> >  master
>>> >  bw=36396KB/s iops=9099
>>> >  bw=34182KB/s iops=8545
>>> >
>>> >  Change the Cpu governor to "performance"
>>> >  master
>>> >  bw=81756KB/s iops=20393
>>> >  bw=81453KB/s iops=20257
>>
>> Interesting finding.  Did you show the 0.14.1 results with
>> "performance" governor?
>
>
>
> Hi Stefan,
> all results are with "ondemand" except the one where I changed it to
> "performance"
>
> Do you want a v0.14.1 test with the governor on "performance"?

Yes, the reason why that would be interesting is because it allows us
to put the performance gain with master+"performance" into
perspective.  We could see how much of a change we get.

Does the CPU governor also affect the result when you benchmark with
real disks instead of ramdisk?  I can see how the governor would
affect ramdisk, but would expect real disk I/O to be impacted much
less.

Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-03-05 16:13   ` [Qemu-devel] " Martin Mailand
@ 2012-03-06 14:32     ` Dongsu Park
  -1 siblings, 0 replies; 52+ messages in thread
From: Dongsu Park @ 2012-03-06 14:32 UTC (permalink / raw)
  To: Martin Mailand; +Cc: qemu-devel, kvm, stefanha

Hi Martin,

On 05.03.2012 17:13, Martin Mailand wrote:
> Am 10.02.2012 15:36, schrieb Dongsu Park:
> >Recently I observed performance regression regarding virtio-blk,
> >especially different IO bandwidths between qemu-kvm 0.14.1 and 1.0.
> >So I want to share the benchmark results, and ask you what the reason
> >would be.
> 
> 
> Hi,
> I think I found the problem; there is no regression in the code.
> I think the problem is that qemu-kvm with the I/O thread enabled
> doesn't produce enough cpu load to get the core to a higher cpu
> frequency, because the load is distributed across two threads.
> If I change the cpu governor to "performance", the result from the
> master branch is better than from the v0.14.1 branch.
> I get the same results on a server system without power management activated.
> 
> @Dongsu Could you confirm those findings?

Yes, I can confirm that.
I just tested with different CPU governor configs, "ondemand" and 
"performance". (qemu-kvm 1.0, AMD Phenom II X4 955)

The result is more or less like yours.
Bandwidth with "performance" is more than 100% better than with
"ondemand".
That definitely looks like one of the reasons for the regressions I experienced.
Actually I had always tested with "ondemand", which I hadn't noticed until now.
 
Good catch, thanks!
Dongsu


> 1. Test on i7 Laptop with Cpu governor "ondemand".
> 
> v0.14.1
> bw=63492KB/s iops=15873
> bw=63221KB/s iops=15805
> 
> v1.0
> bw=36696KB/s iops=9173
> bw=37404KB/s iops=9350
> 
> master
> bw=36396KB/s iops=9099
> bw=34182KB/s iops=8545
> 
> Change the Cpu governor to "performance"
> master
> bw=81756KB/s iops=20393
> bw=81453KB/s iops=20257
> 
> 
> 2. Test on AMD Istanbul without powermanagement activated.
> 
> v0.14.1
> bw=53167KB/s iops=13291
> bw=61386KB/s iops=15346
> 
> v1.0
> bw=43599KB/s iops=10899
> bw=46288KB/s iops=11572
> 
> master
> bw=60678KB/s iops=15169
> bw=62733KB/s iops=15683
> 
> -martin

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [Qemu-devel] virtio-blk performance regression and qemu-kvm
  2012-03-06 12:59         ` [Qemu-devel] " Stefan Hajnoczi
@ 2012-03-06 22:07           ` Reeted
  -1 siblings, 0 replies; 52+ messages in thread
From: Reeted @ 2012-03-06 22:07 UTC (permalink / raw)
  To: Stefan Hajnoczi, Martin Mailand, Dongsu Park; +Cc: kvm, qemu-devel

On 03/06/12 13:59, Stefan Hajnoczi wrote:
> On Mon, Mar 5, 2012 at 4:44 PM, Martin Mailand<martin@tuxadero.com>  wrote:
>> Am 05.03.2012 17:35, schrieb Stefan Hajnoczi:
>>
>>>> 1. Test on i7 Laptop with Cpu governor "ondemand".
>>>>>   v0.14.1
>>>>>   bw=63492KB/s iops=15873
>>>>>   bw=63221KB/s iops=15805
>>>>>
>>>>>   v1.0
>>>>>   bw=36696KB/s iops=9173
>>>>>   bw=37404KB/s iops=9350
>>>>>
>>>>>   master
>>>>>   bw=36396KB/s iops=9099
>>>>>   bw=34182KB/s iops=8545
>>>>>
>>>>>   Change the Cpu governor to "performance"
>>>>>   master
>>>>>   bw=81756KB/s iops=20393
>>>>>   bw=81453KB/s iops=20257
>>> Interesting finding.  Did you show the 0.14.1 results with
>>> "performance" governor?
>>
>>
>> Hi Stefan,
>> all results are with "ondemand" except the one where I changed it to
>> "performance"
>>
>> Do you want a v0.14.1 test with the governor on "performance"?
> Yes, the reason why that would be interesting is because it allows us
> to put the performance gain with master+"performance" into
> perspective.  We could see how much of a change we get.


Me too, I would be interested in seeing 0.14.1 tested with the
performance governor, so as to compare it to master with the performance
governor and make sure that this is not a regression.

BTW, I'll take the opportunity to say that 15.8 or 20.3 k IOPS are very 
low figures compared to what I'd instinctively expect from a 
paravirtualized block driver.
There are now PCIe SSD cards that do 240 k IOPS (e.g. "OCZ RevoDrive 3 
x2 max iops") which is 12-15 times higher, for something that has to go 
through a real driver and a real PCI-express bus, and can't use 
zero-copy techniques.
The IOPS we can give to a VM is currently less than half that of a 
single SSD SATA drive (60 k IOPS or so, these days).
That's why I consider this topic of virtio-blk performance very
important. I hope there can be improvements in this area...

Thanks for your time
R.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [Qemu-devel] virtio-blk performance regression and qemu-kvm
  2012-03-06 22:07           ` Reeted
  (?)
@ 2012-03-07  8:04           ` Stefan Hajnoczi
  2012-03-07 14:21             ` Reeted
  -1 siblings, 1 reply; 52+ messages in thread
From: Stefan Hajnoczi @ 2012-03-07  8:04 UTC (permalink / raw)
  To: Reeted; +Cc: Martin Mailand, Dongsu Park, kvm, qemu-devel

On Tue, Mar 6, 2012 at 10:07 PM, Reeted <reeted@shiftmail.org> wrote:
> On 03/06/12 13:59, Stefan Hajnoczi wrote:
>>
>> On Mon, Mar 5, 2012 at 4:44 PM, Martin Mailand<martin@tuxadero.com>
>>  wrote:
>>>
>>> Am 05.03.2012 17:35, schrieb Stefan Hajnoczi:
>>>
>>>>> 1. Test on i7 Laptop with Cpu governor "ondemand".
>>>>>>
>>>>>>  v0.14.1
>>>>>>  bw=63492KB/s iops=15873
>>>>>>  bw=63221KB/s iops=15805
>>>>>>
>>>>>>  v1.0
>>>>>>  bw=36696KB/s iops=9173
>>>>>>  bw=37404KB/s iops=9350
>>>>>>
>>>>>>  master
>>>>>>  bw=36396KB/s iops=9099
>>>>>>  bw=34182KB/s iops=8545
>>>>>>
>>>>>>  Change the Cpu governor to "performance"
>>>>>>  master
>>>>>>  bw=81756KB/s iops=20393
>>>>>>  bw=81453KB/s iops=20257
>>>>
>>>> Interesting finding.  Did you show the 0.14.1 results with
>>>> "performance" governor?
>>>
>>>
>>>
>>> Hi Stefan,
>>> all results are with "ondemand" except the one where I changed it to
>>> "performance"
>>>
>>> Do you want a v0.14.1 test with the governor on "performance"?
>>
>> Yes, the reason why that would be interesting is because it allows us
>> to put the performance gain with master+"performance" into
>> perspective.  We could see how much of a change we get.
>
>
>
> Me too, I would be interested in seeing 0.14.1 being tested with performance
> governor so to compare it to master with performance governor, to make sure
> that this is not a regression.
>
> BTW, I'll take the opportunity to say that 15.8 or 20.3 k IOPS are very low
> figures compared to what I'd instinctively expect from a paravirtualized
> block driver.
> There are now PCIe SSD cards that do 240 k IOPS (e.g. "OCZ RevoDrive 3 x2
> max iops") which is 12-15 times higher, for something that has to go through
> a real driver and a real PCI-express bus, and can't use zero-copy
> techniques.
> The IOPS we can give to a VM is currently less than half that of a single
> SSD SATA drive (60 k IOPS or so, these days).
> That's why I consider this topic of virtio-blk performances very important.
> I hope there can be improvements in this sector...

It depends on the benchmark configuration.  virtio-blk is capable of
doing 100,000s of iops, I've seen results.  My guess is that you can
do >100,000 read iops with virtio-blk on a good machine and stock
qemu-kvm.
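
As an illustration, a configuration more likely to show that upper limit
uses a deeper queue and several parallel jobs, e.g. something along the
lines of (parameters are a guess, not a tested recipe):

  fio -name iops -rw=randread -size=1G -iodepth 32 -numjobs 4 \
   -filename /dev/vdb -ioengine libaio -direct=1 -bs=4096 -group_reporting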

Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-03-06 12:59         ` [Qemu-devel] " Stefan Hajnoczi
@ 2012-03-07 10:39           ` Martin Mailand
  -1 siblings, 0 replies; 52+ messages in thread
From: Martin Mailand @ 2012-03-07 10:39 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, kvm, reeted

Am 06.03.2012 13:59, schrieb Stefan Hajnoczi:
> Yes, the reason why that would be interesting is because it allows us
> to put the performance gain with master+"performance" into
> perspective.  We could see how much of a change we get.
>
> Does the CPU governor also affect the result when you benchmark with
> real disks instead of ramdisk?  I can see how the governor would
> affect ramdisk, but would expect real disk I/O to be impacted much
> less.


Hi,
here my results.
I tested with "fio -name iops -rw=read -size=1G -iodepth 1 -filename 
/dev/vdb -ioengine libaio -direct 1 -bs 4k"

The qemu command was:

qemu-system-x86_64 --enable-kvm -m 512 -boot c \
-drive file=/home/martin/vmware/bisect_kvm/hda.img,cache=none,if=virtio \
-drive file=/dev/ram0,cache=none,if=virtio \
-drive file=/dev/sda2,cache=none,if=virtio

Host Kernel 3.3.0+rc4
Guest Kernel 3.0.0-16-generic ubuntu kernel

On the host I use a raw partition, sda2, for the disk test; in qemu I
run fio against /dev/vdc, so no filesystem is involved.
The host disk can do at most about 13K iops, while in qemu I get at most
about 6.5K iops, i.e. roughly 50% overhead. All the tests were 4k reads,
so I think we are mostly latency bound.
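
A rough sanity check on the latency theory, using the numbers from the
log below: with iodepth=1, the best guest result of 6978 iops works out
to about 1/6978 s = 143 us per 4k read, versus about 73 us (13744 iops)
on the host, so the virtualization path adds on the order of 70 us of
latency per request.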

-martin

log:

** v0.14.1 ondemand **

ram
bw=61038KB/s iops=15259
bw=66190KB/s iops=16547

disk
bw=18105KB/s iops=4526
bw=17625KB/s iops=4406

** v0.14.1 performance **

ram
bw=72356KB/s iops=18088
bw=72390KB/s iops=18097

disk
bw=27886KB/s iops=6971
bw=27915KB/s iops=6978


** master ondemand  **

ram
bw=24833KB/s iops=6208
bw=27275KB/s iops=6818

disk
bw=14980KB/s iops=3745
bw=14881KB/s iops=3720

** master performance **

ram
bw=64318KB/s iops=16079
bw=63523KB/s iops=15880

disk
bw=27043KB/s iops=6760
bw=27211KB/s iops=6802


Host Disk Test (SanDisk SSD U100)

host disk ondemand
bw=48823KB/s iops=12205
bw=49086KB/s iops=12271

host disk performance
bw=55156KB/s iops=13789
bw=54980KB/s iops=13744


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: virtio-blk performance regression and qemu-kvm
  2012-03-07 10:39           ` [Qemu-devel] " Martin Mailand
@ 2012-03-07 11:21             ` Paolo Bonzini
  -1 siblings, 0 replies; 52+ messages in thread
From: Paolo Bonzini @ 2012-03-07 11:21 UTC (permalink / raw)
  To: Martin Mailand; +Cc: Stefan Hajnoczi, reeted, qemu-devel, kvm

Il 07/03/2012 11:39, Martin Mailand ha scritto:
> The host disk can at max. 13K iops, in qemu I get at max 6,5K iops,
> that's around about 50% overhead. All the test were with 4k reads, so I
> think we are mostly latency bound.

For latency tests, running without ioeventfd could give slightly better
results (-global virtio-blk-pci.ioeventfd=off).
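
For example, appended to a trimmed version of the command line quoted
earlier in the thread (only the -global option comes from this mail;
the drive setup is Martin's):

  qemu-system-x86_64 --enable-kvm -m 512 -boot c \
   -drive file=/dev/sda2,cache=none,if=virtio \
   -global virtio-blk-pci.ioeventfd=off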

Paolo

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [Qemu-devel] virtio-blk performance regression and qemu-kvm
  2012-03-07  8:04           ` Stefan Hajnoczi
@ 2012-03-07 14:21             ` Reeted
  2012-03-07 14:33                 ` Stefan Hajnoczi
  0 siblings, 1 reply; 52+ messages in thread
From: Reeted @ 2012-03-07 14:21 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Martin Mailand, Dongsu Park, kvm, qemu-devel

On 03/07/12 09:04, Stefan Hajnoczi wrote:
> On Tue, Mar 6, 2012 at 10:07 PM, Reeted<reeted@shiftmail.org>  wrote:
>> On 03/06/12 13:59, Stefan Hajnoczi wrote:
>>> BTW, I'll take the opportunity to say that 15.8 or 20.3 k IOPS are very low
>>> figures compared to what I'd instinctively expect from a paravirtualized
>>> block driver.
>>> There are now PCIe SSD cards that do 240 k IOPS (e.g. "OCZ RevoDrive 3 x2
>>> max iops") which is 12-15 times higher, for something that has to go through
>>> a real driver and a real PCI-express bus, and can't use zero-copy
>>> techniques.
>>> The IOPS we can give to a VM is currently less than half that of a single
>>> SSD SATA drive (60 k IOPS or so, these days).
>>> That's why I consider this topic of virtio-blk performances very important.
>>> I hope there can be improvements in this sector...
> It depends on the benchmark configuration.  virtio-blk is capable of
> doing 100,000s of iops, I've seen results.  My guess is that you can
> do>100,000 read iops with virtio-blk on a good machine and stock
> qemu-kvm.

It's very difficult to configure, then.
I also did benchmarks in the past, and I can confirm Martin's and
Dongsu's findings of about 15 k IOPS with:
qemu-kvm 0.14.1, Intel Westmere CPU, virtio-blk (kernel 2.6.38 on the 
guest, 3.0 on the host), fio, 4k random *reads* from the Host page cache 
(backend LVM device was fully in cache on the Host), writeback setting, 
cache dropped on the guest prior to benchmark (and insufficient guest 
memory to cache a significant portion of the device).
If you can teach us how to reach 100 k IOPS, I think everyone would be 
grateful :-)
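
For completeness, the cache drop mentioned above is the usual procfs
knob (standard command, nothing specific to this setup):

  sync
  echo 3 > /proc/sys/vm/drop_caches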

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [Qemu-devel] virtio-blk performance regression and qemu-kvm
  2012-03-07 14:21             ` Reeted
@ 2012-03-07 14:33                 ` Stefan Hajnoczi
  0 siblings, 0 replies; 52+ messages in thread
From: Stefan Hajnoczi @ 2012-03-07 14:33 UTC (permalink / raw)
  To: Reeted; +Cc: Martin Mailand, Dongsu Park, kvm, qemu-devel, Khoa Huynh

On Wed, Mar 7, 2012 at 2:21 PM, Reeted <reeted@shiftmail.org> wrote:
> On 03/07/12 09:04, Stefan Hajnoczi wrote:
>>
>> On Tue, Mar 6, 2012 at 10:07 PM, Reeted<reeted@shiftmail.org>  wrote:
>>>
>>> On 03/06/12 13:59, Stefan Hajnoczi wrote:
>>>>
>>>> BTW, I'll take the opportunity to say that 15.8 or 20.3 k IOPS are very
>>>> low
>>>> figures compared to what I'd instinctively expect from a paravirtualized
>>>> block driver.
>>>> There are now PCIe SSD cards that do 240 k IOPS (e.g. "OCZ RevoDrive 3
>>>> x2
>>>> max iops") which is 12-15 times higher, for something that has to go
>>>> through
>>>> a real driver and a real PCI-express bus, and can't use zero-copy
>>>> techniques.
>>>> The IOPS we can give to a VM is currently less than half that of a
>>>> single
>>>> SSD SATA drive (60 k IOPS or so, these days).
>>>> That's why I consider this topic of virtio-blk performances very
>>>> important.
>>>> I hope there can be improvements in this sector...
>>
>> It depends on the benchmark configuration.  virtio-blk is capable of
>> doing 100,000s of iops, I've seen results.  My guess is that you can
>> do>100,000 read iops with virtio-blk on a good machine and stock
>> qemu-kvm.
>
>
> It's very difficult to configure, then.
> I also did benchmarks in the past, and I can confirm Martin and Dongsu
> findings of about 15 k IOPS with:
> qemu-kvm 0.14.1, Intel Westmere CPU, virtio-blk (kernel 2.6.38 on the guest,
> 3.0 on the host), fio, 4k random *reads* from the Host page cache (backend
> LVM device was fully in cache on the Host), writeback setting, cache dropped
> on the guest prior to benchmark (and insufficient guest memory to cache a
> significant portion of the device).
> If you can teach us how to reach 100 k IOPS, I think everyone would be
> grateful :-)

Sorry for being vague, I don't have the details.  I have CCed Khoa,
who might have time to describe a >100,000 iops virtio-blk
configuration.

Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread

Thread overview: 52+ messages
2012-02-10 14:36 virtio-blk performance regression and qemu-kvm Dongsu Park
2012-02-10 14:36 ` [Qemu-devel] " Dongsu Park
2012-02-12 23:55 ` Rusty Russell
2012-02-12 23:55   ` [Qemu-devel] " Rusty Russell
2012-02-21 16:45   ` Dongsu Park
2012-02-21 16:45     ` [Qemu-devel] " Dongsu Park
2012-02-21 22:16     ` Rusty Russell
2012-02-21 22:16       ` [Qemu-devel] " Rusty Russell
2012-02-13 11:57 ` Stefan Hajnoczi
2012-02-13 11:57   ` [Qemu-devel] " Stefan Hajnoczi
2012-02-21 15:57   ` Dongsu Park
2012-02-21 15:57     ` [Qemu-devel] " Dongsu Park
2012-02-21 17:27     ` Stefan Hajnoczi
2012-02-21 17:27       ` [Qemu-devel] " Stefan Hajnoczi
2012-02-22 16:48       ` Dongsu Park
2012-02-22 16:48         ` [Qemu-devel] " Dongsu Park
2012-02-22 19:53         ` Stefan Hajnoczi
2012-02-22 19:53           ` [Qemu-devel] " Stefan Hajnoczi
2012-02-28 16:39           ` Martin Mailand
2012-02-28 16:39             ` [Qemu-devel] " Martin Mailand
2012-02-28 17:05             ` Stefan Hajnoczi
2012-02-28 17:05               ` [Qemu-devel] " Stefan Hajnoczi
2012-02-28 17:15               ` Martin Mailand
2012-02-28 17:15                 ` [Qemu-devel] " Martin Mailand
2012-02-29  8:38                 ` Stefan Hajnoczi
2012-02-29  8:38                   ` [Qemu-devel] " Stefan Hajnoczi
2012-02-29 13:12                   ` Martin Mailand
2012-02-29 13:12                     ` [Qemu-devel] " Martin Mailand
2012-02-29 13:44                     ` Stefan Hajnoczi
2012-02-29 13:44                       ` [Qemu-devel] " Stefan Hajnoczi
2012-02-29 13:52                       ` Stefan Hajnoczi
2012-02-29 13:52                         ` [Qemu-devel] " Stefan Hajnoczi
2012-03-05 16:13 ` Martin Mailand
2012-03-05 16:13   ` [Qemu-devel] " Martin Mailand
2012-03-05 16:35   ` Stefan Hajnoczi
2012-03-05 16:35     ` [Qemu-devel] " Stefan Hajnoczi
2012-03-05 16:44     ` Martin Mailand
2012-03-05 16:44       ` [Qemu-devel] " Martin Mailand
2012-03-06 12:59       ` Stefan Hajnoczi
2012-03-06 12:59         ` [Qemu-devel] " Stefan Hajnoczi
2012-03-06 22:07         ` Reeted
2012-03-06 22:07           ` Reeted
2012-03-07  8:04           ` Stefan Hajnoczi
2012-03-07 14:21             ` Reeted
2012-03-07 14:33               ` Stefan Hajnoczi
2012-03-07 14:33                 ` Stefan Hajnoczi
2012-03-07 10:39         ` Martin Mailand
2012-03-07 10:39           ` [Qemu-devel] " Martin Mailand
2012-03-07 11:21           ` Paolo Bonzini
2012-03-07 11:21             ` [Qemu-devel] " Paolo Bonzini
2012-03-06 14:32   ` Dongsu Park
2012-03-06 14:32     ` [Qemu-devel] " Dongsu Park
