From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Date: Mon, 07 Oct 2013 13:59:21 -0000
From: Chris J Arges <1100843@bugs.launchpad.net>
Sender: bounces@canonical.com
References: <20130117163740.7157.55600.malonedeb@gac.canonical.com>
Message-Id: <20131007135922.27411.71682.launchpad@wampee.canonical.com>
Errors-To: bounces@canonical.com
Subject: [Qemu-devel] [Bug 1100843] Re: Live Migration Causes Performance Issues
Reply-To: Bug 1100843 <1100843@bugs.launchpad.net>
To: qemu-devel@nongnu.org

** Description changed:

+ SRU Justification
+
+ [Impact]
+ * Users of QEMU that save their memory state using savevm/loadvm or
+   migrate experience worse performance after the loadvm/migration. To
+   work around this, VMs must be completely rebooted. Ideally we should
+   be able to restore a VM's memory state and expect no performance
+   regression.
+
+ [Test Case]
+
+ * savevm/loadvm:
+   - Create a VM and install a test suite such as lmbench. (A minimal
+     stand-in for its "Simple syscall" measurement is sketched after
+     the benchmark numbers below.)
+   - Get numbers right after boot and record them.
+   - Open the qemu monitor and type the following:
+       stop
+       savevm 0
+       loadvm 0
+       c
+   - Measure performance again and record the numbers.
+   - Check that the numbers are within the margin of error.
+ * migrate:
+   - Create a VM, install lmbench, get numbers.
+   - Open the qemu monitor and type the following:
+       stop
+       migrate "exec:dd of=~/save.vm"
+       quit
+   - Start a new VM using qemu, but add the following argument:
+       -incoming "exec:dd if=~/save.vm"
+   - Run the performance test and compare.
+
+ If the measured performance is similar, the test case passes.
+
+ [Regression Potential]
+
+ * The fix is a backport of two upstream patches:
+   ad0b5321f1f797274603ebbe20108b0750baee94
+   211ea74022f51164a7729030b28eec90b6c99a08
+
+ One patch allows QEMU to use transparent hugepages (THP) when the
+ feature is enabled. The other patch changes the logic so that pages
+ are not memset to zero when loading memory for the VM (on an incoming
+ migration). (An illustrative sketch of these two ideas appears after
+ the benchmark numbers below.)
+
+ --

I have 2 physical hosts running Ubuntu Precise, with qemu-kvm
1.0+noroms-0ubuntu14.7 and qemu-kvm 1.2.0+noroms-0ubuntu7 (the latter
from Quantal sources, built for Precise with pbuilder). I attempted to
build qemu-1.3.0 debs from source to test, but libvirt seems to have an
issue with it that I haven't been able to track down yet.

I'm seeing a performance degradation after live migration on Precise,
but not Lucid.
These hosts are managed by libvirt (tested both 0.9.8-2ubuntu17 and
1.0.0-0ubuntu4) in conjunction with OpenNebula. I don't seem to have
this problem with Lucid guests (running a number of standard kernels:
3.2.5 mainline and the backported linux-image-3.2.0-35-generic as
well).

I first noticed this problem with phoronix doing compilation tests, and
then tried lmbench, where even simple calls experience performance
degradation.

I've attempted to post to the kvm mailing list, but so far the only
suggestion was that it may be related to transparent hugepages not
being used after migration; this didn't pan out. Someone else has a
similar problem here -
http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592

qemu command line example:

/usr/bin/kvm -name one-2 -S -M pc-1.2 -cpu Westmere -enable-kvm
-m 73728 -smp 16,sockets=2,cores=8,threads=1
-uuid f89e31a4-4945-c12c-6544-149ba0746c2f -no-user-config -nodefaults
-chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/one-2.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control
-rtc base=utc,driftfix=slew -no-kvm-pit-reinjection -no-shutdown
-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
-drive file=/var/lib/one//datastores/0/2/disk.0,if=none,id=drive-virtio-disk0,format=raw,cache=none
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
-drive file=/var/lib/one//datastores/0/2/disk.1,if=none,id=drive-ide0-0-0,readonly=on,format=raw
-device ide-cd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0
-netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=25
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:0a:64:02:fe,bus=pci.0,addr=0x3
-vnc 0.0.0.0:2,password -vga cirrus -incoming tcp:0.0.0.0:49155
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

Disk backend is LVM running on a SAN via an FC connection (using a
symlink from /var/lib/one/datastores/0/2/disk.0 above).

ubuntu-12.04 - first boot
==========================================
Simple syscall: 0.0527 microseconds
Simple read: 0.1143 microseconds
Simple write: 0.0953 microseconds
Simple open/close: 1.0432 microseconds

Using phoronix pts/compilation:
ImageMagick - 31.54s
Linux Kernel 3.1 - 43.91s
Mplayer - 30.49s
PHP - 22.25s

ubuntu-12.04 - post live migration
==========================================
Simple syscall: 0.0621 microseconds
Simple read: 0.2485 microseconds
Simple write: 0.2252 microseconds
Simple open/close: 1.4626 microseconds

Using phoronix pts/compilation:
ImageMagick - 43.29s
Linux Kernel 3.1 - 76.67s
Mplayer - 45.41s
PHP - 29.1s

I don't have phoronix results for 10.04 handy, but they were within 1%
of each other...
ubuntu-10.04 - first boot
==========================================
Simple syscall: 0.0524 microseconds
Simple read: 0.1135 microseconds
Simple write: 0.0972 microseconds
Simple open/close: 1.1261 microseconds

ubuntu-10.04 - post live migration
==========================================
Simple syscall: 0.0526 microseconds
Simple read: 0.1075 microseconds
Simple write: 0.0951 microseconds
Simple open/close: 1.0413 microseconds
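For context on the lmbench figures above: "Simple syscall" is
essentially the round-trip time of a trivial system call. The program
below is a minimal, standalone approximation of that measurement (it is
not lmbench itself); like lmbench's lat_syscall, it uses getppid() as
the cheap syscall. It is only meant to make the figures concrete, so
treat its numbers as rough.

/*
 * Rough stand-in for lmbench's "Simple syscall" measurement.
 * Build with: cc -O2 syscall_lat.c -o syscall_lat
 */
#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    const long iters = 10 * 1000 * 1000;
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iters; i++) {
        getppid();                 /* cheap syscall, result unused */
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9
              + (t1.tv_nsec - t0.tv_nsec);
    printf("Simple syscall: %.4f microseconds\n", ns / iters / 1000.0);
    return 0;
}

Run it once right after boot and once after the loadvm/migration step
from the test case, then compare the two numbers.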
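As noted under [Regression Potential] above, the fix backports two
upstream patches. The sketch below is illustrative only, not the actual
QEMU code (see the commit hashes above for that), but it shows the two
ideas: hint the kernel to back guest RAM with transparent hugepages,
and skip writing zero pages on the incoming side so untouched pages are
never faulted in. The page size and function names here are invented
for the example.

#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#define PAGE_SIZE 4096

/* Idea of the first patch: allocate guest RAM and ask for THP. */
static void *alloc_guest_ram(size_t bytes)
{
    void *ram = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (ram == MAP_FAILED) {
        return NULL;
    }
#ifdef MADV_HUGEPAGE
    madvise(ram, bytes, MADV_HUGEPAGE);   /* best effort; ignore errors */
#endif
    return ram;
}

/* True if the page contains only zero bytes. */
static bool page_is_zero(const unsigned char *page)
{
    for (size_t i = 0; i < PAGE_SIZE; i++) {
        if (page[i]) {
            return false;
        }
    }
    return true;
}

/*
 * Idea of the second patch: a fresh anonymous mapping already reads as
 * zero, so memset()ing zeros into it only faults pages in for no
 * reason. Skip the write instead of touching the destination page.
 */
static void load_incoming_page(unsigned char *dst, const unsigned char *src)
{
    if (page_is_zero(src)) {
        return;                           /* leave the page untouched */
    }
    memcpy(dst, src, PAGE_SIZE);
}

int main(void)
{
    size_t ram_bytes = 64 * PAGE_SIZE;            /* toy size */
    unsigned char zero_page[PAGE_SIZE] = { 0 };   /* "page off the wire" */
    unsigned char *ram = alloc_guest_ram(ram_bytes);

    if (ram) {
        load_incoming_page(ram, zero_page);       /* no fault, no memset */
        munmap(ram, ram_bytes);
    }
    return 0;
}

This also suggests one plausible mechanism for the slowdown reported
here, consistent with the patches above: if every incoming page is
written, even the zero pages, the destination's guest RAM is faulted in
eagerly and the THP backing the guest had at first boot can be lost.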
--
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1100843

Title:
  Live Migration Causes Performance Issues

Status in QEMU:
  New
Status in “qemu-kvm” package in Ubuntu:
  Fix Released
Status in “qemu-kvm” source package in Precise:
  In Progress
Status in “qemu-kvm” source package in Quantal:
  Triaged
Status in “qemu-kvm” source package in Raring:
  Triaged
Status in “qemu-kvm” source package in Saucy:
  Fix Released

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1100843/+subscriptions