From: Chris J Arges <1100843@bugs.launchpad.net>
To: qemu-devel@nongnu.org
Subject: [Qemu-devel] [Bug 1100843] Re: Live Migration Causes Performance Issues
Date: Mon, 07 Oct 2013 20:33:27 -0000	[thread overview]
Message-ID: <20131007203328.5454.74010.launchpad@soybean.canonical.com> (raw)
In-Reply-To: 20130117163740.7157.55600.malonedeb@gac.canonical.com

** Description changed:

  SRU Justification
  [Impact]
   * Users of QEMU who save a VM's memory state using savevm/loadvm, or who migrate a VM, see worse performance after the loadvm/migration. To work around this, the VM must be completely rebooted. Ideally, we should be able to restore a VM's memory state and expect no performance penalty.
  
  [Test Case]
  
   * savevm/loadvm:
     - Create a VM and install a test suite such as lmbench.
     - Get numbers right after boot and record them.
     - Open up the qemu monitor and type the following:
       stop
       savevm 0
       loadvm 0
       c
     - Measure performance and record numbers.
     - Check that the numbers are within the margin of error.
   * migrate:
     - Create VM, install lmbench, get numbers.
     - Open up qemu monitor and type the following:
       stop
       migrate "exec:dd of=~/save.vm"
       quit
     - Start a new VM using qemu but add the following argument:
       -incoming "exec:dd if=~/save.vm"
     - Run performance test and compare.
  
   If the measured performance is similar, the test case passes.
  
  [Regression Potential]
  
   * The fix is a backport of two upstream patches:
  ad0b5321f1f797274603ebbe20108b0750baee94
  211ea74022f51164a7729030b28eec90b6c99a08
  
- On patch allows QEMU to use THP if its enabled.
+ One patch allows QEMU to use THP if it's enabled.
  The other patch changes the logic to avoid memsetting pages to zero when loading memory for the VM (on an incoming migration).
  
-  * I've also run the qa-regression-testing test-qemu.py script and it passes all tests.
+  * I've also run the qa-regression-testing test-qemu.py script and it
+ passes all tests.
+ 
+ [Additional Information]
+ 
+ Kernels from 3.2 onwards are affected, and all have the config:
+ CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y. Therefore enabling THP is
+ applicable.
+ 
  --
  
  I have 2 physical hosts running Ubuntu Precise, tested with both
  qemu-kvm 1.0+noroms-0ubuntu14.7 and qemu-kvm 1.2.0+noroms-0ubuntu7
  (source from Quantal, built for Precise with pbuilder). I attempted to
  build qemu-1.3.0 debs from source to test, but libvirt seems to have
  an issue with it that I haven't been able to track down yet.
  
   I'm seeing a performance degradation after live migration on Precise,
  but not Lucid.  These hosts are managed by libvirt (tested both
  0.9.8-2ubuntu17 and 1.0.0-0ubuntu4) in conjunction with OpenNebula.  I
  don't seem to have this problem with lucid guests (running a number of
  standard kernels, 3.2.5 mainline and backported linux-
  image-3.2.0-35-generic as well.)
  
  I first noticed this problem with phoronix doing compilation tests, and
  then tried lmbench where even simple calls experience performance
  degradation.
  
  I've attempted to post to the kvm mailing list, but so far the only
  suggestion was that it may be related to transparent hugepages not being
  used after migration, but this didn't pan out.  Someone else has a similar
  problem here -
  http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592
  
  qemu command line example: /usr/bin/kvm -name one-2 -S -M pc-1.2 -cpu
  Westmere -enable-kvm -m 73728 -smp 16,sockets=2,cores=8,threads=1 -uuid
  f89e31a4-4945-c12c-6544-149ba0746c2f -no-user-config -nodefaults
  -chardev
  socket,id=charmonitor,path=/var/lib/libvirt/qemu/one-2.monitor,server,nowait
  -mon chardev=charmonitor,id=monitor,mode=control -rtc
  base=utc,driftfix=slew -no-kvm-pit-reinjection -no-shutdown -device
  piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive
  file=/var/lib/one//datastores/0/2/disk.0,if=none,id=drive-virtio-
  disk0,format=raw,cache=none -device virtio-blk-
  pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-
  disk0,bootindex=1 -drive
  file=/var/lib/one//datastores/0/2/disk.1,if=none,id=drive-
  ide0-0-0,readonly=on,format=raw -device ide-cd,bus=ide.0,unit=0,drive
  =drive-ide0-0-0,id=ide0-0-0 -netdev
  tap,fd=23,id=hostnet0,vhost=on,vhostfd=25 -device virtio-net-
  pci,netdev=hostnet0,id=net0,mac=02:00:0a:64:02:fe,bus=pci.0,addr=0x3
  -vnc 0.0.0.0:2,password -vga cirrus -incoming tcp:0.0.0.0:49155 -device
  virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
  
  Disk backend is LVM running on SAN via FC connection (using symlink from
  /var/lib/one/datastores/0/2/disk.0 above)
  
  ubuntu-12.04 - first boot
  ==========================================
  Simple syscall: 0.0527 microseconds
  Simple read: 0.1143 microseconds
  Simple write: 0.0953 microseconds
  Simple open/close: 1.0432 microseconds
  
  Using phoronix pts/compilation
  ImageMagick - 31.54s
  Linux Kernel 3.1 - 43.91s
  Mplayer - 30.49s
  PHP - 22.25s
  
  ubuntu-12.04 - post live migration
  ==========================================
  Simple syscall: 0.0621 microseconds
  Simple read: 0.2485 microseconds
  Simple write: 0.2252 microseconds
  Simple open/close: 1.4626 microseconds
  
  Using phoronix pts/compilation
  ImageMagick - 43.29s
  Linux Kernel 3.1 - 76.67s
  Mplayer - 45.41s
  PHP - 29.1s
  
  I don't have phoronix results for 10.04 handy, but they were within 1%
  of each other...
  
  ubuntu-10.04 - first boot
  ==========================================
  Simple syscall: 0.0524 microseconds
  Simple read: 0.1135 microseconds
  Simple write: 0.0972 microseconds
  Simple open/close: 1.1261 microseconds
  
  ubuntu-10.04 - post live migration
  ==========================================
  Simple syscall: 0.0526 microseconds
  Simple read: 0.1075 microseconds
  Simple write: 0.0951 microseconds
  Simple open/close: 1.0413 microseconds

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1100843

Title:
  Live Migration Causes Performance Issues

Status in QEMU:
  New
Status in “qemu-kvm” package in Ubuntu:
  Fix Released
Status in “qemu-kvm” source package in Precise:
  In Progress
Status in “qemu-kvm” source package in Quantal:
  Triaged
Status in “qemu-kvm” source package in Raring:
  Triaged
Status in “qemu-kvm” source package in Saucy:
  Fix Released

Bug description:
  SRU Justification
  [Impact]
   * Users of QEMU who save a VM's memory state using savevm/loadvm, or who migrate a VM, see worse performance after the loadvm/migration. To work around this, the VM must be completely rebooted. Ideally, we should be able to restore a VM's memory state and expect no performance penalty.

  [Test Case]

   * savevm/loadvm:
     - Create a VM and install a test suite such as lmbench.
     - Get numbers right after boot and record them.
     - Open up the qemu monitor and type the following:
       stop
       savevm 0
       loadvm 0
       c
     - Measure performance and record numbers.
     - Check that the numbers are within the margin of error.
   * migrate:
     - Create VM, install lmbench, get numbers.
     - Open up qemu monitor and type the following:
       stop
       migrate "exec:dd of=~/save.vm"
       quit
     - Start a new VM using qemu but add the following argument:
       -incoming "exec:dd if=~/save.vm"
     - Run performance test and compare.

   If the measured performance is similar, the test case passes.
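The monitor steps above can be scripted rather than typed by hand. The sketch below is an editor's illustration, not part of the original report: it assumes QEMU was started with `-monitor tcp:127.0.0.1:4444,server,nowait`, and the `within_margin` helper and its 5% tolerance are this example's own choices.

```python
import socket

# HMP command sequences taken from the test case above.
SAVEVM_STEPS = ["stop", "savevm 0", "loadvm 0", "c"]
MIGRATE_STEPS = ["stop", 'migrate "exec:dd of=~/save.vm"', "quit"]

def send_hmp(commands, host="127.0.0.1", port=4444, timeout=5.0):
    """Send each HMP command to a QEMU human monitor exposed over TCP
    and return the raw reply bytes (best effort)."""
    replies = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        for cmd in commands:
            sock.sendall(cmd.encode() + b"\n")
            replies.append(sock.recv(4096))
    return replies

def within_margin(before, after, tolerance=0.05):
    """True if `after` is within `tolerance` (default 5%) of `before`."""
    return abs(after - before) <= tolerance * before
```

With the lmbench latencies measured below, `within_margin(0.0527, 0.0621)` is False for the Precise syscall numbers, i.e. the test case fails on affected versions.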

  [Regression Potential]

   * The fix is a backport of two upstream patches:
  ad0b5321f1f797274603ebbe20108b0750baee94
  211ea74022f51164a7729030b28eec90b6c99a08

  One patch allows QEMU to use THP if it's enabled.
  The other patch changes the logic to avoid memsetting pages to zero when loading memory for the VM (on an incoming migration).

   * I've also run the qa-regression-testing test-qemu.py script and it
  passes all tests.

  [Additional Information]

  Kernels from 3.2 onwards are affected, and all have the config:
  CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y. Therefore enabling THP is
  applicable.
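As background for the first patch: the mechanism involved is madvise(MADV_HUGEPAGE), which asks the kernel to back a mapping with transparent huge pages, and which CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y makes effective. The sketch below is an illustration of that mechanism using Python's mmap module, not the actual QEMU code; the 16 MiB size is arbitrary.

```python
import mmap

LENGTH = 16 * 1024 * 1024  # 16 MiB anonymous region standing in for guest RAM

def advise_thp(region):
    """Hint the kernel to back this mapping with transparent huge pages.
    Returns True if the hint was issued, False if THP is unavailable."""
    if not hasattr(mmap, "MADV_HUGEPAGE"):
        return False  # non-Linux platform or Python < 3.8
    try:
        region.madvise(mmap.MADV_HUGEPAGE)
        return True
    except OSError:
        return False  # e.g. kernel built without transparent hugepage support

region = mmap.mmap(-1, LENGTH)   # anonymous private mapping
hinted = advise_thp(region)
region[:4096] = b"\x01" * 4096   # fault pages in; the kernel may now use 2 MiB pages
region.close()
```

The second patch is complementary: not memsetting pages that arrive as zero pages avoids needlessly touching guest memory on the destination during an incoming migration.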

  --

  I have 2 physical hosts running Ubuntu Precise, tested with both
  qemu-kvm 1.0+noroms-0ubuntu14.7 and qemu-kvm 1.2.0+noroms-0ubuntu7
  (source from Quantal, built for Precise with pbuilder). I attempted to
  build qemu-1.3.0 debs from source to test, but libvirt seems to have
  an issue with it that I haven't been able to track down yet.

   I'm seeing a performance degradation after live migration on Precise,
  but not Lucid.  These hosts are managed by libvirt (tested both
  0.9.8-2ubuntu17 and 1.0.0-0ubuntu4) in conjunction with OpenNebula.  I
  don't seem to have this problem with lucid guests (running a number of
  standard kernels, 3.2.5 mainline and backported linux-
  image-3.2.0-35-generic as well.)

  I first noticed this problem with phoronix doing compilation tests,
  and then tried lmbench where even simple calls experience performance
  degradation.

  I've attempted to post to the kvm mailing list, but so far the only
  suggestion was that it may be related to transparent hugepages not
  being used after migration, but this didn't pan out.  Someone else has a
  similar problem here -
  http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592

  qemu command line example: /usr/bin/kvm -name one-2 -S -M pc-1.2 -cpu
  Westmere -enable-kvm -m 73728 -smp 16,sockets=2,cores=8,threads=1
  -uuid f89e31a4-4945-c12c-6544-149ba0746c2f -no-user-config -nodefaults
  -chardev
  socket,id=charmonitor,path=/var/lib/libvirt/qemu/one-2.monitor,server,nowait
  -mon chardev=charmonitor,id=monitor,mode=control -rtc
  base=utc,driftfix=slew -no-kvm-pit-reinjection -no-shutdown -device
  piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive
  file=/var/lib/one//datastores/0/2/disk.0,if=none,id=drive-virtio-
  disk0,format=raw,cache=none -device virtio-blk-
  pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-
  disk0,bootindex=1 -drive
  file=/var/lib/one//datastores/0/2/disk.1,if=none,id=drive-
  ide0-0-0,readonly=on,format=raw -device ide-cd,bus=ide.0,unit=0,drive
  =drive-ide0-0-0,id=ide0-0-0 -netdev
  tap,fd=23,id=hostnet0,vhost=on,vhostfd=25 -device virtio-net-
  pci,netdev=hostnet0,id=net0,mac=02:00:0a:64:02:fe,bus=pci.0,addr=0x3
  -vnc 0.0.0.0:2,password -vga cirrus -incoming tcp:0.0.0.0:49155
  -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

  Disk backend is LVM running on SAN via FC connection (using symlink
  from /var/lib/one/datastores/0/2/disk.0 above)

  ubuntu-12.04 - first boot
  ==========================================
  Simple syscall: 0.0527 microseconds
  Simple read: 0.1143 microseconds
  Simple write: 0.0953 microseconds
  Simple open/close: 1.0432 microseconds

  Using phoronix pts/compilation
  ImageMagick - 31.54s
  Linux Kernel 3.1 - 43.91s
  Mplayer - 30.49s
  PHP - 22.25s

  ubuntu-12.04 - post live migration
  ==========================================
  Simple syscall: 0.0621 microseconds
  Simple read: 0.2485 microseconds
  Simple write: 0.2252 microseconds
  Simple open/close: 1.4626 microseconds

  Using phoronix pts/compilation
  ImageMagick - 43.29s
  Linux Kernel 3.1 - 76.67s
  Mplayer - 45.41s
  PHP - 29.1s

  I don't have phoronix results for 10.04 handy, but they were within 1%
  of each other...

  ubuntu-10.04 - first boot
  ==========================================
  Simple syscall: 0.0524 microseconds
  Simple read: 0.1135 microseconds
  Simple write: 0.0972 microseconds
  Simple open/close: 1.1261 microseconds

  ubuntu-10.04 - post live migration
  ==========================================
  Simple syscall: 0.0526 microseconds
  Simple read: 0.1075 microseconds
  Simple write: 0.0951 microseconds
  Simple open/close: 1.0413 microseconds
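The degradation can be quantified directly from the numbers above. This small script (an editor's sketch using only the figures quoted in this report) shows read and write latency roughly doubling on 12.04 after migration, while 10.04 stays within a few percent of its first-boot numbers.

```python
# lmbench latencies in microseconds, quoted above: (first boot, post-migration)
PRECISE = {
    "syscall": (0.0527, 0.0621),
    "read": (0.1143, 0.2485),
    "write": (0.0953, 0.2252),
    "open/close": (1.0432, 1.4626),
}
LUCID = {
    "syscall": (0.0524, 0.0526),
    "read": (0.1135, 0.1075),
    "write": (0.0972, 0.0951),
    "open/close": (1.1261, 1.0413),
}

def slowdown(before, after):
    """Relative slowdown factor; 1.0 means unchanged."""
    return after / before

for name, (before, after) in PRECISE.items():
    print(f"12.04 {name}: {slowdown(before, after):.2f}x")
# Simple read on 12.04: 0.2485 / 0.1143 = 2.17x slower post-migration.
```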

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1100843/+subscriptions
