Subject: [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads
From: Juan Quintela @ 2022-05-10 22:42 UTC
  To: qemu-devel
  Cc: Juan Quintela, Dr. David Alan Gilbert, Eduardo Habkost, Peter Xu,
	Philippe Mathieu-Daudé,
	Yanan Wang, Leonardo Bras, Marcel Apfelbaum

In this version:
- document what protects each field in MultiFDRecv/SendParams
- calculate page_size once when we start the migration, and store it in
  a field
- same for page_count
- rebase to latest
- minor improvements here and there
- test on huge memory machines

Command line for all the tests:

gdb -q --ex "run" --args $QEMU \
	-name guest=$NAME,debug-threads=on \
	-m 16G \
	-smp 6 \
	-machine q35,accel=kvm,usb=off,dump-guest-core=off \
	-boot strict=on \
	-cpu host \
	-no-hpet \
	-rtc base=utc,driftfix=slew \
	-global kvm-pit.lost_tick_policy=delay \
	-global ICH9-LPC.disable_s3=1 \
	-global ICH9-LPC.disable_s4=1 \
	-device pcie-root-port,id=root.1,chassis=1,addr=0x2.0,multifunction=on \
	-device pcie-root-port,id=root.2,chassis=2,addr=0x2.1 \
	-device pcie-root-port,id=root.3,chassis=3,addr=0x2.2 \
	-device pcie-root-port,id=root.4,chassis=4,addr=0x2.3 \
	-device pcie-root-port,id=root.5,chassis=5,addr=0x2.4 \
	-device pcie-root-port,id=root.6,chassis=6,addr=0x2.5 \
	-device pcie-root-port,id=root.7,chassis=7,addr=0x2.6 \
	-device pcie-root-port,id=root.8,chassis=8,addr=0x2.7 \
	-blockdev driver=file,node-name=storage0,filename=$FILE,auto-read-only=true,discard=unmap \
	-blockdev driver=qcow2,node-name=format0,read-only=false,file=storage0 \
	-device virtio-blk-pci,id=virtio-disk0,drive=format0,bootindex=1,bus=root.1 \
	-netdev tap,id=hostnet0,vhost=on,script=/etc/kvm-ifup,downscript=/etc/kvm-ifdown \
	-device virtio-net-pci,id=net0,netdev=hostnet0,mac=$MAC,bus=root.2 \
	-device virtio-serial-pci,id=virtio-serial0,bus=root.3 \
	-device virtio-balloon-pci,id=balloon0,bus=root.4 \
	$GRAPHICS \
	$CONSOLE \
	-device virtconsole,id=console0,chardev=charconsole0 \
	-uuid 9d3be7da-e1ff-41a0-ac39-8b2e04de2c19 \
	-nodefaults \
	-msg timestamp=on \
	-no-user-config \
	$MONITOR \
	$TRACE \
	-global migration.x-multifd=on \
	-global migration.multifd-channels=16 \
	-global migration.x-max-bandwidth=$BANDWIDTH
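
To be clear about how each round is driven: the destination gets the same
command line plus -incoming, and the source starts the migration from the
HMP monitor.  Roughly like this (the port and exact commands here are
illustrative, not copied from the test scripts):

# destination: same command line as above, plus
	-incoming tcp:0:4444

# source, from the HMP monitor:
(qemu) migrate -d tcp:localhost:4444
(qemu) info migrate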

All the tests were run on a single 12TB RAM machine, migrating over
localhost; I didn't have two machines with 4TB of RAM for testing.
Guests were running with 16GB, 1TB and 4TB of RAM.

tests run with:
- upstream multifd
- multifd + zero page
- precopy (only some of them)

tests done:
- idle clean guest (just booted guest)
- idle dirty guest (run a program to dirty all memory)
- test with stress (4 threads each dirtying 1GB RAM)

Executive summary (all times in ms)

16GB guest
                Precopy            upstream          zero page
                Time    Downtime   Time    Downtime  Time    Downtime
clean idle      1548     93         1359   48         866    167
dirty idle     16222    220         2092   371       1870    258
busy 4GB       doesn't converge    31000   308       1604    371

In the dirty idle case there is some weirdness on the precopy side: I
tried several times and it always took this much time.  It should be
faster.

In the busy 4GB case, precopy doesn't converge (expected), and without
zero page detection multifd is at the limit: it _almost_ doesn't
converge, needing 187 iterations to finish.

1TB
                Precopy            upstream          zero page
                Time    Downtime   Time    Downtime  Time    Downtime
clean idle     83174    381        72075   345       52966   273
dirty idle                        104587   381       75601   269
busy 2GB                           79912   345       58953   348

I only tried precopy for the clean idle case with 1TB.  Notice that it
is already significantly slower.  With 1TB of RAM, zero page is clearly
superior in all tests.

4TB
                upstream          zero page
                Time    Downtime  Time    Downtime
clean idle      317054  552       215567  500
dirty idle      357581  553       317428  744

The busy case here is similar to the 1TB guests, it just takes much more
time.

In conclusion, zero page detection in the multifd threads is anywhere
from a bit faster to much faster than anything else.

I add here the output of info migrate and perf for all the migration
rounds.  The important bit I found is that once we introduce zero
pages, migration spends all its time copying pages, which is where it
needs to be, not waiting for buffer_zero or similar.
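
The perf numbers were captured while each migration was running.
Something along these lines gives the same kind of per-thread breakdown
(the options are illustrative, not the exact ones from my scripts):

perf record -p $(pidof qemu-system-x86_64) -- sleep 30
perf report --sort comm,dso,symbol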

Upstream
--------

16GB test

idle

precopy

Migration status: completed
total time: 1548 ms
downtime: 93 ms
setup: 16 ms
transferred ram: 624798 kbytes
throughput: 3343.01 mbps
remaining ram: 0 kbytes
total ram: 16777992 kbytes
duplicate: 4048839 pages
skipped: 0 pages
normal: 147016 pages
normal bytes: 588064 kbytes
dirty sync count: 3
page size: 4 kbytes
multifd bytes: 0 kbytes
pages-per-second: 651825
precopy ram: 498490 kbytes
downtime ram: 126307 kbytes

  41.76%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
  14.68%  live_migration   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   9.53%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   5.72%  live_migration   qemu-system-x86_64       [.] add_to_iovec
   3.89%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
   2.50%  live_migration   qemu-system-x86_64       [.] qemu_put_byte.part.0
   2.45%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
   1.87%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
   1.28%  live_migration   qemu-system-x86_64       [.] qemu_put_be32
   1.03%  live_migration   qemu-system-x86_64       [.] find_next_bit
   0.95%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
   0.95%  live_migration   qemu-system-x86_64       [.] ram_save_iterate
   0.68%  live_migration   [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.67%  live_migration   qemu-system-x86_64       [.] kvm_log_clear
   0.56%  live_migration   qemu-system-x86_64       [.] save_zero_page_to_file.part.0
   0.51%  live_migration   qemu-system-x86_64       [.] qemu_put_byte
   0.43%  live_migration   [kernel.kallsyms]        [k] copy_page
   0.38%  live_migration   qemu-system-x86_64       [.] get_ptr_rcu_reader
   0.36%  live_migration   qemu-system-x86_64       [.] save_page_header
   0.33%  live_migration   [kernel.kallsyms]        [k] __memcg_kmem_charge_page
   0.33%  live_migration   qemu-system-x86_64       [.] runstate_is_running

upstream

Migration status: completed
total time: 1359 ms
downtime: 48 ms
setup: 35 ms
transferred ram: 603701 kbytes
throughput: 3737.66 mbps
remaining ram: 0 kbytes
total ram: 16777992 kbytes
duplicate: 4053362 pages
skipped: 0 pages
normal: 141517 pages
normal bytes: 566068 kbytes
dirty sync count: 3
page size: 4 kbytes
multifd bytes: 568076 kbytes
pages-per-second: 2039403
precopy ram: 35624 kbytes
downtime ram: 1 kbytes

  36.03%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
   9.32%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   5.18%  live_migration   qemu-system-x86_64       [.] add_to_iovec
   4.15%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
   2.60%  live_migration   [kernel.kallsyms]        [k] copy_page
   2.30%  live_migration   qemu-system-x86_64       [.] qemu_put_byte.part.0
   2.24%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
   1.96%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
   1.30%  live_migration   qemu-system-x86_64       [.] qemu_put_be32
   1.12%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   1.00%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.94%  live_migration   qemu-system-x86_64       [.] find_next_bit
   0.93%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
   0.91%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.88%  live_migration   qemu-system-x86_64       [.] ram_save_iterate
   0.88%  live_migration   libc.so.6                [.] __pthread_mutex_unlock_usercnt
   0.81%  live_migration   qemu-system-x86_64       [.] kvm_log_clear
   0.81%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.79%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.75%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.72%  live_migration   libc.so.6                [.] __pthread_mutex_lock
   0.70%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.70%  live_migration   qemu-system-x86_64       [.] save_zero_page_to_file.part.0
   0.70%  qemu-system-x86  [kernel.kallsyms]        [k] perf_event_alloc
   0.69%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.68%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.67%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.66%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.64%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.63%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.63%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.60%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.53%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.47%  live_migration   qemu-system-x86_64       [.] qemu_put_byte

zero page

Migration status: completed
total time: 866 ms
downtime: 167 ms
setup: 42 ms
transferred ram: 14627983 kbytes
throughput: 145431.53 mbps
remaining ram: 0 kbytes
total ram: 16777992 kbytes
duplicate: 4024050 pages
skipped: 0 pages
normal: 143374 pages
normal bytes: 573496 kbytes
dirty sync count: 3
page size: 4 kbytes
multifd bytes: 14627983 kbytes
pages-per-second: 4786693
precopy ram: 11033626 kbytes
downtime ram: 3594356 kbytes

   6.84%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   4.06%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
   3.46%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
   2.39%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
   1.59%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   1.50%  multifdsend_3    qemu-system-x86_64       [.] buffer_zero_avx512
   1.48%  multifdsend_10   qemu-system-x86_64       [.] buffer_zero_avx512
   1.32%  multifdsend_12   qemu-system-x86_64       [.] buffer_zero_avx512
   1.29%  multifdsend_1    qemu-system-x86_64       [.] buffer_zero_avx512
   1.25%  live_migration   qemu-system-x86_64       [.] find_next_bit
   1.24%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   1.20%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   1.20%  multifdsend_13   qemu-system-x86_64       [.] buffer_zero_avx512
   1.18%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
   1.16%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   1.13%  live_migration   qemu-system-x86_64       [.] multifd_queue_page
   1.08%  multifdsend_0    qemu-system-x86_64       [.] buffer_zero_avx512
   1.06%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.94%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.92%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.91%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.90%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string

16GB guest

dirty

precopy

Migration status: completed
total time: 16222 ms
downtime: 220 ms
setup: 18 ms
transferred ram: 15927448 kbytes
throughput: 8052.38 mbps
remaining ram: 0 kbytes
total ram: 16777992 kbytes
duplicate: 222804 pages
skipped: 0 pages
normal: 3973611 pages
normal bytes: 15894444 kbytes
dirty sync count: 3
page size: 4 kbytes
multifd bytes: 0 kbytes
pages-per-second: 241728
precopy ram: 15670253 kbytes
downtime ram: 257194 kbytes

  38.22%  live_migration   [kernel.kallsyms]        [k] copy_page
  38.04%  live_migration   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.55%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
   2.45%  live_migration   [kernel.kallsyms]        [k] tcp_sendmsg_locked
   1.43%  live_migration   [kernel.kallsyms]        [k] free_pcp_prepare
   1.01%  live_migration   [kernel.kallsyms]        [k] _copy_from_iter
   0.79%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   0.79%  live_migration   [kernel.kallsyms]        [k] __list_del_entry_valid
   0.68%  live_migration   [kernel.kallsyms]        [k] check_new_pages
   0.64%  live_migration   qemu-system-x86_64       [.] add_to_iovec
   0.49%  live_migration   [kernel.kallsyms]        [k] skb_release_data
   0.39%  live_migration   [kernel.kallsyms]        [k] __skb_clone
   0.36%  live_migration   [kernel.kallsyms]        [k] total_mapcount
   0.34%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
   0.32%  live_migration   [kernel.kallsyms]        [k] __dev_queue_xmit
   0.29%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
   0.29%  live_migration   [kernel.kallsyms]        [k] __alloc_skb
   0.27%  live_migration   [kernel.kallsyms]        [k] __ip_queue_xmit
   0.26%  live_migration   [kernel.kallsyms]        [k] copy_user_generic_unrolled
   0.26%  live_migration   [kernel.kallsyms]        [k] __tcp_transmit_skb
   0.24%  live_migration   qemu-system-x86_64       [.] qemu_put_byte.part.0
   0.24%  live_migration   [kernel.kallsyms]        [k] skb_page_frag_refill

upstream

Migration status: completed
total time: 2092 ms
downtime: 371 ms
setup: 39 ms
transferred ram: 15929157 kbytes
throughput: 63562.98 mbps
remaining ram: 0 kbytes
total ram: 16777992 kbytes
duplicate: 224436 pages
skipped: 0 pages
normal: 3971430 pages
normal bytes: 15885720 kbytes
dirty sync count: 3
page size: 4 kbytes
multifd bytes: 15927184 kbytes
pages-per-second: 2441771
precopy ram: 1798 kbytes
downtime ram: 174 kbytes

  5.23%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
   4.93%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.92%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.84%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.56%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.55%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.53%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.48%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.43%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.43%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.33%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.21%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.19%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.13%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.01%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.86%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.83%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.90%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   0.70%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
   0.69%  live_migration   libc.so.6                [.] __pthread_mutex_unlock_usercnt
   0.62%  live_migration   libc.so.6                [.] __pthread_mutex_lock
   0.37%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
   0.29%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
   0.27%  live_migration   qemu-system-x86_64       [.] multifd_send_pages

zero page

Migration status: completed
total time: 1870 ms
downtime: 258 ms
setup: 36 ms
transferred ram: 16998097 kbytes
throughput: 75927.79 mbps
remaining ram: 0 kbytes
total ram: 16777992 kbytes
duplicate: 222485 pages
skipped: 0 pages
normal: 3915115 pages
normal bytes: 15660460 kbytes
dirty sync count: 3
page size: 4 kbytes
multifd bytes: 16998097 kbytes
pages-per-second: 2555169
precopy ram: 13929973 kbytes
downtime ram: 3068124 kbytes

   4.66%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.60%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.49%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.39%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.36%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.21%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.20%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.18%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.17%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.07%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.97%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.96%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.89%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.73%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.68%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.44%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.52%  live_migration   libc.so.6                [.] __pthread_mutex_unlock_usercnt
   2.09%  live_migration   libc.so.6                [.] __pthread_mutex_lock
   1.03%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   0.97%  multifdsend_3    [kernel.kallsyms]        [k] copy_page
   0.94%  live_migration   qemu-system-x86_64       [.] multifd_send_pages
   0.79%  live_migration   qemu-system-x86_64       [.] qemu_mutex_lock_impl
   0.73%  multifdsend_11   [kernel.kallsyms]        [k] copy_page
   0.70%  live_migration   qemu-system-x86_64       [.] qemu_mutex_unlock_impl
   0.45%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
   0.41%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable

16GB guest

stress --vm 4 --vm-bytes 1G --vm-keep

precopy

Doesn't converge

upstream

Migration status: completed
total time: 31800 ms
downtime: 308 ms
setup: 40 ms
transferred ram: 295540640 kbytes
throughput: 76230.23 mbps
remaining ram: 0 kbytes
total ram: 16777992 kbytes
duplicate: 3006674 pages
skipped: 0 pages
normal: 73686367 pages
normal bytes: 294745468 kbytes
dirty sync count: 187
page size: 4 kbytes
multifd bytes: 295514209 kbytes
pages-per-second: 2118000
precopy ram: 26430 kbytes

  7.79%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
   3.86%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.83%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.79%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.72%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.46%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.44%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.38%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.32%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.31%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.22%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.21%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.19%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.07%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.95%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.95%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.77%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   1.78%  live_migration   [kernel.kallsyms]        [k] kvm_set_pfn_dirty
   1.65%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   0.68%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
   0.62%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
   0.46%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
   0.41%  live_migration   [kernel.kallsyms]        [k] __handle_changed_spte
   0.40%  live_migration   [kernel.kallsyms]        [k] pfn_valid.part.0
   0.37%  live_migration   qemu-system-x86_64       [.] kvm_log_clear
   0.29%  CPU 2/KVM        [kernel.kallsyms]        [k] copy_page
   0.27%  live_migration   [kernel.kallsyms]        [k] clear_dirty_pt_masked
   0.27%  CPU 1/KVM        [kernel.kallsyms]        [k] copy_page
   0.26%  live_migration   [kernel.kallsyms]        [k] tdp_iter_next
   0.25%  CPU 1/KVM        [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
   0.24%  CPU 1/KVM        [kernel.kallsyms]        [k] mark_page_dirty_in_slot.part.0
   0.24%  CPU 2/KVM        [kernel.kallsyms]        [k] mark_page_dirty_in_slot.part.0

Zero page

Migration status: completed
total time: 1604 ms
downtime: 371 ms
setup: 32 ms
transferred ram: 20591268 kbytes
throughput: 107307.14 mbps
remaining ram: 0 kbytes
total ram: 16777992 kbytes
duplicate: 2984825 pages
skipped: 0 pages
normal: 2213496 pages
normal bytes: 8853984 kbytes
dirty sync count: 4
page size: 4 kbytes
multifd bytes: 20591268 kbytes
pages-per-second: 4659200
precopy ram: 15722803 kbytes
downtime ram: 4868465 kbytes

   3.21%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.92%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.86%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.81%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.80%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.79%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.78%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.73%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.73%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.69%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.62%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.60%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.59%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.58%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.55%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.38%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   1.44%  live_migration   libc.so.6                [.] __pthread_mutex_lock
   1.41%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   1.37%  live_migration   libc.so.6                [.] __pthread_mutex_unlock_usercnt
   0.80%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
   0.78%  CPU 4/KVM        [kernel.kallsyms]        [k] _raw_read_lock
   0.78%  CPU 2/KVM        [kernel.kallsyms]        [k] _raw_read_lock
   0.77%  CPU 4/KVM        [kernel.kallsyms]        [k] tdp_mmu_map_handle_target_level
   0.77%  CPU 2/KVM        [kernel.kallsyms]        [k] tdp_mmu_map_handle_target_level
   0.76%  CPU 5/KVM        [kernel.kallsyms]        [k] tdp_mmu_map_handle_target_level
   0.75%  live_migration   qemu-system-x86_64       [.] multifd_send_pages
   0.74%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
   0.73%  CPU 5/KVM        [kernel.kallsyms]        [k] _raw_read_lock
   0.67%  CPU 0/KVM        [kernel.kallsyms]        [k] copy_page
   0.62%  CPU 0/KVM        [kernel.kallsyms]        [k] tdp_mmu_map_handle_target_level
   0.62%  live_migration   qemu-system-x86_64       [.] qemu_mutex_lock_impl
   0.61%  CPU 0/KVM        [kernel.kallsyms]        [k] _raw_read_lock
   0.60%  CPU 2/KVM        [kernel.kallsyms]        [k] mark_page_dirty_in_slot.part.0
   0.58%  CPU 5/KVM        [kernel.kallsyms]        [k] mark_page_dirty_in_slot.part.0
   0.54%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
   0.53%  CPU 4/KVM        [kernel.kallsyms]        [k] mark_page_dirty_in_slot.part.0
   0.52%  CPU 0/KVM        [kernel.kallsyms]        [k] mark_page_dirty_in_slot.part.0
   0.49%  live_migration   [kernel.kallsyms]        [k] kvm_set_pfn_dirty

1TB guest

precopy

Migration status: completed
total time: 83147 ms
downtime: 381 ms
setup: 265 ms
transferred ram: 19565544 kbytes
throughput: 1933.88 mbps
remaining ram: 0 kbytes
total ram: 1073742600 kbytes
duplicate: 264135334 pages
skipped: 0 pages
normal: 4302604 pages
normal bytes: 17210416 kbytes
dirty sync count: 3
page size: 4 kbytes
multifd bytes: 0 kbytes
pages-per-second: 412882
precopy ram: 19085615 kbytes
downtime ram: 479929 kbytes

  43.50%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
  11.27%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   8.33%  live_migration   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   7.47%  live_migration   qemu-system-x86_64       [.] add_to_iovec
   4.41%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
   3.42%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
   3.06%  live_migration   qemu-system-x86_64       [.] qemu_put_byte.part.0
   2.62%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
   1.78%  live_migration   qemu-system-x86_64       [.] qemu_put_be32
   1.43%  live_migration   qemu-system-x86_64       [.] find_next_bit
   1.13%  live_migration   qemu-system-x86_64       [.] ram_save_iterate
   1.12%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
   0.70%  live_migration   qemu-system-x86_64       [.] save_zero_page_to_file.part.0
   0.51%  live_migration   qemu-system-x86_64       [.] qemu_put_byte
   0.49%  live_migration   qemu-system-x86_64       [.] save_page_header
   0.48%  live_migration   qemu-system-x86_64       [.] qemu_put_be64
   0.40%  live_migration   qemu-system-x86_64       [.] migrate_postcopy_ram
   0.40%  live_migration   qemu-system-x86_64       [.] runstate_is_running
   0.35%  live_migration   [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.32%  live_migration   qemu-system-x86_64       [.] get_ptr_rcu_reader
   0.30%  live_migration   qemu-system-x86_64       [.] qemu_file_rate_limit
   0.30%  live_migration   qemu-system-x86_64       [.] migrate_use_xbzrle
   0.27%  live_migration   [kernel.kallsyms]        [k] __memcg_kmem_charge_page
   0.26%  live_migration   qemu-system-x86_64       [.] migrate_use_compression
   0.25%  live_migration   qemu-system-x86_64       [.] kvm_log_clear
   0.25%  live_migration   qemu-system-x86_64       [.] qemu_file_get_error

upstream

Migration status: completed
total time: 72075 ms
downtime: 345 ms
setup: 287 ms
transferred ram: 19601046 kbytes
throughput: 2236.79 mbps
remaining ram: 0 kbytes
total ram: 1073742600 kbytes
duplicate: 264134669 pages
normal: 4301611 pages
normal bytes: 17206444 kbytes
dirty sync count: 3
page size: 4 kbytes
multifd bytes: 17279539 kbytes
pages-per-second: 2458584
precopy ram: 2321505 kbytes
downtime ram: 1 kbytes

 39.09%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
  10.85%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   6.92%  live_migration   qemu-system-x86_64       [.] add_to_iovec
   4.41%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
   2.87%  live_migration   qemu-system-x86_64       [.] qemu_put_byte.part.0
   2.63%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
   2.54%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
   1.70%  live_migration   qemu-system-x86_64       [.] qemu_put_be32
   1.31%  live_migration   qemu-system-x86_64       [.] find_next_bit
   1.11%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
   1.05%  live_migration   qemu-system-x86_64       [.] ram_save_iterate
   0.80%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.79%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.78%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.78%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.76%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.75%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.75%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.73%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.73%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.72%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.72%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.71%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.71%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.69%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.66%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.65%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.63%  live_migration   qemu-system-x86_64       [.] save_zero_page_to_file.part.0
   0.53%  live_migration   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.48%  live_migration   qemu-system-x86_64       [.] qemu_put_byte
   0.44%  live_migration   qemu-system-x86_64       [.] save_page_header
   0.44%  live_migration   qemu-system-x86_64       [.] qemu_put_be64
   0.39%  live_migration   qemu-system-x86_64       [.] migrate_postcopy_ram
   0.36%  live_migration   qemu-system-x86_64       [.] runstate_is_running
   0.33%  live_migration   qemu-system-x86_64       [.] get_ptr_rcu_reader
   0.28%  live_migration   [kernel.kallsyms]        [k] __memcg_kmem_charge_page
   0.27%  live_migration   qemu-system-x86_64       [.] migrate_use_compression
   0.26%  live_migration   qemu-system-x86_64       [.] qemu_file_rate_limit
   0.26%  live_migration   qemu-system-x86_64       [.] migrate_use_xbzrle
   0.24%  live_migration   qemu-system-x86_64       [.] qemu_file_get_error
   0.21%  live_migration   qemu-system-x86_64       [.] kvm_log_clear
   0.21%  live_migration   qemu-system-x86_64       [.] ram_transferred_add
   0.20%  live_migration   [kernel.kallsyms]        [k] try_charge_memcg
   0.19%  live_migration   qemu-system-x86_64       [.] ram_control_save_page
   0.18%  live_migration   qemu-system-x86_64       [.] buffer_is_zero
   0.18%  live_migration   qemu-system-x86_64       [.] cpu_physical_memory_set_dirty_lebitmap
   0.12%  live_migration   qemu-system-x86_64       [.] qemu_ram_pagesize
   0.11%  live_migration   [kernel.kallsyms]        [k] sync_regs
   0.11%  live_migration   libc.so.6                [.] __pthread_mutex_unlock_usercnt
   0.11%  live_migration   [kernel.kallsyms]        [k] clear_page_erms
   0.11%  live_migration   [kernel.kallsyms]        [k] kernel_init_free_pages.part.0
   0.11%  live_migration   qemu-system-x86_64       [.] migrate_background_snapshot
   0.10%  live_migration   qemu-system-x86_64       [.] migrate_release_ram
   0.10%  live_migration   [kernel.kallsyms]        [k] pte_alloc_one
   0.10%  live_migration   libc.so.6                [.] __pthread_mutex_lock
   0.10%  live_migration   [kernel.kallsyms]        [k] native_irq_return_iret
   0.08%  live_migration   [kernel.kallsyms]        [k] kvm_clear_dirty_log_protect
   0.07%  qemu-system-x86  [kernel.kallsyms]        [k] free_pcp_prepare
   0.06%  qemu-system-x86  [kernel.kallsyms]        [k] __free_pages
   0.06%  live_migration   [kernel.kallsyms]        [k] tdp_iter_next
   0.05%  live_migration   qemu-system-x86_64       [.] cpu_physical_memory_sync_dirty_bitmap.con
   0.05%  live_migration   [kernel.kallsyms]        [k] __list_del_entry_valid
   0.05%  live_migration   [kernel.kallsyms]        [k] _raw_spin_lock_irqsave
   0.05%  multifdsend_2    [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
   0.05%  multifdsend_11   [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
   0.05%  live_migration   [vdso]                   [.] 0x00000000000006f5
   0.05%  multifdsend_15   [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
   0.04%  multifdsend_1    [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
   0.04%  multifdsend_13   [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
   0.04%  multifdsend_4    [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
   0.04%  multifdsend_8    [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
   0.04%  live_migration   qemu-system-x86_64       [.] multifd_send_pages
   0.04%  multifdsend_0    [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
   0.04%  multifdsend_9    [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
   0.04%  multifdsend_14   [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
   0.04%  live_migration   [kernel.kallsyms]        [k] kvm_arch_mmu_enable_log_dirty_pt_masked
   0.04%  live_migration   [kernel.kallsyms]        [k] obj_cgroup_charge_pages
   0.04%  multifdsend_7    [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
   0.04%  multifdsend_12   [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
   0.04%  multifdsend_5    [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
   0.04%  multifdsend_10   [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
   0.04%  live_migration   [kernel.kallsyms]        [k] _raw_spin_lock
   0.04%  live_migration   qemu-system-x86_64       [.] qemu_mutex_unlock_impl

1TB idle, zero page

Migration status: completed
total time: 52966 ms
downtime: 409 ms
setup: 273 ms
transferred ram: 879229325 kbytes
throughput: 136690.83 mbps
remaining ram: 0 kbytes
total ram: 1073742600 kbytes
duplicate: 262093359 pages
skipped: 0 pages
normal: 4266123 pages
normal bytes: 17064492 kbytes
dirty sync count: 3
page size: 4 kbytes
multifd bytes: 879229317 kbytes
pages-per-second: 4024470
precopy ram: 874888589 kbytes
downtime ram: 4340735 kbytes

  14.42%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   2.97%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
   2.56%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
   2.50%  live_migration   qemu-system-x86_64       [.] multifd_queue_page
   2.30%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
   1.17%  live_migration   qemu-system-x86_64       [.] find_next_bit
   1.12%  multifdsend_14   qemu-system-x86_64       [.] buffer_zero_avx512
   1.09%  live_migration   qemu-system-x86_64       [.] multifd_send_pages
   1.08%  multifdsend_15   qemu-system-x86_64       [.] buffer_zero_avx512
   1.07%  multifdsend_11   qemu-system-x86_64       [.] buffer_zero_avx512
   1.03%  multifdsend_1    qemu-system-x86_64       [.] buffer_zero_avx512
   1.03%  multifdsend_0    qemu-system-x86_64       [.] buffer_zero_avx512
   1.03%  multifdsend_7    qemu-system-x86_64       [.] buffer_zero_avx512
   1.03%  multifdsend_4    qemu-system-x86_64       [.] buffer_zero_avx512
   1.02%  multifdsend_2    qemu-system-x86_64       [.] buffer_zero_avx512
   1.02%  multifdsend_10   qemu-system-x86_64       [.] buffer_zero_avx512
   1.02%  multifdsend_9    qemu-system-x86_64       [.] buffer_zero_avx512
   1.02%  multifdsend_8    qemu-system-x86_64       [.] buffer_zero_avx512
   1.01%  multifdsend_6    qemu-system-x86_64       [.] buffer_zero_avx512
   1.00%  multifdsend_5    qemu-system-x86_64       [.] buffer_zero_avx512
   0.99%  live_migration   libc.so.6                [.] __pthread_mutex_lock
   0.98%  multifdsend_13   qemu-system-x86_64       [.] buffer_zero_avx512
   0.98%  multifdsend_3    qemu-system-x86_64       [.] buffer_zero_avx512
   0.93%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
   0.93%  multifdsend_12   qemu-system-x86_64       [.] buffer_zero_avx512
   0.89%  live_migration   [kernel.kallsyms]        [k] futex_wake
   0.83%  live_migration   libc.so.6                [.] __pthread_mutex_unlock_usercnt
   0.70%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.69%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string

1TB: stress --vm 4 --vm-bytes 512M

Wait until the load in the guest reaches 3 before starting the migration.

upstream

Migration status: completed
total time: 79912 ms
downtime: 345 ms
setup: 300 ms
transferred ram: 23723877 kbytes
throughput: 2441.21 mbps
remaining ram: 0 kbytes
total ram: 1073742600 kbytes
duplicate: 263616778 pages
normal: 5330059 pages
normal bytes: 21320236 kbytes
dirty sync count: 4
page size: 4 kbytes
multifd bytes: 21406921 kbytes
pages-per-second: 2301580
precopy ram: 2316947 kbytes
downtime ram: 9 kbytes

  38.87%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
   9.14%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   5.84%  live_migration   qemu-system-x86_64       [.] add_to_iovec
   3.80%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
   2.41%  live_migration   qemu-system-x86_64       [.] qemu_put_byte.part.0
   2.14%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
   2.10%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
   1.44%  live_migration   qemu-system-x86_64       [.] qemu_put_be32
   1.17%  live_migration   qemu-system-x86_64       [.] find_next_bit
   0.95%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
   0.91%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.89%  live_migration   qemu-system-x86_64       [.] ram_save_iterate
   0.88%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.87%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.84%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.84%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.80%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.79%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.79%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.78%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.78%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.78%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.77%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.76%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.75%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.74%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.70%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.66%  live_migration   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.58%  live_migration   qemu-system-x86_64       [.] save_zero_page_to_file.part.0
   0.45%  live_migration   qemu-system-x86_64       [.] kvm_log_clear

zero page

Migration status: completed
total time: 58953 ms
downtime: 373 ms
setup: 348 ms
transferred ram: 972143021 kbytes
throughput: 135889.41 mbps
remaining ram: 0 kbytes
total ram: 1073742600 kbytes
duplicate: 261357013 pages
skipped: 0 pages
normal: 5293916 pages
normal bytes: 21175664 kbytes
dirty sync count: 4
page size: 4 kbytes
multifd bytes: 972143012 kbytes
pages-per-second: 3699692
precopy ram: 968625243 kbytes
downtime ram: 3517778 kbytes

 12.91%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   2.85%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
   2.16%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
   2.05%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
   1.17%  live_migration   qemu-system-x86_64       [.] multifd_queue_page
   1.13%  multifdsend_4    qemu-system-x86_64       [.] buffer_zero_avx512
   1.12%  multifdsend_1    qemu-system-x86_64       [.] buffer_zero_avx512
   1.08%  live_migration   qemu-system-x86_64       [.] find_next_bit
   1.07%  multifdsend_14   qemu-system-x86_64       [.] buffer_zero_avx512
   1.07%  multifdsend_15   qemu-system-x86_64       [.] buffer_zero_avx512
   1.06%  multifdsend_2    qemu-system-x86_64       [.] buffer_zero_avx512
   1.06%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
   1.06%  live_migration   qemu-system-x86_64       [.] multifd_send_pages
   1.04%  multifdsend_9    qemu-system-x86_64       [.] buffer_zero_avx512
   1.04%  multifdsend_0    qemu-system-x86_64       [.] buffer_zero_avx512
   1.04%  multifdsend_3    qemu-system-x86_64       [.] buffer_zero_avx512
   1.03%  multifdsend_11   qemu-system-x86_64       [.] buffer_zero_avx512
   1.01%  multifdsend_5    qemu-system-x86_64       [.] buffer_zero_avx512
   0.99%  multifdsend_7    qemu-system-x86_64       [.] buffer_zero_avx512
   0.98%  multifdsend_6    qemu-system-x86_64       [.] buffer_zero_avx512
   0.98%  multifdsend_8    qemu-system-x86_64       [.] buffer_zero_avx512
   0.95%  multifdsend_13   qemu-system-x86_64       [.] buffer_zero_avx512
   0.94%  multifdsend_12   qemu-system-x86_64       [.] buffer_zero_avx512
   0.92%  multifdsend_10   qemu-system-x86_64       [.] buffer_zero_avx512
   0.89%  live_migration   libc.so.6                [.] __pthread_mutex_unlock_usercnt
   0.85%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.84%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.84%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.81%  live_migration   libc.so.6                [.] __pthread_mutex_lock

1TB: stress --vm 4 --vm-bytes 1024M

upstream

Migration status: completed
total time: 79302 ms
downtime: 315 ms
setup: 307 ms
transferred ram: 30307307 kbytes
throughput: 3142.99 mbps
remaining ram: 0 kbytes
total ram: 1073742600 kbytes
duplicate: 263089198 pages
skipped: 0 pages
normal: 6972933 pages
normal bytes: 27891732 kbytes
dirty sync count: 7
page size: 4 kbytes
multifd bytes: 27994987 kbytes
pages-per-second: 1875902
precopy ram: 2312314 kbytes
downtime ram: 4 kbytes

  35.46%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
   9.27%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   6.02%  live_migration   qemu-system-x86_64       [.] add_to_iovec
   3.68%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
   2.64%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
   2.51%  live_migration   qemu-system-x86_64       [.] qemu_put_byte.part.0
   2.31%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
   1.46%  live_migration   qemu-system-x86_64       [.] qemu_put_be32
   1.23%  live_migration   qemu-system-x86_64       [.] find_next_bit
   1.05%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   1.03%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   1.01%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   1.01%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   1.01%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   1.00%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.99%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.99%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.99%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.96%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
   0.95%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.93%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.91%  live_migration   qemu-system-x86_64       [.] ram_save_iterate
   0.90%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.87%  live_migration   qemu-system-x86_64       [.] kvm_log_clear
   0.87%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.82%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.82%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.65%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.58%  live_migration   qemu-system-x86_64       [.] save_zero_page_to_file.part.0
   0.47%  live_migration   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string

zero_page

900GB dirty + idle

mig_mon mm_dirty -m 10000 -p once

upstream

Migration status: completed
total time: 104587 ms
downtime: 381 ms
setup: 311 ms
transferred ram: 943318066 kbytes
throughput: 74107.80 mbps
remaining ram: 0 kbytes
total ram: 1073742600 kbytes
duplicate: 33298094 pages
skipped: 0 pages
normal: 235142522 pages
normal bytes: 940570088 kbytes
dirty sync count: 3
page size: 4 kbytes
multifd bytes: 943025391 kbytes
pages-per-second: 3331126
precopy ram: 292673 kbytes
downtime ram: 1 kbytes

  7.71%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
   4.55%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.48%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.36%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.36%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.31%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.29%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.27%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.23%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.17%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.06%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.94%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.89%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.59%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.25%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.12%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   2.72%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   1.54%  live_migration   [kernel.kallsyms]        [k] copy_page
   1.39%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   0.86%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
   0.50%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
   0.49%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
   0.26%  multifdsend_7    [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.25%  multifdsend_4    [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.25%  multifdsend_10   [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.25%  multifdsend_9    [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.25%  multifdsend_15   [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.24%  multifdsend_12   [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.23%  multifdsend_5    [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.23%  multifdsend_0    [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.23%  multifdsend_3    [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.21%  multifdsend_14   [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.18%  live_migration   qemu-system-x86_64       [.] find_next_bit

zero page

Migration status: completed
total time: 75601 ms
downtime: 427 ms
setup: 269 ms
transferred ram: 1083999214 kbytes
throughput: 117879.85 mbps
remaining ram: 0 kbytes
total ram: 1073742600 kbytes
duplicate: 32991750 pages
skipped: 0 pages
normal: 232638485 pages
normal bytes: 930553940 kbytes
dirty sync count: 3
page size: 4 kbytes
multifd bytes: 1083999202 kbytes
pages-per-second: 3669333
precopy ram: 1080197079 kbytes
downtime ram: 3802134 kbytes

   4.41%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.38%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.37%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.32%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.29%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.29%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.28%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.27%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.16%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.09%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.07%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.07%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.07%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.07%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.07%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.07%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   1.59%  live_migration   libc.so.6                [.] __pthread_mutex_unlock_usercnt
   1.59%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   1.39%  live_migration   libc.so.6                [.] __pthread_mutex_lock
   0.80%  live_migration   qemu-system-x86_64       [.] multifd_send_pages
   0.65%  multifdsend_14   [kernel.kallsyms]        [k] copy_page
   0.63%  multifdsend_1    [kernel.kallsyms]        [k] copy_page
   0.58%  live_migration   qemu-system-x86_64       [.] qemu_mutex_lock_impl
   0.48%  live_migration   qemu-system-x86_64       [.] qemu_mutex_unlock_impl
   0.40%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
   0.29%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
   0.26%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable

4TB idle

upstream

Migration status: completed
total time: 317054 ms
downtime: 552 ms
setup: 1045 ms
transferred ram: 77208692 kbytes
throughput: 2001.52 mbps
remaining ram: 0 kbytes
total ram: 4294968072 kbytes
duplicate: 1056844269 pages
skipped: 0 pages
normal: 16904683 pages
normal bytes: 67618732 kbytes
dirty sync count: 3
page size: 4 kbytes
multifd bytes: 67919974 kbytes
pages-per-second: 3477766
precopy ram: 9288715 kbytes
downtime ram: 2 kbytes

 44.27%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
  10.21%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   6.58%  live_migration   qemu-system-x86_64       [.] add_to_iovec
   4.25%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
   2.70%  live_migration   qemu-system-x86_64       [.] qemu_put_byte.part.0
   2.43%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
   2.34%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
   1.59%  live_migration   qemu-system-x86_64       [.] qemu_put_be32
   1.30%  live_migration   qemu-system-x86_64       [.] find_next_bit
   1.08%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
   0.98%  live_migration   qemu-system-x86_64       [.] ram_save_iterate
   0.78%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.74%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.70%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.68%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.67%  live_migration   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.66%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.66%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.64%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.62%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.61%  live_migration   qemu-system-x86_64       [.] save_zero_page_to_file.part.0
   0.56%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.55%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.54%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.52%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.52%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.52%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.51%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.49%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.45%  live_migration   qemu-system-x86_64       [.] qemu_put_byte
   0.42%  live_migration   qemu-system-x86_64       [.] save_page_header
   0.41%  live_migration   qemu-system-x86_64       [.] qemu_put_be64
   0.35%  live_migration   qemu-system-x86_64       [.] migrate_postcopy_ram

zero_page

Migration status: completed
total time: 215567 ms
downtime: 500 ms
setup: 1040 ms
transferred ram: 3587151463 kbytes
throughput: 136980.19 mbps
remaining ram: 0 kbytes
total ram: 4294968072 kbytes
duplicate: 1048466740 pages
skipped: 0 pages
normal: 16747893 pages
normal bytes: 66991572 kbytes
dirty sync count: 3
page size: 4 kbytes
multifd bytes: 3587151430 kbytes
pages-per-second: 4104960
precopy ram: 3583004863 kbytes
downtime ram: 4146599 kbytes

  15.49%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   3.20%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
   2.67%  live_migration   qemu-system-x86_64       [.] multifd_queue_page
   2.33%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
   2.19%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
   1.19%  live_migration   qemu-system-x86_64       [.] find_next_bit
   1.18%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
   1.14%  live_migration   qemu-system-x86_64       [.] multifd_send_pages
   1.02%  multifdsend_10   qemu-system-x86_64       [.] buffer_zero_avx512
   1.01%  multifdsend_9    qemu-system-x86_64       [.] buffer_zero_avx512
   1.01%  multifdsend_8    qemu-system-x86_64       [.] buffer_zero_avx512
   1.00%  multifdsend_5    qemu-system-x86_64       [.] buffer_zero_avx512
   1.00%  multifdsend_3    qemu-system-x86_64       [.] buffer_zero_avx512
   1.00%  multifdsend_15   qemu-system-x86_64       [.] buffer_zero_avx512
   0.99%  multifdsend_2    qemu-system-x86_64       [.] buffer_zero_avx512
   0.99%  multifdsend_6    qemu-system-x86_64       [.] buffer_zero_avx512
   0.99%  multifdsend_14   qemu-system-x86_64       [.] buffer_zero_avx512
   0.99%  multifdsend_0    qemu-system-x86_64       [.] buffer_zero_avx512
   0.98%  multifdsend_13   qemu-system-x86_64       [.] buffer_zero_avx512
   0.97%  multifdsend_1    qemu-system-x86_64       [.] buffer_zero_avx512
   0.97%  multifdsend_7    qemu-system-x86_64       [.] buffer_zero_avx512
   0.96%  live_migration   [kernel.kallsyms]        [k] futex_wake
   0.96%  multifdsend_11   qemu-system-x86_64       [.] buffer_zero_avx512
   0.93%  multifdsend_4    qemu-system-x86_64       [.] buffer_zero_avx512
   0.88%  multifdsend_12   qemu-system-x86_64       [.] buffer_zero_avx512
   0.81%  live_migration   [kernel.kallsyms]        [k] send_call_function_single_ipi
   0.71%  live_migration   qemu-system-x86_64       [.] ram_save_iterate
   0.63%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string

4TB dirty + idle

    mig_mon mm_dirty -m 3900000 -p once

upstream

Migration status: completed
total time: 357581 ms
downtime: 553 ms
setup: 1295 ms
transferred ram: 4080035248 kbytes
throughput: 93811.30 mbps
remaining ram: 0 kbytes
total ram: 4294968072 kbytes
duplicate: 56507728 pages
skipped: 0 pages
normal: 1017239053 pages
normal bytes: 4068956212 kbytes
dirty sync count: 3
page size: 4 kbytes
multifd bytes: 4079538545 kbytes
pages-per-second: 3610116
precopy ram: 496701 kbytes
downtime ram: 2 kbytes

   5.07%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
   4.99%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.99%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.97%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.96%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.95%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.91%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.65%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.56%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.33%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.16%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.83%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.79%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.75%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.73%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.58%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   0.95%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   0.88%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
   0.36%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
   0.32%  multifdsend_4    [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.30%  multifdsend_5    [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.30%  multifdsend_2    [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.30%  multifdsend_0    [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.30%  multifdsend_9    [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.30%  multifdsend_7    [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.30%  multifdsend_10   [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.26%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
   0.22%  multifdsend_8    [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.22%  multifdsend_11   [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.19%  multifdsend_13   [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.19%  multifdsend_3    [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.17%  multifdsend_12   [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.15%  multifdsend_14   [kernel.kallsyms]        [k] tcp_sendmsg_locked
   0.14%  multifdsend_10   [kernel.kallsyms]        [k] _copy_from_iter

zero_page

Migration status: completed
total time: 317428 ms
downtime: 744 ms
setup: 1192 ms
transferred ram: 4340691359 kbytes
throughput: 112444.34 mbps
remaining ram: 0 kbytes
total ram: 4294968072 kbytes
duplicate: 55993692 pages
normal: 1005801180 pages
normal bytes: 4023204720 kbytes
dirty sync count: 3
page size: 4 kbytes
multifd bytes: 4340691312 kbytes
pages-per-second: 3417846
precopy ram: 4336921795 kbytes
downtime ram: 3769564 kbytes

   4.38%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.38%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.37%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.34%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.29%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.28%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.27%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.26%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.23%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.18%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   4.18%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.90%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.86%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.84%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.73%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   3.73%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
   1.59%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
   1.45%  live_migration   libc.so.6                [.] __pthread_mutex_unlock_usercnt
   1.28%  live_migration   libc.so.6                [.] __pthread_mutex_lock
   1.02%  multifdsend_8    [kernel.kallsyms]        [k] copy_page
   0.96%  multifdsend_15   [kernel.kallsyms]        [k] copy_page
   0.83%  multifdsend_14   [kernel.kallsyms]        [k] copy_page
   0.81%  multifdsend_7    [kernel.kallsyms]        [k] copy_page
   0.75%  multifdsend_0    [kernel.kallsyms]        [k] copy_page
   0.69%  live_migration   qemu-system-x86_64       [.] multifd_send_pages
   0.48%  live_migration   qemu-system-x86_64       [.] qemu_mutex_unlock_impl
   0.48%  live_migration   qemu-system-x86_64       [.] qemu_mutex_lock_impl

[v5]

In this version:
- Rebase to latest
- Address all comments
- statistics about zero pages are now correct (or at least much better than before)
- changed how we calculate the amount of transferred ram
- numbers, who doesn't like numbers?

Everything has been checked with a guest launched with the following
command.  Migration runs over localhost.  I will send numbers from
real hardware as soon as I get access to the machines that have it
(I checked with previous versions already, but not this one).

[removed example]

Please review, Juan.

[v4]
In this version:
- Rebase to latest
- Address all comments from previous versions
- Code cleanup

Please review.

[v2]
This is a rebase against the latest master.

The reason for the resend is to configure git-publish properly and
hope that this time git-publish sends all the patches.

Please, review.

[v1]
Since Friday's version:
- More cleanups on the code
- Remove repeated calls to qemu_target_page_size()
- Establish normal pages and zero pages
- Detect zero pages on the multifd threads
- Send zero pages through the multifd channels.
- Reviews by Richard addressed.

It passes migration-test, so it should be perfect O:+)

ToDo for next version:
- Check the version changes.
  I need 6.2 to be out to check for 7.0.
  This code doesn't exist at all for that reason.
- Send measurements of the differences

Please, review.

[

Friday version that just created a single writev instead of
write+writev.

]

Right now, multifd does a write() for the header and a writev() for
each group of pages.  Simplify it so we send the header as another
member of the IOV (see the sketch after this list).

Once there, I got several simplifications:
* is_zero_range() was used only once, just use its body.
* Same with is_zero_page().
* Be consistent and use the offset inside the ramblock everywhere.
* Now that we have the offsets inside the ramblock, we can drop the iov.
* Now that nothing uses iovs except the NOCOMP method, move the iovs
  from pages to methods.
* Now we can use iovs with a single field for zlib/zstd.
* The send_write() method is the same in all the implementations, so
  use it directly.
* Now we can use a single writev() to write everything.
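
To make that concrete, here is a minimal sketch of what the send path
ends up doing.  This is not the QEMU code: the struct and function
names are made up, error handling and short writes are omitted, and
the sizes are only illustrative.  The point is just that the packet
header becomes iov[0], the pages follow it, and one writev() pushes
everything:

#include <stddef.h>
#include <sys/uio.h>

struct sketch_channel {
    int fd;                     /* socket of this multifd channel */
    void *packet;               /* serialized packet header */
    size_t packet_len;
    struct iovec iov[1 + 128];  /* slot 0 is the header, then the pages */
    unsigned iovs_num;
};

/* Leave slot 0 free for the header; pages start at slot 1. */
static void sketch_reset(struct sketch_channel *p)
{
    p->iovs_num = 1;
}

/* Queue one guest page; only the iovec is filled, nothing is copied. */
static void sketch_queue_page(struct sketch_channel *p,
                              void *host_addr, size_t page_size)
{
    p->iov[p->iovs_num].iov_base = host_addr;
    p->iov[p->iovs_num].iov_len = page_size;
    p->iovs_num++;
}

/* Send header + pages with one syscall instead of write() + writev(). */
static ssize_t sketch_send(struct sketch_channel *p)
{
    p->iov[0].iov_base = p->packet;
    p->iov[0].iov_len = p->packet_len;
    return writev(p->fd, p->iov, p->iovs_num);
}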

ToDo: Move zero page detection to the multifd threads.

With RAM sizes in the terabytes, zero page detection takes too much
time on the main thread.

The last patch of the series removes the detection of zero pages in
the main thread for multifd.  In the next post of the series, I will
add how to detect the zero pages and send them over the multifd
channels.
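
For reference, the direction is roughly the following sketch.  All the
names here are made up; the real detection is based on QEMU's
buffer_is_zero() (the buffer_zero_avx512 hotspot visible in the
multifdsend threads in the profiles above) and on the normal/zero
arrays added later in this series:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for buffer_is_zero(); a plain byte-wise check is enough here. */
static bool sketch_buffer_is_zero(const uint8_t *buf, size_t len)
{
    return len == 0 || (buf[0] == 0 && memcmp(buf, buf + 1, len - 1) == 0);
}

/*
 * Run inside each multifd send thread: split the queued page offsets
 * into "zero" and "normal" sets, so the migration thread never has to
 * look at page contents and only the offsets of zero pages go on the
 * wire.
 */
static void sketch_classify(const uint8_t *host, const uint64_t *offsets,
                            unsigned num, size_t page_size,
                            uint64_t *normal, unsigned *normal_num,
                            uint64_t *zero, unsigned *zero_num)
{
    *normal_num = 0;
    *zero_num = 0;

    for (unsigned i = 0; i < num; i++) {
        if (sketch_buffer_is_zero(host + offsets[i], page_size)) {
            zero[(*zero_num)++] = offsets[i];
        } else {
            normal[(*normal_num)++] = offsets[i];
        }
    }
}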

Please review.

Later, Juan.

Juan Quintela (13):
  multifd: Document the locking of MultiFD{Send/Recv}Params
  multifd: Create page_size fields into both MultiFD{Recv,Send}Params
  multifd: Create page_count fields into both MultiFD{Recv,Send}Params
  migration: Export ram_transferred_ram()
  multifd: Count the number of bytes sent correctly
  migration: Make ram_save_target_page() a pointer
  multifd: Make flags field thread local
  multifd: Prepare to send a packet without the mutex held
  multifd: Add property to enable/disable zero_page
  migration: Export ram_release_page()
  multifd: Support for zero pages transmission
  multifd: Zero pages transmission
  migration: Use multifd before we check for the zero page

 migration/migration.h    |   3 +
 migration/multifd.h      | 118 +++++++++++++++++++++++----------
 migration/ram.h          |   3 +
 hw/core/machine.c        |   4 +-
 migration/migration.c    |  11 +++
 migration/multifd-zlib.c |  12 ++--
 migration/multifd-zstd.c |  12 ++--
 migration/multifd.c      | 140 ++++++++++++++++++++++++++++-----------
 migration/ram.c          |  48 +++++++++++---
 migration/trace-events   |   8 +--
 10 files changed, 258 insertions(+), 101 deletions(-)

-- 
2.35.1



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v6 01/13] multifd: Document the locking of MultiFD{Send/Recv}Params
  2022-05-10 22:42 [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads Juan Quintela
@ 2022-05-10 22:42 ` Juan Quintela
  2022-05-16 13:14   ` Dr. David Alan Gilbert
  2022-05-10 22:42 ` [PATCH v6 02/13] multifd: Create page_size fields into both MultiFD{Recv, Send}Params Juan Quintela
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 20+ messages in thread
From: Juan Quintela @ 2022-05-10 22:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Dr. David Alan Gilbert, Eduardo Habkost, Peter Xu,
	Philippe Mathieu-Daudé,
	Yanan Wang, Leonardo Bras, Marcel Apfelbaum

Reorder the structures so we can know if the fields are:
- Read only
- Their own locking (i.e. sems)
- Protected by 'mutex'
- Only for the multifd channel

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/multifd.h | 86 +++++++++++++++++++++++++++------------------
 1 file changed, 51 insertions(+), 35 deletions(-)

diff --git a/migration/multifd.h b/migration/multifd.h
index 7d0effcb03..f1f88c6737 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -65,7 +65,9 @@ typedef struct {
 } MultiFDPages_t;
 
 typedef struct {
-    /* this fields are not changed once the thread is created */
+    /* Fiields are only written at creating/deletion time */
+    /* No lock required for them, they are read only */
+
     /* channel number */
     uint8_t id;
     /* channel thread name */
@@ -74,37 +76,45 @@ typedef struct {
     QemuThread thread;
     /* communication channel */
     QIOChannel *c;
-    /* sem where to wait for more work */
-    QemuSemaphore sem;
-    /* this mutex protects the following parameters */
-    QemuMutex mutex;
-    /* is this channel thread running */
-    bool running;
-    /* should this thread finish */
-    bool quit;
     /* is the yank function registered */
     bool registered_yank;
+    /* packet allocated len */
+    uint32_t packet_len;
+
+    /* sem where to wait for more work */
+    QemuSemaphore sem;
+    /* syncs main thread and channels */
+    QemuSemaphore sem_sync;
+
+    /* this mutex protects the following parameters */
+    QemuMutex mutex;
+    /* is this channel thread running */
+    bool running;
+    /* should this thread finish */
+    bool quit;
+    /* multifd flags for each packet */
+    uint32_t flags;
+    /* global number of generated multifd packets */
+    uint64_t packet_num;
     /* thread has work to do */
     int pending_job;
-    /* array of pages to sent */
+    /* array of pages to sent.
+     * The owner of 'pages' depends of 'pending_job' value:
+     * pending_job == 0 -> migration_thread can use it.
+     * pending_job != 0 -> multifd_channel can use it.
+     */
     MultiFDPages_t *pages;
-    /* packet allocated len */
-    uint32_t packet_len;
+
+    /* thread local variables. No locking required */
+
     /* pointer to the packet */
     MultiFDPacket_t *packet;
-    /* multifd flags for each packet */
-    uint32_t flags;
     /* size of the next packet that contains pages */
     uint32_t next_packet_size;
-    /* global number of generated multifd packets */
-    uint64_t packet_num;
-    /* thread local variables */
     /* packets sent through this channel */
     uint64_t num_packets;
     /* non zero pages sent through this channel */
     uint64_t total_normal_pages;
-    /* syncs main thread and channels */
-    QemuSemaphore sem_sync;
     /* buffers to send */
     struct iovec *iov;
     /* number of iovs used */
@@ -118,7 +128,9 @@ typedef struct {
 }  MultiFDSendParams;
 
 typedef struct {
-    /* this fields are not changed once the thread is created */
+    /* Fiields are only written at creating/deletion time */
+    /* No lock required for them, they are read only */
+
     /* channel number */
     uint8_t id;
     /* channel thread name */
@@ -127,31 +139,35 @@ typedef struct {
     QemuThread thread;
     /* communication channel */
     QIOChannel *c;
+    /* packet allocated len */
+    uint32_t packet_len;
+
+    /* syncs main thread and channels */
+    QemuSemaphore sem_sync;
+
     /* this mutex protects the following parameters */
     QemuMutex mutex;
     /* is this channel thread running */
     bool running;
     /* should this thread finish */
     bool quit;
+    /* multifd flags for each packet */
+    uint32_t flags;
+    /* global number of generated multifd packets */
+    uint64_t packet_num;
+
+    /* thread local variables. No locking required */
+
+    /* pointer to the packet */
+    MultiFDPacket_t *packet;
+    /* size of the next packet that contains pages */
+    uint32_t next_packet_size;
+    /* packets sent through this channel */
+    uint64_t num_packets;
     /* ramblock host address */
     uint8_t *host;
-    /* packet allocated len */
-    uint32_t packet_len;
-    /* pointer to the packet */
-    MultiFDPacket_t *packet;
-    /* multifd flags for each packet */
-    uint32_t flags;
-    /* global number of generated multifd packets */
-    uint64_t packet_num;
-    /* thread local variables */
-    /* size of the next packet that contains pages */
-    uint32_t next_packet_size;
-    /* packets sent through this channel */
-    uint64_t num_packets;
     /* non zero pages recv through this channel */
     uint64_t total_normal_pages;
-    /* syncs main thread and channels */
-    QemuSemaphore sem_sync;
     /* buffers to recv */
     struct iovec *iov;
     /* Pages that are not zero */
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 02/13] multifd: Create page_size fields into both MultiFD{Recv, Send}Params
  2022-05-10 22:42 [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads Juan Quintela
  2022-05-10 22:42 ` [PATCH v6 01/13] multifd: Document the locking of MultiFD{Send/Recv}Params Juan Quintela
@ 2022-05-10 22:42 ` Juan Quintela
  2022-05-17  8:44   ` [PATCH v6 02/13] multifd: Create page_size fields into both MultiFD{Recv,Send}Params Dr. David Alan Gilbert
  2022-05-10 22:42 ` [PATCH v6 03/13] multifd: Create page_count fields into both MultiFD{Recv, Send}Params Juan Quintela
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 20+ messages in thread
From: Juan Quintela @ 2022-05-10 22:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Dr. David Alan Gilbert, Eduardo Habkost, Peter Xu,
	Philippe Mathieu-Daudé,
	Yanan Wang, Leonardo Bras, Marcel Apfelbaum

We were calling qemu_target_page_size() left and right.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/multifd.h      |  4 ++++
 migration/multifd-zlib.c | 12 +++++-------
 migration/multifd-zstd.c | 12 +++++-------
 migration/multifd.c      | 18 ++++++++----------
 4 files changed, 22 insertions(+), 24 deletions(-)

diff --git a/migration/multifd.h b/migration/multifd.h
index f1f88c6737..4de80d9e53 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -80,6 +80,8 @@ typedef struct {
     bool registered_yank;
     /* packet allocated len */
     uint32_t packet_len;
+    /* guest page size */
+    uint32_t page_size;
 
     /* sem where to wait for more work */
     QemuSemaphore sem;
@@ -141,6 +143,8 @@ typedef struct {
     QIOChannel *c;
     /* packet allocated len */
     uint32_t packet_len;
+    /* guest page size */
+    uint32_t page_size;
 
     /* syncs main thread and channels */
     QemuSemaphore sem_sync;
diff --git a/migration/multifd-zlib.c b/migration/multifd-zlib.c
index 3a7ae44485..28349ff2e0 100644
--- a/migration/multifd-zlib.c
+++ b/migration/multifd-zlib.c
@@ -100,7 +100,6 @@ static void zlib_send_cleanup(MultiFDSendParams *p, Error **errp)
 static int zlib_send_prepare(MultiFDSendParams *p, Error **errp)
 {
     struct zlib_data *z = p->data;
-    size_t page_size = qemu_target_page_size();
     z_stream *zs = &z->zs;
     uint32_t out_size = 0;
     int ret;
@@ -114,7 +113,7 @@ static int zlib_send_prepare(MultiFDSendParams *p, Error **errp)
             flush = Z_SYNC_FLUSH;
         }
 
-        zs->avail_in = page_size;
+        zs->avail_in = p->page_size;
         zs->next_in = p->pages->block->host + p->normal[i];
 
         zs->avail_out = available;
@@ -220,12 +219,11 @@ static void zlib_recv_cleanup(MultiFDRecvParams *p)
 static int zlib_recv_pages(MultiFDRecvParams *p, Error **errp)
 {
     struct zlib_data *z = p->data;
-    size_t page_size = qemu_target_page_size();
     z_stream *zs = &z->zs;
     uint32_t in_size = p->next_packet_size;
     /* we measure the change of total_out */
     uint32_t out_size = zs->total_out;
-    uint32_t expected_size = p->normal_num * page_size;
+    uint32_t expected_size = p->normal_num * p->page_size;
     uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
     int ret;
     int i;
@@ -252,7 +250,7 @@ static int zlib_recv_pages(MultiFDRecvParams *p, Error **errp)
             flush = Z_SYNC_FLUSH;
         }
 
-        zs->avail_out = page_size;
+        zs->avail_out = p->page_size;
         zs->next_out = p->host + p->normal[i];
 
         /*
@@ -266,8 +264,8 @@ static int zlib_recv_pages(MultiFDRecvParams *p, Error **errp)
         do {
             ret = inflate(zs, flush);
         } while (ret == Z_OK && zs->avail_in
-                             && (zs->total_out - start) < page_size);
-        if (ret == Z_OK && (zs->total_out - start) < page_size) {
+                             && (zs->total_out - start) < p->page_size);
+        if (ret == Z_OK && (zs->total_out - start) < p->page_size) {
             error_setg(errp, "multifd %u: inflate generated too few output",
                        p->id);
             return -1;
diff --git a/migration/multifd-zstd.c b/migration/multifd-zstd.c
index d788d309f2..f4a8e1ed1f 100644
--- a/migration/multifd-zstd.c
+++ b/migration/multifd-zstd.c
@@ -113,7 +113,6 @@ static void zstd_send_cleanup(MultiFDSendParams *p, Error **errp)
 static int zstd_send_prepare(MultiFDSendParams *p, Error **errp)
 {
     struct zstd_data *z = p->data;
-    size_t page_size = qemu_target_page_size();
     int ret;
     uint32_t i;
 
@@ -128,7 +127,7 @@ static int zstd_send_prepare(MultiFDSendParams *p, Error **errp)
             flush = ZSTD_e_flush;
         }
         z->in.src = p->pages->block->host + p->normal[i];
-        z->in.size = page_size;
+        z->in.size = p->page_size;
         z->in.pos = 0;
 
         /*
@@ -241,8 +240,7 @@ static int zstd_recv_pages(MultiFDRecvParams *p, Error **errp)
 {
     uint32_t in_size = p->next_packet_size;
     uint32_t out_size = 0;
-    size_t page_size = qemu_target_page_size();
-    uint32_t expected_size = p->normal_num * page_size;
+    uint32_t expected_size = p->normal_num * p->page_size;
     uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
     struct zstd_data *z = p->data;
     int ret;
@@ -265,7 +263,7 @@ static int zstd_recv_pages(MultiFDRecvParams *p, Error **errp)
 
     for (i = 0; i < p->normal_num; i++) {
         z->out.dst = p->host + p->normal[i];
-        z->out.size = page_size;
+        z->out.size = p->page_size;
         z->out.pos = 0;
 
         /*
@@ -279,8 +277,8 @@ static int zstd_recv_pages(MultiFDRecvParams *p, Error **errp)
         do {
             ret = ZSTD_decompressStream(z->zds, &z->out, &z->in);
         } while (ret > 0 && (z->in.size - z->in.pos > 0)
-                         && (z->out.pos < page_size));
-        if (ret > 0 && (z->out.pos < page_size)) {
+                         && (z->out.pos < p->page_size));
+        if (ret > 0 && (z->out.pos < p->page_size)) {
             error_setg(errp, "multifd %u: decompressStream buffer too small",
                        p->id);
             return -1;
diff --git a/migration/multifd.c b/migration/multifd.c
index 9ea4f581e2..f15fed5f1f 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -87,15 +87,14 @@ static void nocomp_send_cleanup(MultiFDSendParams *p, Error **errp)
 static int nocomp_send_prepare(MultiFDSendParams *p, Error **errp)
 {
     MultiFDPages_t *pages = p->pages;
-    size_t page_size = qemu_target_page_size();
 
     for (int i = 0; i < p->normal_num; i++) {
         p->iov[p->iovs_num].iov_base = pages->block->host + p->normal[i];
-        p->iov[p->iovs_num].iov_len = page_size;
+        p->iov[p->iovs_num].iov_len = p->page_size;
         p->iovs_num++;
     }
 
-    p->next_packet_size = p->normal_num * page_size;
+    p->next_packet_size = p->normal_num * p->page_size;
     p->flags |= MULTIFD_FLAG_NOCOMP;
     return 0;
 }
@@ -139,7 +138,6 @@ static void nocomp_recv_cleanup(MultiFDRecvParams *p)
 static int nocomp_recv_pages(MultiFDRecvParams *p, Error **errp)
 {
     uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
-    size_t page_size = qemu_target_page_size();
 
     if (flags != MULTIFD_FLAG_NOCOMP) {
         error_setg(errp, "multifd %u: flags received %x flags expected %x",
@@ -148,7 +146,7 @@ static int nocomp_recv_pages(MultiFDRecvParams *p, Error **errp)
     }
     for (int i = 0; i < p->normal_num; i++) {
         p->iov[i].iov_base = p->host + p->normal[i];
-        p->iov[i].iov_len = page_size;
+        p->iov[i].iov_len = p->page_size;
     }
     return qio_channel_readv_all(p->c, p->iov, p->normal_num, errp);
 }
@@ -281,8 +279,7 @@ static void multifd_send_fill_packet(MultiFDSendParams *p)
 static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
 {
     MultiFDPacket_t *packet = p->packet;
-    size_t page_size = qemu_target_page_size();
-    uint32_t page_count = MULTIFD_PACKET_SIZE / page_size;
+    uint32_t page_count = MULTIFD_PACKET_SIZE / p->page_size;
     RAMBlock *block;
     int i;
 
@@ -344,7 +341,7 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
     for (i = 0; i < p->normal_num; i++) {
         uint64_t offset = be64_to_cpu(packet->offset[i]);
 
-        if (offset > (block->used_length - page_size)) {
+        if (offset > (block->used_length - p->page_size)) {
             error_setg(errp, "multifd: offset too long %" PRIu64
                        " (max " RAM_ADDR_FMT ")",
                        offset, block->used_length);
@@ -433,8 +430,7 @@ static int multifd_send_pages(QEMUFile *f)
     p->packet_num = multifd_send_state->packet_num++;
     multifd_send_state->pages = p->pages;
     p->pages = pages;
-    transferred = ((uint64_t) pages->num) * qemu_target_page_size()
-                + p->packet_len;
+    transferred = ((uint64_t) pages->num) * p->page_size + p->packet_len;
     qemu_file_update_transfer(f, transferred);
     ram_counters.multifd_bytes += transferred;
     ram_counters.transferred += transferred;
@@ -898,6 +894,7 @@ int multifd_save_setup(Error **errp)
         /* We need one extra place for the packet header */
         p->iov = g_new0(struct iovec, page_count + 1);
         p->normal = g_new0(ram_addr_t, page_count);
+        p->page_size = qemu_target_page_size();
         socket_send_channel_create(multifd_new_send_channel_async, p);
     }
 
@@ -1138,6 +1135,7 @@ int multifd_load_setup(Error **errp)
         p->name = g_strdup_printf("multifdrecv_%d", i);
         p->iov = g_new0(struct iovec, page_count);
         p->normal = g_new0(ram_addr_t, page_count);
+        p->page_size = qemu_target_page_size();
     }
 
     for (i = 0; i < thread_count; i++) {
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 03/13] multifd: Create page_count fields into both MultiFD{Recv, Send}Params
  2022-05-10 22:42 [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads Juan Quintela
  2022-05-10 22:42 ` [PATCH v6 01/13] multifd: Document the locking of MultiFD{Send/Recv}Params Juan Quintela
  2022-05-10 22:42 ` [PATCH v6 02/13] multifd: Create page_size fields into both MultiFD{Recv, Send}Params Juan Quintela
@ 2022-05-10 22:42 ` Juan Quintela
  2022-05-10 22:42 ` [PATCH v6 04/13] migration: Export ram_transferred_ram() Juan Quintela
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Juan Quintela @ 2022-05-10 22:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Dr. David Alan Gilbert, Eduardo Habkost, Peter Xu,
	Philippe Mathieu-Daudé,
	Yanan Wang, Leonardo Bras, Marcel Apfelbaum

We were recalculating it left and right.  We plan to change those
values in later patches.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/multifd.h | 4 ++++
 migration/multifd.c | 7 ++++---
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/migration/multifd.h b/migration/multifd.h
index 4de80d9e53..f707e2a8b8 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -82,6 +82,8 @@ typedef struct {
     uint32_t packet_len;
     /* guest page size */
     uint32_t page_size;
+    /* number of pages in a full packet */
+    uint32_t page_count;
 
     /* sem where to wait for more work */
     QemuSemaphore sem;
@@ -145,6 +147,8 @@ typedef struct {
     uint32_t packet_len;
     /* guest page size */
     uint32_t page_size;
+    /* number of pages in a full packet */
+    uint32_t page_count;
 
     /* syncs main thread and channels */
     QemuSemaphore sem_sync;
diff --git a/migration/multifd.c b/migration/multifd.c
index f15fed5f1f..893b90072d 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -279,7 +279,6 @@ static void multifd_send_fill_packet(MultiFDSendParams *p)
 static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
 {
     MultiFDPacket_t *packet = p->packet;
-    uint32_t page_count = MULTIFD_PACKET_SIZE / p->page_size;
     RAMBlock *block;
     int i;
 
@@ -306,10 +305,10 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
      * If we received a packet that is 100 times bigger than expected
      * just stop migration.  It is a magic number.
      */
-    if (packet->pages_alloc > page_count) {
+    if (packet->pages_alloc > p->page_count) {
         error_setg(errp, "multifd: received packet "
                    "with size %u and expected a size of %u",
-                   packet->pages_alloc, page_count) ;
+                   packet->pages_alloc, p->page_count) ;
         return -1;
     }
 
@@ -895,6 +894,7 @@ int multifd_save_setup(Error **errp)
         p->iov = g_new0(struct iovec, page_count + 1);
         p->normal = g_new0(ram_addr_t, page_count);
         p->page_size = qemu_target_page_size();
+        p->page_count = page_count;
         socket_send_channel_create(multifd_new_send_channel_async, p);
     }
 
@@ -1135,6 +1135,7 @@ int multifd_load_setup(Error **errp)
         p->name = g_strdup_printf("multifdrecv_%d", i);
         p->iov = g_new0(struct iovec, page_count);
         p->normal = g_new0(ram_addr_t, page_count);
+        p->page_count = page_count;
         p->page_size = qemu_target_page_size();
     }
 
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 04/13] migration: Export ram_transferred_ram()
  2022-05-10 22:42 [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads Juan Quintela
                   ` (2 preceding siblings ...)
  2022-05-10 22:42 ` [PATCH v6 03/13] multifd: Create page_count fields into both MultiFD{Recv, Send}Params Juan Quintela
@ 2022-05-10 22:42 ` Juan Quintela
  2022-05-10 22:42 ` [PATCH v6 05/13] multifd: Count the number of bytes sent correctly Juan Quintela
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Juan Quintela @ 2022-05-10 22:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Dr. David Alan Gilbert, Eduardo Habkost, Peter Xu,
	Philippe Mathieu-Daudé,
	Yanan Wang, Leonardo Bras, Marcel Apfelbaum

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/ram.h | 2 ++
 migration/ram.c | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/migration/ram.h b/migration/ram.h
index ded0a3a086..7b641adc55 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -65,6 +65,8 @@ int ram_load_postcopy(QEMUFile *f);
 
 void ram_handle_compressed(void *host, uint8_t ch, uint64_t size);
 
+void ram_transferred_add(uint64_t bytes);
+
 int ramblock_recv_bitmap_test(RAMBlock *rb, void *host_addr);
 bool ramblock_recv_bitmap_test_byte_offset(RAMBlock *rb, uint64_t byte_offset);
 void ramblock_recv_bitmap_set(RAMBlock *rb, void *host_addr);
diff --git a/migration/ram.c b/migration/ram.c
index a2489a2699..738769ba15 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -394,7 +394,7 @@ uint64_t ram_bytes_remaining(void)
 
 MigrationStats ram_counters;
 
-static void ram_transferred_add(uint64_t bytes)
+void ram_transferred_add(uint64_t bytes)
 {
     if (runstate_is_running()) {
         ram_counters.precopy_bytes += bytes;
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 05/13] multifd: Count the number of bytes sent correctly
  2022-05-10 22:42 [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads Juan Quintela
                   ` (3 preceding siblings ...)
  2022-05-10 22:42 ` [PATCH v6 04/13] migration: Export ram_transferred_ram() Juan Quintela
@ 2022-05-10 22:42 ` Juan Quintela
  2022-05-10 22:42 ` [PATCH v6 06/13] migration: Make ram_save_target_page() a pointer Juan Quintela
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Juan Quintela @ 2022-05-10 22:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Dr. David Alan Gilbert, Eduardo Habkost, Peter Xu,
	Philippe Mathieu-Daudé,
	Yanan Wang, Leonardo Bras, Marcel Apfelbaum

Current code assumes that all pages are whole.  That is already not
true for compression, for example.  Fix it by creating a new field
->sent_bytes that accounts for it.

All ram_counters are used only from the migration thread, so we have
two options:
- Take a mutex and fill everything in when we send it (not only
  ram_counters, but also qemu_file->xfer_bytes).
- Create a local variable that tracks how much has been sent through
  each channel.  And when we push another packet, we "add" the
  previous stats.

I chose option two because it means fewer changes overall.  In the
previous code we increased the transferred counters and then sent.
The current code goes the other way around: it sends the data and
updates the counters after the fact.  Notice that each channel can
have at most half a megabyte of data unaccounted for, so it is not
very important.
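
(For scale, assuming MULTIFD_PACKET_SIZE is 512 KiB: a full packet then
carries 512 KiB / 4 KiB = 128 target pages, which is where the "half a
megabyte" of temporarily unaccounted data per channel comes from.)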

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/multifd.h |  2 ++
 migration/multifd.c | 14 ++++++--------
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/migration/multifd.h b/migration/multifd.h
index f707e2a8b8..b29be5de06 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -100,6 +100,8 @@ typedef struct {
     uint32_t flags;
     /* global number of generated multifd packets */
     uint64_t packet_num;
+    /* How many bytes have we sent on the last packet */
+    uint64_t sent_bytes;
     /* thread has work to do */
     int pending_job;
     /* array of pages to sent.
diff --git a/migration/multifd.c b/migration/multifd.c
index 893b90072d..427cbe2ceb 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -394,7 +394,6 @@ static int multifd_send_pages(QEMUFile *f)
     static int next_channel;
     MultiFDSendParams *p = NULL; /* make happy gcc */
     MultiFDPages_t *pages = multifd_send_state->pages;
-    uint64_t transferred;
 
     if (qatomic_read(&multifd_send_state->exiting)) {
         return -1;
@@ -429,10 +428,10 @@ static int multifd_send_pages(QEMUFile *f)
     p->packet_num = multifd_send_state->packet_num++;
     multifd_send_state->pages = p->pages;
     p->pages = pages;
-    transferred = ((uint64_t) pages->num) * p->page_size + p->packet_len;
-    qemu_file_update_transfer(f, transferred);
-    ram_counters.multifd_bytes += transferred;
-    ram_counters.transferred += transferred;
+    ram_transferred_add(p->sent_bytes);
+    ram_counters.multifd_bytes += p->sent_bytes;
+    qemu_file_update_transfer(f, p->sent_bytes);
+    p->sent_bytes = 0;
     qemu_mutex_unlock(&p->mutex);
     qemu_sem_post(&p->sem);
 
@@ -590,9 +589,6 @@ void multifd_send_sync_main(QEMUFile *f)
         p->packet_num = multifd_send_state->packet_num++;
         p->flags |= MULTIFD_FLAG_SYNC;
         p->pending_job++;
-        qemu_file_update_transfer(f, p->packet_len);
-        ram_counters.multifd_bytes += p->packet_len;
-        ram_counters.transferred += p->packet_len;
         qemu_mutex_unlock(&p->mutex);
         qemu_sem_post(&p->sem);
     }
@@ -668,6 +664,8 @@ static void *multifd_send_thread(void *opaque)
             }
 
             qemu_mutex_lock(&p->mutex);
+            p->sent_bytes += p->packet_len;;
+            p->sent_bytes += p->next_packet_size;
             p->pending_job--;
             qemu_mutex_unlock(&p->mutex);
 
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 06/13] migration: Make ram_save_target_page() a pointer
  2022-05-10 22:42 [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads Juan Quintela
                   ` (4 preceding siblings ...)
  2022-05-10 22:42 ` [PATCH v6 05/13] multifd: Count the number of bytes sent correctly Juan Quintela
@ 2022-05-10 22:42 ` Juan Quintela
  2022-05-10 22:42 ` [PATCH v6 07/13] multifd: Make flags field thread local Juan Quintela
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Juan Quintela @ 2022-05-10 22:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Dr. David Alan Gilbert, Eduardo Habkost, Peter Xu,
	Philippe Mathieu-Daudé,
	Yanan Wang, Leonardo Bras, Marcel Apfelbaum

We are going to create a new function for multifd later in the series.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/ram.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 738769ba15..14269a2671 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -295,6 +295,9 @@ struct RAMSrcPageRequest {
     QSIMPLEQ_ENTRY(RAMSrcPageRequest) next_req;
 };
 
+typedef struct RAMState RAMState;
+typedef struct PageSearchStatus PageSearchStatus;
+
 /* State of RAM for migration */
 struct RAMState {
     /* QEMUFile used for this migration */
@@ -349,8 +352,8 @@ struct RAMState {
     /* Queue of outstanding page requests from the destination */
     QemuMutex src_page_req_mutex;
     QSIMPLEQ_HEAD(, RAMSrcPageRequest) src_page_requests;
+    int (*ram_save_target_page)(RAMState *rs, PageSearchStatus *pss);
 };
-typedef struct RAMState RAMState;
 
 static RAMState *ram_state;
 
@@ -2132,14 +2135,14 @@ static bool save_compress_page(RAMState *rs, RAMBlock *block, ram_addr_t offset)
 }
 
 /**
- * ram_save_target_page: save one target page
+ * ram_save_target_page_legacy: save one target page
  *
  * Returns the number of pages written
  *
  * @rs: current RAM state
  * @pss: data about the page we want to send
  */
-static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss)
+static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss)
 {
     RAMBlock *block = pss->block;
     ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS;
@@ -2214,7 +2217,7 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss)
     do {
         /* Check the pages is dirty and if it is send it */
         if (migration_bitmap_clear_dirty(rs, pss->block, pss->page)) {
-            tmppages = ram_save_target_page(rs, pss);
+            tmppages = rs->ram_save_target_page(rs, pss);
             if (tmppages < 0) {
                 return tmppages;
             }
@@ -2943,6 +2946,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
     ram_control_before_iterate(f, RAM_CONTROL_SETUP);
     ram_control_after_iterate(f, RAM_CONTROL_SETUP);
 
+    (*rsp)->ram_save_target_page = ram_save_target_page_legacy;
     multifd_send_sync_main(f);
     qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
     qemu_fflush(f);
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 07/13] multifd: Make flags field thread local
  2022-05-10 22:42 [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads Juan Quintela
                   ` (5 preceding siblings ...)
  2022-05-10 22:42 ` [PATCH v6 06/13] migration: Make ram_save_target_page() a pointer Juan Quintela
@ 2022-05-10 22:42 ` Juan Quintela
  2022-05-10 22:42 ` [PATCH v6 08/13] multifd: Prepare to send a packet without the mutex held Juan Quintela
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Juan Quintela @ 2022-05-10 22:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Dr. David Alan Gilbert, Eduardo Habkost, Peter Xu,
	Philippe Mathieu-Daudé,
	Yanan Wang, Leonardo Bras, Marcel Apfelbaum

Use of the flags field with respect to locking was inconsistent.  On
the sending side:
- It was set to 0 with the mutex held, on the multifd channel.
- MULTIFD_FLAG_SYNC was set with the mutex held, on the migration thread.
- Everything else was done without the mutex held, on the multifd channel.

On the reception side, it is not used by the migration thread, only by
the multifd channel threads.

So we move it to the thread-local variables of the multifd channel, and
introduce a new bool sync_needed on the send side to pass that information.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/multifd.h | 10 ++++++----
 migration/multifd.c | 23 +++++++++++++----------
 2 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/migration/multifd.h b/migration/multifd.h
index b29be5de06..ecf5a8e868 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -96,12 +96,12 @@ typedef struct {
     bool running;
     /* should this thread finish */
     bool quit;
-    /* multifd flags for each packet */
-    uint32_t flags;
     /* global number of generated multifd packets */
     uint64_t packet_num;
     /* How many bytes have we sent on the last packet */
     uint64_t sent_bytes;
+    /* Do we need to do an iteration sync */
+    bool sync_needed;
     /* thread has work to do */
     int pending_job;
     /* array of pages to sent.
@@ -115,6 +115,8 @@ typedef struct {
 
     /* pointer to the packet */
     MultiFDPacket_t *packet;
+    /* multifd flags for each packet */
+    uint32_t flags;
     /* size of the next packet that contains pages */
     uint32_t next_packet_size;
     /* packets sent through this channel */
@@ -161,8 +163,6 @@ typedef struct {
     bool running;
     /* should this thread finish */
     bool quit;
-    /* multifd flags for each packet */
-    uint32_t flags;
     /* global number of generated multifd packets */
     uint64_t packet_num;
 
@@ -170,6 +170,8 @@ typedef struct {
 
     /* pointer to the packet */
     MultiFDPacket_t *packet;
+    /* multifd flags for each packet */
+    uint32_t flags;
     /* size of the next packet that contains pages */
     uint32_t next_packet_size;
     /* packets sent through this channel */
diff --git a/migration/multifd.c b/migration/multifd.c
index 427cbe2ceb..3f9d9e3a56 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -587,7 +587,7 @@ void multifd_send_sync_main(QEMUFile *f)
         }
 
         p->packet_num = multifd_send_state->packet_num++;
-        p->flags |= MULTIFD_FLAG_SYNC;
+        p->sync_needed = true;
         p->pending_job++;
         qemu_mutex_unlock(&p->mutex);
         qemu_sem_post(&p->sem);
@@ -627,7 +627,11 @@ static void *multifd_send_thread(void *opaque)
 
         if (p->pending_job) {
             uint64_t packet_num = p->packet_num;
-            uint32_t flags = p->flags;
+            p->flags = 0;
+            if (p->sync_needed) {
+                p->flags |= MULTIFD_FLAG_SYNC;
+                p->sync_needed = false;
+            }
             p->iovs_num = 1;
             p->normal_num = 0;
 
@@ -644,14 +648,13 @@ static void *multifd_send_thread(void *opaque)
                 }
             }
             multifd_send_fill_packet(p);
-            p->flags = 0;
             p->num_packets++;
             p->total_normal_pages += p->normal_num;
             p->pages->num = 0;
             p->pages->block = NULL;
             qemu_mutex_unlock(&p->mutex);
 
-            trace_multifd_send(p->id, packet_num, p->normal_num, flags,
+            trace_multifd_send(p->id, packet_num, p->normal_num, p->flags,
                                p->next_packet_size);
 
             p->iov[0].iov_len = p->packet_len;
@@ -669,7 +672,7 @@ static void *multifd_send_thread(void *opaque)
             p->pending_job--;
             qemu_mutex_unlock(&p->mutex);
 
-            if (flags & MULTIFD_FLAG_SYNC) {
+            if (p->flags & MULTIFD_FLAG_SYNC) {
                 qemu_sem_post(&p->sem_sync);
             }
             qemu_sem_post(&multifd_send_state->channels_ready);
@@ -1042,7 +1045,7 @@ static void *multifd_recv_thread(void *opaque)
     rcu_register_thread();
 
     while (true) {
-        uint32_t flags;
+        bool sync_needed = false;
 
         if (p->quit) {
             break;
@@ -1064,11 +1067,11 @@ static void *multifd_recv_thread(void *opaque)
             break;
         }
 
-        flags = p->flags;
+        trace_multifd_recv(p->id, p->packet_num, p->normal_num, p->flags,
+                           p->next_packet_size);
+        sync_needed = p->flags & MULTIFD_FLAG_SYNC;
         /* recv methods don't know how to handle the SYNC flag */
         p->flags &= ~MULTIFD_FLAG_SYNC;
-        trace_multifd_recv(p->id, p->packet_num, p->normal_num, flags,
-                           p->next_packet_size);
         p->num_packets++;
         p->total_normal_pages += p->normal_num;
         qemu_mutex_unlock(&p->mutex);
@@ -1080,7 +1083,7 @@ static void *multifd_recv_thread(void *opaque)
             }
         }
 
-        if (flags & MULTIFD_FLAG_SYNC) {
+        if (sync_needed) {
             qemu_sem_post(&multifd_recv_state->sem_sync);
             qemu_sem_wait(&p->sem_sync);
         }
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 08/13] multifd: Prepare to send a packet without the mutex held
  2022-05-10 22:42 [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads Juan Quintela
                   ` (6 preceding siblings ...)
  2022-05-10 22:42 ` [PATCH v6 07/13] multifd: Make flags field thread local Juan Quintela
@ 2022-05-10 22:42 ` Juan Quintela
  2022-05-10 22:42 ` [PATCH v6 09/13] multifd: Add property to enable/disable zero_page Juan Quintela
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Juan Quintela @ 2022-05-10 22:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Dr. David Alan Gilbert, Eduardo Habkost, Peter Xu,
	Philippe Mathieu-Daudé,
	Yanan Wang, Leonardo Bras, Marcel Apfelbaum

We do the send_prepare() and the filling of the packet header without
the mutex held.  It will help a lot for compression and, later in the
series, for zero pages.

Notice that we can use p->pages without holding p->mutex because
p->pending_job == 1.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/multifd.h |  2 ++
 migration/multifd.c | 11 ++++++-----
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/migration/multifd.h b/migration/multifd.h
index ecf5a8e868..ed81f249d7 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -107,7 +107,9 @@ typedef struct {
     /* array of pages to sent.
      * The owner of 'pages' depends of 'pending_job' value:
      * pending_job == 0 -> migration_thread can use it.
+     *                     No need for mutex lock.
      * pending_job != 0 -> multifd_channel can use it.
+     *                     No need for mutex lock.
      */
     MultiFDPages_t *pages;
 
diff --git a/migration/multifd.c b/migration/multifd.c
index 3f9d9e3a56..c8f2caa2ae 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -632,6 +632,8 @@ static void *multifd_send_thread(void *opaque)
                 p->flags |= MULTIFD_FLAG_SYNC;
                 p->sync_needed = false;
             }
+            qemu_mutex_unlock(&p->mutex);
+
             p->iovs_num = 1;
             p->normal_num = 0;
 
@@ -648,11 +650,6 @@ static void *multifd_send_thread(void *opaque)
                 }
             }
             multifd_send_fill_packet(p);
-            p->num_packets++;
-            p->total_normal_pages += p->normal_num;
-            p->pages->num = 0;
-            p->pages->block = NULL;
-            qemu_mutex_unlock(&p->mutex);
 
             trace_multifd_send(p->id, packet_num, p->normal_num, p->flags,
                                p->next_packet_size);
@@ -667,6 +664,10 @@ static void *multifd_send_thread(void *opaque)
             }
 
             qemu_mutex_lock(&p->mutex);
+            p->num_packets++;
+            p->total_normal_pages += p->normal_num;
+            p->pages->num = 0;
+            p->pages->block = NULL;
             p->sent_bytes += p->packet_len;;
             p->sent_bytes += p->next_packet_size;
             p->pending_job--;
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 09/13] multifd: Add property to enable/disable zero_page
  2022-05-10 22:42 [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads Juan Quintela
                   ` (7 preceding siblings ...)
  2022-05-10 22:42 ` [PATCH v6 08/13] multifd: Prepare to send a packet without the mutex held Juan Quintela
@ 2022-05-10 22:42 ` Juan Quintela
  2022-05-10 22:42 ` [PATCH v6 10/13] migration: Export ram_release_page() Juan Quintela
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Juan Quintela @ 2022-05-10 22:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Dr. David Alan Gilbert, Eduardo Habkost, Peter Xu,
	Philippe Mathieu-Daudé,
	Yanan Wang, Leonardo Bras, Marcel Apfelbaum

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/migration.h |  3 +++
 hw/core/machine.c     |  4 +++-
 migration/migration.c | 11 +++++++++++
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/migration/migration.h b/migration/migration.h
index a863032b71..068e66ca9a 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -332,6 +332,8 @@ struct MigrationState {
      * This save hostname when out-going migration starts
      */
     char *hostname;
+    /* Use multifd channel to send zero pages */
+    bool multifd_zero_pages;
 };
 
 void migrate_set_state(int *state, int old_state, int new_state);
@@ -374,6 +376,7 @@ int migrate_multifd_channels(void);
 MultiFDCompression migrate_multifd_compression(void);
 int migrate_multifd_zlib_level(void);
 int migrate_multifd_zstd_level(void);
+bool migrate_use_multifd_zero_page(void);
 
 int migrate_use_xbzrle(void);
 uint64_t migrate_xbzrle_cache_size(void);
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 700c1e76b8..d02977d5df 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -37,7 +37,9 @@
 #include "hw/virtio/virtio.h"
 #include "hw/virtio/virtio-pci.h"
 
-GlobalProperty hw_compat_7_0[] = {};
+GlobalProperty hw_compat_7_0[] = {
+    { "migration", "multifd-zero-pages", "false" },
+};
 const size_t hw_compat_7_0_len = G_N_ELEMENTS(hw_compat_7_0);
 
 GlobalProperty hw_compat_6_2[] = {
diff --git a/migration/migration.c b/migration/migration.c
index 5a31b23bd6..7e591990ef 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2517,6 +2517,15 @@ bool migrate_use_multifd(void)
     return s->enabled_capabilities[MIGRATION_CAPABILITY_MULTIFD];
 }
 
+bool migrate_use_multifd_zero_page(void)
+{
+    MigrationState *s;
+
+    s = migrate_get_current();
+
+    return s->multifd_zero_pages;
+}
+
 bool migrate_pause_before_switchover(void)
 {
     MigrationState *s;
@@ -4164,6 +4173,8 @@ static Property migration_properties[] = {
                       clear_bitmap_shift, CLEAR_BITMAP_SHIFT_DEFAULT),
 
     /* Migration parameters */
+    DEFINE_PROP_BOOL("multifd-zero-pages", MigrationState,
+                      multifd_zero_pages, true),
     DEFINE_PROP_UINT8("x-compress-level", MigrationState,
                       parameters.compress_level,
                       DEFAULT_MIGRATE_COMPRESS_LEVEL),
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 10/13] migration: Export ram_release_page()
  2022-05-10 22:42 [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads Juan Quintela
                   ` (8 preceding siblings ...)
  2022-05-10 22:42 ` [PATCH v6 09/13] multifd: Add property to enable/disable zero_page Juan Quintela
@ 2022-05-10 22:42 ` Juan Quintela
  2022-05-10 22:42 ` [PATCH v6 11/13] multifd: Support for zero pages transmission Juan Quintela
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Juan Quintela @ 2022-05-10 22:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Dr. David Alan Gilbert, Eduardo Habkost, Peter Xu,
	Philippe Mathieu-Daudé,
	Yanan Wang, Leonardo Bras, Marcel Apfelbaum

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/ram.h | 1 +
 migration/ram.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/migration/ram.h b/migration/ram.h
index 7b641adc55..aee08de2a5 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -66,6 +66,7 @@ int ram_load_postcopy(QEMUFile *f);
 void ram_handle_compressed(void *host, uint8_t ch, uint64_t size);
 
 void ram_transferred_add(uint64_t bytes);
+void ram_release_page(const char *rbname, uint64_t offset);
 
 int ramblock_recv_bitmap_test(RAMBlock *rb, void *host_addr);
 bool ramblock_recv_bitmap_test_byte_offset(RAMBlock *rb, uint64_t byte_offset);
diff --git a/migration/ram.c b/migration/ram.c
index 14269a2671..34da0d71df 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1182,7 +1182,7 @@ static void migration_bitmap_sync_precopy(RAMState *rs)
     }
 }
 
-static void ram_release_page(const char *rbname, uint64_t offset)
+void ram_release_page(const char *rbname, uint64_t offset)
 {
     if (!migrate_release_ram() || !migration_in_postcopy()) {
         return;
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 11/13] multifd: Support for zero pages transmission
  2022-05-10 22:42 [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads Juan Quintela
                   ` (9 preceding siblings ...)
  2022-05-10 22:42 ` [PATCH v6 10/13] migration: Export ram_release_page() Juan Quintela
@ 2022-05-10 22:42 ` Juan Quintela
  2022-05-10 22:42 ` [PATCH v6 12/13] multifd: Zero " Juan Quintela
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Juan Quintela @ 2022-05-10 22:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Dr. David Alan Gilbert, Eduardo Habkost, Peter Xu,
	Philippe Mathieu-Daudé,
	Yanan Wang, Leonardo Bras, Marcel Apfelbaum

This patch adds the counters and related plumbing.  The logic will be
added in the following patch.

Signed-off-by: Juan Quintela <quintela@redhat.com>

---

Added counters for duplicated/non-duplicated pages.
Removed David's Reviewed-by.
Added total_zero_pages.
---
 migration/multifd.h    | 17 ++++++++++++++++-
 migration/multifd.c    | 36 +++++++++++++++++++++++++++++-------
 migration/ram.c        |  2 --
 migration/trace-events |  8 ++++----
 4 files changed, 49 insertions(+), 14 deletions(-)

diff --git a/migration/multifd.h b/migration/multifd.h
index ed81f249d7..3bb33eb4af 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -47,7 +47,10 @@ typedef struct {
     /* size of the next packet that contains pages */
     uint32_t next_packet_size;
     uint64_t packet_num;
-    uint64_t unused[4];    /* Reserved for future use */
+    /* zero pages */
+    uint32_t zero_pages;
+    uint32_t unused32[1];    /* Reserved for future use */
+    uint64_t unused64[3];    /* Reserved for future use */
     char ramblock[256];
     uint64_t offset[];
 } __attribute__((packed)) MultiFDPacket_t;
@@ -125,6 +128,8 @@ typedef struct {
     uint64_t num_packets;
     /* non zero pages sent through this channel */
     uint64_t total_normal_pages;
+    /* zero pages sent through this channel */
+    uint64_t total_zero_pages;
     /* buffers to send */
     struct iovec *iov;
     /* number of iovs used */
@@ -133,6 +138,10 @@ typedef struct {
     ram_addr_t *normal;
     /* num of non zero pages */
     uint32_t normal_num;
+    /* Pages that are  zero */
+    ram_addr_t *zero;
+    /* num of zero pages */
+    uint32_t zero_num;
     /* used for compression methods */
     void *data;
 }  MultiFDSendParams;
@@ -182,12 +191,18 @@ typedef struct {
     uint8_t *host;
     /* non zero pages recv through this channel */
     uint64_t total_normal_pages;
+    /* zero pages recv through this channel */
+    uint64_t total_zero_pages;
     /* buffers to recv */
     struct iovec *iov;
     /* Pages that are not zero */
     ram_addr_t *normal;
     /* num of non zero pages */
     uint32_t normal_num;
+    /* Pages that are  zero */
+    ram_addr_t *zero;
+    /* num of zero pages */
+    uint32_t zero_num;
     /* used for de-compression methods */
     void *data;
 } MultiFDRecvParams;
diff --git a/migration/multifd.c b/migration/multifd.c
index c8f2caa2ae..15a16686db 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -263,6 +263,7 @@ static void multifd_send_fill_packet(MultiFDSendParams *p)
     packet->normal_pages = cpu_to_be32(p->normal_num);
     packet->next_packet_size = cpu_to_be32(p->next_packet_size);
     packet->packet_num = cpu_to_be64(p->packet_num);
+    packet->zero_pages = cpu_to_be32(p->zero_num);
 
     if (p->pages->block) {
         strncpy(packet->ramblock, p->pages->block->idstr, 256);
@@ -323,7 +324,15 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
     p->next_packet_size = be32_to_cpu(packet->next_packet_size);
     p->packet_num = be64_to_cpu(packet->packet_num);
 
-    if (p->normal_num == 0) {
+    p->zero_num = be32_to_cpu(packet->zero_pages);
+    if (p->zero_num > packet->pages_alloc - p->normal_num) {
+        error_setg(errp, "multifd: received packet "
+                   "with %u zero pages and expected maximum pages are %u",
+                   p->zero_num, packet->pages_alloc - p->normal_num);
+        return -1;
+    }
+
+    if (p->normal_num == 0 && p->zero_num == 0) {
         return 0;
     }
 
@@ -432,6 +441,8 @@ static int multifd_send_pages(QEMUFile *f)
     ram_counters.multifd_bytes += p->sent_bytes;
     qemu_file_update_transfer(f, p->sent_bytes);
     p->sent_bytes = 0;
+    ram_counters.normal += p->normal_num;
+    ram_counters.duplicate += p->zero_num;
     qemu_mutex_unlock(&p->mutex);
     qemu_sem_post(&p->sem);
 
@@ -545,6 +556,8 @@ void multifd_save_cleanup(void)
         p->iov = NULL;
         g_free(p->normal);
         p->normal = NULL;
+        g_free(p->zero);
+        p->zero = NULL;
         multifd_send_state->ops->send_cleanup(p, &local_err);
         if (local_err) {
             migrate_set_error(migrate_get_current(), local_err);
@@ -636,6 +649,7 @@ static void *multifd_send_thread(void *opaque)
 
             p->iovs_num = 1;
             p->normal_num = 0;
+            p->zero_num = 0;
 
             for (int i = 0; i < p->pages->num; i++) {
                 p->normal[p->normal_num] = p->pages->offset[i];
@@ -651,8 +665,8 @@ static void *multifd_send_thread(void *opaque)
             }
             multifd_send_fill_packet(p);
 
-            trace_multifd_send(p->id, packet_num, p->normal_num, p->flags,
-                               p->next_packet_size);
+            trace_multifd_send(p->id, packet_num, p->normal_num, p->zero_num,
+                               p->flags, p->next_packet_size);
 
             p->iov[0].iov_len = p->packet_len;
             p->iov[0].iov_base = p->packet;
@@ -666,6 +680,7 @@ static void *multifd_send_thread(void *opaque)
             qemu_mutex_lock(&p->mutex);
             p->num_packets++;
             p->total_normal_pages += p->normal_num;
+            p->total_zero_pages += p->zero_num;
             p->pages->num = 0;
             p->pages->block = NULL;
             p->sent_bytes += p->packet_len;;
@@ -707,7 +722,8 @@ out:
     qemu_mutex_unlock(&p->mutex);
 
     rcu_unregister_thread();
-    trace_multifd_send_thread_end(p->id, p->num_packets, p->total_normal_pages);
+    trace_multifd_send_thread_end(p->id, p->num_packets, p->total_normal_pages,
+                                  p->total_zero_pages);
 
     return NULL;
 }
@@ -897,6 +913,7 @@ int multifd_save_setup(Error **errp)
         p->normal = g_new0(ram_addr_t, page_count);
         p->page_size = qemu_target_page_size();
         p->page_count = page_count;
+        p->zero = g_new0(ram_addr_t, page_count);
         socket_send_channel_create(multifd_new_send_channel_async, p);
     }
 
@@ -998,6 +1015,8 @@ int multifd_load_cleanup(Error **errp)
         p->iov = NULL;
         g_free(p->normal);
         p->normal = NULL;
+        g_free(p->zero);
+        p->zero = NULL;
         multifd_recv_state->ops->recv_cleanup(p);
     }
     qemu_sem_destroy(&multifd_recv_state->sem_sync);
@@ -1068,13 +1087,14 @@ static void *multifd_recv_thread(void *opaque)
             break;
         }
 
-        trace_multifd_recv(p->id, p->packet_num, p->normal_num, p->flags,
-                           p->next_packet_size);
+        trace_multifd_recv(p->id, p->packet_num, p->normal_num, p->zero_num,
+                           p->flags, p->next_packet_size);
         sync_needed = p->flags & MULTIFD_FLAG_SYNC;
         /* recv methods don't know how to handle the SYNC flag */
         p->flags &= ~MULTIFD_FLAG_SYNC;
         p->num_packets++;
         p->total_normal_pages += p->normal_num;
+        p->total_zero_pages += p->zero_num;
         qemu_mutex_unlock(&p->mutex);
 
         if (p->normal_num) {
@@ -1099,7 +1119,8 @@ static void *multifd_recv_thread(void *opaque)
     qemu_mutex_unlock(&p->mutex);
 
     rcu_unregister_thread();
-    trace_multifd_recv_thread_end(p->id, p->num_packets, p->total_normal_pages);
+    trace_multifd_recv_thread_end(p->id, p->num_packets, p->total_normal_pages,
+                                  p->total_zero_pages);
 
     return NULL;
 }
@@ -1139,6 +1160,7 @@ int multifd_load_setup(Error **errp)
         p->normal = g_new0(ram_addr_t, page_count);
         p->page_count = page_count;
         p->page_size = qemu_target_page_size();
+        p->zero = g_new0(ram_addr_t, page_count);
     }
 
     for (i = 0; i < thread_count; i++) {
diff --git a/migration/ram.c b/migration/ram.c
index 34da0d71df..0a91b87bd2 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1356,8 +1356,6 @@ static int ram_save_multifd_page(RAMState *rs, RAMBlock *block,
     if (multifd_queue_page(rs->f, block, offset) < 0) {
         return -1;
     }
-    ram_counters.normal++;
-
     return 1;
 }
 
diff --git a/migration/trace-events b/migration/trace-events
index 1aec580e92..d70e89dbb9 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -114,21 +114,21 @@ unqueue_page(char *block, uint64_t offset, bool dirty) "ramblock '%s' offset 0x%
 
 # multifd.c
 multifd_new_send_channel_async(uint8_t id) "channel %u"
-multifd_recv(uint8_t id, uint64_t packet_num, uint32_t used, uint32_t flags, uint32_t next_packet_size) "channel %u packet_num %" PRIu64 " pages %u flags 0x%x next packet size %u"
+multifd_recv(uint8_t id, uint64_t packet_num, uint32_t normal, uint32_t zero, uint32_t flags, uint32_t next_packet_size) "channel %u packet_num %" PRIu64 " normal pages %u zero pages %u flags 0x%x next packet size %u"
 multifd_recv_new_channel(uint8_t id) "channel %u"
 multifd_recv_sync_main(long packet_num) "packet num %ld"
 multifd_recv_sync_main_signal(uint8_t id) "channel %u"
 multifd_recv_sync_main_wait(uint8_t id) "channel %u"
 multifd_recv_terminate_threads(bool error) "error %d"
-multifd_recv_thread_end(uint8_t id, uint64_t packets, uint64_t pages) "channel %u packets %" PRIu64 " pages %" PRIu64
+multifd_recv_thread_end(uint8_t id, uint64_t packets, uint64_t normal_pages, uint64_t zero_pages) "channel %u packets %" PRIu64 " normal pages %" PRIu64 " zero pages %" PRIu64
 multifd_recv_thread_start(uint8_t id) "%u"
-multifd_send(uint8_t id, uint64_t packet_num, uint32_t normal, uint32_t flags, uint32_t next_packet_size) "channel %u packet_num %" PRIu64 " normal pages %u flags 0x%x next packet size %u"
+multifd_send(uint8_t id, uint64_t packet_num, uint32_t normal_pages, uint32_t zero_pages, uint32_t flags, uint32_t next_packet_size) "channel %u packet_num %" PRIu64 " normal pages %u zero pages %u flags 0x%x next packet size %u"
 multifd_send_error(uint8_t id) "channel %u"
 multifd_send_sync_main(long packet_num) "packet num %ld"
 multifd_send_sync_main_signal(uint8_t id) "channel %u"
 multifd_send_sync_main_wait(uint8_t id) "channel %u"
 multifd_send_terminate_threads(bool error) "error %d"
-multifd_send_thread_end(uint8_t id, uint64_t packets, uint64_t normal_pages) "channel %u packets %" PRIu64 " normal pages %"  PRIu64
+multifd_send_thread_end(uint8_t id, uint64_t packets, uint64_t normal_pages, uint64_t zero_pages) "channel %u packets %" PRIu64 " normal pages %"  PRIu64 " zero pages %"  PRIu64
 multifd_send_thread_start(uint8_t id) "%u"
 multifd_tls_outgoing_handshake_start(void *ioc, void *tioc, const char *hostname) "ioc=%p tioc=%p hostname=%s"
 multifd_tls_outgoing_handshake_error(void *ioc, const char *err) "ioc=%p err=%s"
-- 
2.35.1
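
As an aside for readers following the series: the key receive-side invariant this
patch introduces is that normal_pages + zero_pages can never exceed pages_alloc,
the number of offset slots the packet was sized for.  Below is a minimal
standalone sketch of that check (not QEMU code; only the three field names mirror
MultiFDPacket_t, everything else is a hypothetical stand-in):

#include <stdint.h>
#include <stdio.h>

/* The three MultiFDPacket_t counters relevant to the check. */
struct packet_hdr {
    uint32_t pages_alloc;   /* offset slots the sender allocated      */
    uint32_t normal_pages;  /* offsets that carry real page data      */
    uint32_t zero_pages;    /* offsets that describe known-zero pages */
};

/* Returns 0 if the counters are consistent, -1 otherwise. */
static int check_packet(const struct packet_hdr *p)
{
    if (p->normal_pages > p->pages_alloc) {
        fprintf(stderr, "too many normal pages: %u > %u\n",
                p->normal_pages, p->pages_alloc);
        return -1;
    }
    if (p->zero_pages > p->pages_alloc - p->normal_pages) {
        fprintf(stderr, "too many zero pages: %u > %u\n",
                p->zero_pages, p->pages_alloc - p->normal_pages);
        return -1;
    }
    return 0;
}

int main(void)
{
    struct packet_hdr good = { .pages_alloc = 128, .normal_pages = 100, .zero_pages = 28 };
    struct packet_hdr bad  = { .pages_alloc = 128, .normal_pages = 100, .zero_pages = 64 };

    printf("good packet -> %d\n", check_packet(&good));
    printf("bad packet  -> %d\n", check_packet(&bad));
    return 0;
}

The real check lives in multifd_recv_unfill_packet() above, right after
zero_pages is byte-swapped from the wire; the equivalent bound on normal pages
already exists in that function.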



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 12/13] multifd: Zero pages transmission
  2022-05-10 22:42 [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads Juan Quintela
                   ` (10 preceding siblings ...)
  2022-05-10 22:42 ` [PATCH v6 11/13] multifd: Support for zero pages transmission Juan Quintela
@ 2022-05-10 22:42 ` Juan Quintela
  2022-05-10 22:42 ` [PATCH v6 13/13] migration: Use multifd before we check for the zero page Juan Quintela
  2022-05-12 13:40 ` [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads Dr. David Alan Gilbert
  13 siblings, 0 replies; 20+ messages in thread
From: Juan Quintela @ 2022-05-10 22:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Dr. David Alan Gilbert, Eduardo Habkost, Peter Xu,
	Philippe Mathieu-Daudé,
	Yanan Wang, Leonardo Bras, Marcel Apfelbaum

This implements the zero page detection and handling.

Signed-off-by: Juan Quintela <quintela@redhat.com>

---

Add comment for offset (dave)
Use local variables for offset/block to have shorter lines
---
 migration/multifd.h |  5 +++++
 migration/multifd.c | 41 +++++++++++++++++++++++++++++++++++++++--
 2 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/migration/multifd.h b/migration/multifd.h
index 3bb33eb4af..b885390116 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -52,6 +52,11 @@ typedef struct {
     uint32_t unused32[1];    /* Reserved for future use */
     uint64_t unused64[3];    /* Reserved for future use */
     char ramblock[256];
+    /*
+     * This array contains the pointers to:
+     *  - normal pages (initial normal_pages entries)
+     *  - zero pages (following zero_pages entries)
+     */
     uint64_t offset[];
 } __attribute__((packed)) MultiFDPacket_t;
 
diff --git a/migration/multifd.c b/migration/multifd.c
index 15a16686db..aa012634cf 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -11,6 +11,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/cutils.h"
 #include "qemu/rcu.h"
 #include "exec/target_page.h"
 #include "sysemu/sysemu.h"
@@ -275,6 +276,12 @@ static void multifd_send_fill_packet(MultiFDSendParams *p)
 
         packet->offset[i] = cpu_to_be64(temp);
     }
+    for (i = 0; i < p->zero_num; i++) {
+        /* there are architectures where ram_addr_t is 32 bit */
+        uint64_t temp = p->zero[i];
+
+        packet->offset[p->normal_num + i] = cpu_to_be64(temp);
+    }
 }
 
 static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
@@ -358,6 +365,18 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
         p->normal[i] = offset;
     }
 
+    for (i = 0; i < p->zero_num; i++) {
+        uint64_t offset = be64_to_cpu(packet->offset[p->normal_num + i]);
+
+        if (offset > (block->used_length - p->page_size)) {
+            error_setg(errp, "multifd: offset too long %" PRIu64
+                       " (max " RAM_ADDR_FMT ")",
+                       offset, block->used_length);
+            return -1;
+        }
+        p->zero[i] = offset;
+    }
+
     return 0;
 }
 
@@ -618,6 +637,8 @@ static void *multifd_send_thread(void *opaque)
 {
     MultiFDSendParams *p = opaque;
     Error *local_err = NULL;
+    /* qemu older than 7.0 doesn't understand zero pages on multifd channel */
+    bool use_zero_page = migrate_use_multifd_zero_page();
     int ret = 0;
 
     trace_multifd_send_thread_start(p->id);
@@ -639,6 +660,7 @@ static void *multifd_send_thread(void *opaque)
         qemu_mutex_lock(&p->mutex);
 
         if (p->pending_job) {
+            RAMBlock *rb = p->pages->block;
             uint64_t packet_num = p->packet_num;
             p->flags = 0;
             if (p->sync_needed) {
@@ -652,8 +674,16 @@ static void *multifd_send_thread(void *opaque)
             p->zero_num = 0;
 
             for (int i = 0; i < p->pages->num; i++) {
-                p->normal[p->normal_num] = p->pages->offset[i];
-                p->normal_num++;
+                uint64_t offset = p->pages->offset[i];
+                if (use_zero_page &&
+                    buffer_is_zero(rb->host + offset, p->page_size)) {
+                    p->zero[p->zero_num] = offset;
+                    p->zero_num++;
+                    ram_release_page(rb->idstr, offset);
+                } else {
+                    p->normal[p->normal_num] = offset;
+                    p->normal_num++;
+                }
             }
 
             if (p->normal_num) {
@@ -1104,6 +1134,13 @@ static void *multifd_recv_thread(void *opaque)
             }
         }
 
+        for (int i = 0; i < p->zero_num; i++) {
+            void *page = p->host + p->zero[i];
+            if (!buffer_is_zero(page, p->page_size)) {
+                memset(page, 0, p->page_size);
+            }
+        }
+
         if (sync_needed) {
             qemu_sem_post(&multifd_recv_state->sem_sync);
             qemu_sem_wait(&p->sem_sync);
-- 
2.35.1
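
The heart of the patch is the classification loop in multifd_send_thread():
every queued page is tested once and its offset routed to either normal[] or
zero[].  A standalone sketch of that idea follows (not QEMU code; page_is_zero()
is a naive stand-in for QEMU's vectorised buffer_is_zero(), and the page/array
sizes are arbitrary):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE 4096
#define NPAGES    4

/* Naive stand-in for buffer_is_zero(). */
static bool page_is_zero(const uint8_t *page, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (page[i]) {
            return false;
        }
    }
    return true;
}

int main(void)
{
    static uint8_t ram[NPAGES][PAGE_SIZE];   /* zero-initialised "guest RAM" */
    uint64_t normal[NPAGES], zero[NPAGES];
    uint32_t normal_num = 0, zero_num = 0;

    ram[1][100] = 0xab;                      /* dirty a byte in page 1 */
    ram[3][0]   = 0x01;                      /* and one in page 3      */

    for (uint32_t i = 0; i < NPAGES; i++) {
        uint64_t offset = (uint64_t)i * PAGE_SIZE;
        if (page_is_zero(ram[i], PAGE_SIZE)) {
            zero[zero_num++] = offset;       /* only the offset goes on the wire */
        } else {
            normal[normal_num++] = offset;   /* page data is added to the iovec  */
        }
    }

    printf("normal pages: %u, zero pages: %u\n", normal_num, zero_num);
    printf("first zero offset: 0x%llx\n", (unsigned long long)zero[0]);
    return 0;
}

On the receive side the patch takes the complementary shortcut visible in the
last hunk: a zero page is only written with memset() when it is not already
zero, so pages the destination has never touched are left alone.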



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 13/13] migration: Use multifd before we check for the zero page
  2022-05-10 22:42 [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads Juan Quintela
                   ` (11 preceding siblings ...)
  2022-05-10 22:42 ` [PATCH v6 12/13] multifd: Zero " Juan Quintela
@ 2022-05-10 22:42 ` Juan Quintela
  2022-05-12 13:40 ` [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads Dr. David Alan Gilbert
  13 siblings, 0 replies; 20+ messages in thread
From: Juan Quintela @ 2022-05-10 22:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Dr. David Alan Gilbert, Eduardo Habkost, Peter Xu,
	Philippe Mathieu-Daudé,
	Yanan Wang, Leonardo Bras, Marcel Apfelbaum

So we use multifd to transmit zero pages.

Signed-off-by: Juan Quintela <quintela@redhat.com>

---

- Check zero_page property before using new code (Dave)
---
 migration/ram.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/migration/ram.c b/migration/ram.c
index 0a91b87bd2..64e45ba915 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2181,6 +2181,32 @@ static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss)
     return ram_save_page(rs, pss);
 }
 
+/**
+ * ram_save_target_page_multifd: save one target page
+ *
+ * Returns the number of pages written
+ *
+ * @rs: current RAM state
+ * @pss: data about the page we want to send
+ */
+static int ram_save_target_page_multifd(RAMState *rs, PageSearchStatus *pss)
+{
+    RAMBlock *block = pss->block;
+    ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS;
+    int res;
+
+    if (!migration_in_postcopy()) {
+        return ram_save_multifd_page(rs, block, offset);
+    }
+
+    res = save_zero_page(rs, block, offset);
+    if (res > 0) {
+        return res;
+    }
+
+    return ram_save_page(rs, pss);
+}
+
 /**
  * ram_save_host_page: save a whole host page
  *
@@ -2944,7 +2970,11 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
     ram_control_before_iterate(f, RAM_CONTROL_SETUP);
     ram_control_after_iterate(f, RAM_CONTROL_SETUP);
 
-    (*rsp)->ram_save_target_page = ram_save_target_page_legacy;
+    if (migrate_use_multifd() && migrate_use_multifd_zero_page()) {
+        (*rsp)->ram_save_target_page = ram_save_target_page_multifd;
+    } else {
+        (*rsp)->ram_save_target_page = ram_save_target_page_legacy;
+    }
     multifd_send_sync_main(f);
     qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
     qemu_fflush(f);
-- 
2.35.1
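
The structural point here is that the page-saving strategy is chosen once at
setup time, so the hot loop never tests the capabilities per page.  A minimal
sketch of that dispatch pattern (not QEMU code; the two booleans stand in for
migrate_use_multifd() and migrate_use_multifd_zero_page(), and the function
bodies are placeholders):

#include <stdbool.h>
#include <stdio.h>

typedef int (*save_target_page_fn)(void *rs, void *pss);

static int save_target_page_legacy(void *rs, void *pss)
{
    (void)rs; (void)pss;
    /* old path: zero-detect on the main migration thread, then send */
    return printf("legacy path\n");
}

static int save_target_page_multifd(void *rs, void *pss)
{
    (void)rs; (void)pss;
    /* new path: just queue the page; the channels detect zero pages */
    return printf("multifd path\n");
}

int main(void)
{
    bool use_multifd = true;   /* stand-in for migrate_use_multifd()           */
    bool zero_page   = true;   /* stand-in for migrate_use_multifd_zero_page() */

    save_target_page_fn hook = (use_multifd && zero_page)
                             ? save_target_page_multifd
                             : save_target_page_legacy;

    hook(NULL, NULL);          /* the hot loop only ever calls the hook */
    return 0;
}

With the multifd hook installed, zero-page detection leaves the main migration
thread entirely; that is what the perf profiles later in this thread show, with
buffer_zero_avx512 moving from the live_migration thread into the
multifdsend_* threads.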



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads
  2022-05-10 22:42 [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads Juan Quintela
                   ` (12 preceding siblings ...)
  2022-05-10 22:42 ` [PATCH v6 13/13] migration: Use multifd before we check for the zero page Juan Quintela
@ 2022-05-12 13:40 ` Dr. David Alan Gilbert
  2022-05-16 10:45   ` Juan Quintela
  13 siblings, 1 reply; 20+ messages in thread
From: Dr. David Alan Gilbert @ 2022-05-12 13:40 UTC (permalink / raw)
  To: Juan Quintela
  Cc: qemu-devel, Eduardo Habkost, Peter Xu,
	Philippe Mathieu-Daudé,
	Yanan Wang, Leonardo Bras, Marcel Apfelbaum

* Juan Quintela (quintela@redhat.com) wrote:
> In this version:
> - document what protects each field in MultiFDRecv/SendParams
> - calcule page_size once when we start the migration, and store it in
>   a field
> - Same for page_count.
> - rebase to latest
> - minor improvements here and there
> - test on huge memory machines
> 
> Command line for all the tests:
> 
> gdb -q --ex "run" --args $QEMU \
> 	-name guest=$NAME,debug-threads=on \
> 	-m 16G \
> 	-smp 6 \
> 	-machine q35,accel=kvm,usb=off,dump-guest-core=off \
> 	-boot strict=on \
> 	-cpu host \
> 	-no-hpet \
> 	-rtc base=utc,driftfix=slew \
> 	-global kvm-pit.lost_tick_policy=delay \
> 	-global ICH9-LPC.disable_s3=1 \
> 	-global ICH9-LPC.disable_s4=1 \
> 	-device pcie-root-port,id=root.1,chassis=1,addr=0x2.0,multifunction=on \
> 	-device pcie-root-port,id=root.2,chassis=2,addr=0x2.1 \
> 	-device pcie-root-port,id=root.3,chassis=3,addr=0x2.2 \
> 	-device pcie-root-port,id=root.4,chassis=4,addr=0x2.3 \
> 	-device pcie-root-port,id=root.5,chassis=5,addr=0x2.4 \
> 	-device pcie-root-port,id=root.6,chassis=6,addr=0x2.5 \
> 	-device pcie-root-port,id=root.7,chassis=7,addr=0x2.6 \
> 	-device pcie-root-port,id=root.8,chassis=8,addr=0x2.7 \
> 	-blockdev driver=file,node-name=storage0,filename=$FILE,auto-read-only=true,discard=unmap \
> 	-blockdev driver=qcow2,node-name=format0,read-only=false,file=storage0 \
> 	-device virtio-blk-pci,id=virtio-disk0,drive=format0,bootindex=1,bus=root.1 \
> 	-netdev tap,id=hostnet0,vhost=on,script=/etc/kvm-ifup,downscript=/etc/kvm-ifdown \
> 	-device virtio-net-pci,id=net0,netdev=hostnet0,mac=$MAC,bus=root.2 \
> 	-device virtio-serial-pci,id=virtio-serial0,bus=root.3 \
> 	-device virtio-balloon-pci,id=balloon0,bus=root.4 \
> 	$GRAPHICS \
> 	$CONSOLE \
> 	-device virtconsole,id=console0,chardev=charconsole0 \
> 	-uuid 9d3be7da-e1ff-41a0-ac39-8b2e04de2c19 \
> 	-nodefaults \
> 	-msg timestamp=on \
> 	-no-user-config \
> 	$MONITOR \
> 	$TRACE \
> 	-global migration.x-multifd=on \
> 	-global migration.multifd-channels=16 \
> 	-global migration.x-max-bandwidth=$BANDWIDTH
> 
> Tests have been done in a single machine over localhost.  I didn't have 2 machines with 4TB of RAM for testing.
> 
> Tests done on a 12TB RAM machine.  Guests where running with 16GB, 1TB and 4TB RAM
> 
> tests run with:
> - upstream multifd
> - multifd + zero page
> - precopy (only some of them)
> 
> tests done:
> - idle clean guest (just booted guest)
> - idle dirty guest (run a program to dirty all memory)
> - test with stress (4 threads each dirtying 1GB RAM)
> 
> Executive summary
> 
> 16GB guest
>                 Precopy            upstream          zero page
>                 Time    Downtime   Time    Downtime  Time    Downtime
> clean idle      1548     93         1359   48         866    167
                                           866/1359 = 64%
> dirty idle     16222    220         2092   371       1870    258
                                           1870/2092 = 89%
> busy 4GB       don't converge      31000   308       1604    371
> 
> In the dirty idle, there is some weirdness in the precopy case, I
> tried several times and it always took too much time.  It should be
> faster.
> 
> In the busy 4GB case, precopy don't converge (expected) and without
> zero page, multifd is on the limit, it _almost_ don't convrge, it took
> 187 iterations to converge.
> 
> 1TB
>                 Precopy            upstream          zero page
>                 Time    Downtime   Time    Downtime  Time    Downtime
> clean idle     83174    381        72075   345       52966   273
                                          52966/72075=74%
> dirty idle                        104587   381       75601   269
                                          75601/104587=72%
> busy 2GB                           79912   345       58953   348
> 
> I only tried precopy in the clean idle case with 1TB.  Notice that it is already
> significantly slower.  With 1TB RAM, zero page is clearly superior in all tests.
> 
> 4TB
>                 upstream          zero page
>                 Time    Downtime  Time    Downtime
> clean idle      317054  552       215567  500
                215567/317054 = 68%
> dirty idle      357581  553       317428  744
                317428/357581 = 89%

The 1TB dirty/idle is a bit of an unusual outlier at 72% time; but the
89% on the 16GB/4TB dirty case is still a useful improvement - I wasn't
expecting the dirty case to be as good - I wonder if there's some side
benefit, like the page only being read by the data threads and not
also by the main thread, so it ends up in only one cache?

(the 10% improvement on the dirty case is more important to me than the
more impressive number for the clean case)

Dave

> The busy case here is similar to the 1TB guests, just takes much more time.
> 
> In conclusion, zero page detection on the migration threads ranges from a
> bit faster to much faster than anything else.
> 
> I add here the output of info migrate and perf for all the migration
> rounds.  The important bit I found is that once we introduce
> zero pages, migration spends all its time copying pages, which is where
> it needs to be, not waiting for buffer_zero or similar.
> 
> Upstream
> --------
> 
> 16GB test
> 
> idle
> 
> precopy
> 
> Migration status: completed
> total time: 1548 ms
> downtime: 93 ms
> setup: 16 ms
> transferred ram: 624798 kbytes
> throughput: 3343.01 mbps
> remaining ram: 0 kbytes
> total ram: 16777992 kbytes
> duplicate: 4048839 pages
> skipped: 0 pages
> normal: 147016 pages
> normal bytes: 588064 kbytes
> dirty sync count: 3
> page size: 4 kbytes
> multifd bytes: 0 kbytes
> pages-per-second: 651825
> precopy ram: 498490 kbytes
> downtime ram: 126307 kbytes
> 
>   41.76%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
>   14.68%  live_migration   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    9.53%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    5.72%  live_migration   qemu-system-x86_64       [.] add_to_iovec
>    3.89%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    2.50%  live_migration   qemu-system-x86_64       [.] qemu_put_byte.part.0
>    2.45%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    1.87%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
>    1.28%  live_migration   qemu-system-x86_64       [.] qemu_put_be32
>    1.03%  live_migration   qemu-system-x86_64       [.] find_next_bit
>    0.95%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
>    0.95%  live_migration   qemu-system-x86_64       [.] ram_save_iterate
>    0.68%  live_migration   [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.67%  live_migration   qemu-system-x86_64       [.] kvm_log_clear
>    0.56%  live_migration   qemu-system-x86_64       [.] save_zero_page_to_file.part.0
>    0.51%  live_migration   qemu-system-x86_64       [.] qemu_put_byte
>    0.43%  live_migration   [kernel.kallsyms]        [k] copy_page
>    0.38%  live_migration   qemu-system-x86_64       [.] get_ptr_rcu_reader
>    0.36%  live_migration   qemu-system-x86_64       [.] save_page_header
>    0.33%  live_migration   [kernel.kallsyms]        [k] __memcg_kmem_charge_page
>    0.33%  live_migration   qemu-system-x86_64       [.] runstate_is_running
> 
> upstream
> 
> Migration status: completed
> total time: 1359 ms
> downtime: 48 ms
> setup: 35 ms
> transferred ram: 603701 kbytes
> throughput: 3737.66 mbps
> remaining ram: 0 kbytes
> total ram: 16777992 kbytes
> duplicate: 4053362 pages
> skipped: 0 pages
> normal: 141517 pages
> normal bytes: 566068 kbytes
> dirty sync count: 3
> page size: 4 kbytes
> multifd bytes: 568076 kbytes
> pages-per-second: 2039403
> precopy ram: 35624 kbytes
> downtime ram: 1 kbytes
> 
>   36.03%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
>    9.32%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    5.18%  live_migration   qemu-system-x86_64       [.] add_to_iovec
>    4.15%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    2.60%  live_migration   [kernel.kallsyms]        [k] copy_page
>    2.30%  live_migration   qemu-system-x86_64       [.] qemu_put_byte.part.0
>    2.24%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    1.96%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
>    1.30%  live_migration   qemu-system-x86_64       [.] qemu_put_be32
>    1.12%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    1.00%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.94%  live_migration   qemu-system-x86_64       [.] find_next_bit
>    0.93%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
>    0.91%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.88%  live_migration   qemu-system-x86_64       [.] ram_save_iterate
>    0.88%  live_migration   libc.so.6                [.] __pthread_mutex_unlock_usercnt
>    0.81%  live_migration   qemu-system-x86_64       [.] kvm_log_clear
>    0.81%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.79%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.75%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.72%  live_migration   libc.so.6                [.] __pthread_mutex_lock
>    0.70%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.70%  live_migration   qemu-system-x86_64       [.] save_zero_page_to_file.part.0
>    0.70%  qemu-system-x86  [kernel.kallsyms]        [k] perf_event_alloc
>    0.69%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.68%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.67%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.66%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.64%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.63%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.63%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.60%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.53%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.47%  live_migration   qemu-system-x86_64       [.] qemu_put_byte
> 
> zero page
> 
> Migration status: completed
> total time: 866 ms
> downtime: 167 ms
> setup: 42 ms
> transferred ram: 14627983 kbytes
> throughput: 145431.53 mbps
> remaining ram: 0 kbytes
> total ram: 16777992 kbytes
> duplicate: 4024050 pages
> skipped: 0 pages
> normal: 143374 pages
> normal bytes: 573496 kbytes
> dirty sync count: 3
> page size: 4 kbytes
> multifd bytes: 14627983 kbytes
> pages-per-second: 4786693
> precopy ram: 11033626 kbytes
> downtime ram: 3594356 kbytes
> 
>    6.84%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    4.06%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
>    3.46%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    2.39%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    1.59%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    1.50%  multifdsend_3    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.48%  multifdsend_10   qemu-system-x86_64       [.] buffer_zero_avx512
>    1.32%  multifdsend_12   qemu-system-x86_64       [.] buffer_zero_avx512
>    1.29%  multifdsend_1    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.25%  live_migration   qemu-system-x86_64       [.] find_next_bit
>    1.24%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    1.20%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    1.20%  multifdsend_13   qemu-system-x86_64       [.] buffer_zero_avx512
>    1.18%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
>    1.16%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    1.13%  live_migration   qemu-system-x86_64       [.] multifd_queue_page
>    1.08%  multifdsend_0    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.06%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.94%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.92%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.91%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.90%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
> 
> 16GB guest
> 
> dirty
> 
> precopy
> 
> Migration status: completed
> total time: 16222 ms
> downtime: 220 ms
> setup: 18 ms
> transferred ram: 15927448 kbytes
> throughput: 8052.38 mbps
> remaining ram: 0 kbytes
> total ram: 16777992 kbytes
> duplicate: 222804 pages
> skipped: 0 pages
> normal: 3973611 pages
> normal bytes: 15894444 kbytes
> dirty sync count: 3
> page size: 4 kbytes
> multifd bytes: 0 kbytes
> pages-per-second: 241728
> precopy ram: 15670253 kbytes
> downtime ram: 257194 kbytes
> 
>   38.22%  live_migration   [kernel.kallsyms]        [k] copy_page
>   38.04%  live_migration   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.55%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
>    2.45%  live_migration   [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    1.43%  live_migration   [kernel.kallsyms]        [k] free_pcp_prepare
>    1.01%  live_migration   [kernel.kallsyms]        [k] _copy_from_iter
>    0.79%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    0.79%  live_migration   [kernel.kallsyms]        [k] __list_del_entry_valid
>    0.68%  live_migration   [kernel.kallsyms]        [k] check_new_pages
>    0.64%  live_migration   qemu-system-x86_64       [.] add_to_iovec
>    0.49%  live_migration   [kernel.kallsyms]        [k] skb_release_data
>    0.39%  live_migration   [kernel.kallsyms]        [k] __skb_clone
>    0.36%  live_migration   [kernel.kallsyms]        [k] total_mapcount
>    0.34%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    0.32%  live_migration   [kernel.kallsyms]        [k] __dev_queue_xmit
>    0.29%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
>    0.29%  live_migration   [kernel.kallsyms]        [k] __alloc_skb
>    0.27%  live_migration   [kernel.kallsyms]        [k] __ip_queue_xmit
>    0.26%  live_migration   [kernel.kallsyms]        [k] copy_user_generic_unrolled
>    0.26%  live_migration   [kernel.kallsyms]        [k] __tcp_transmit_skb
>    0.24%  live_migration   qemu-system-x86_64       [.] qemu_put_byte.part.0
>    0.24%  live_migration   [kernel.kallsyms]        [k] skb_page_frag_refill
> 
> upstream
> 
> Migration status: completed
> total time: 2092 ms
> downtime: 371 ms
> setup: 39 ms
> transferred ram: 15929157 kbytes
> throughput: 63562.98 mbps
> remaining ram: 0 kbytes
> total ram: 16777992 kbytes
> duplicate: 224436 pages
> skipped: 0 pages
> normal: 3971430 pages
> normal bytes: 15885720 kbytes
> dirty sync count: 3
> page size: 4 kbytes
> multifd bytes: 15927184 kbytes
> pages-per-second: 2441771
> precopy ram: 1798 kbytes
> downtime ram: 174 kbytes
> 
>   5.23%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
>    4.93%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.92%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.84%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.56%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.55%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.53%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.48%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.43%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.43%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.33%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.21%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.19%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.13%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.01%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.86%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.83%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.90%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    0.70%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    0.69%  live_migration   libc.so.6                [.] __pthread_mutex_unlock_usercnt
>    0.62%  live_migration   libc.so.6                [.] __pthread_mutex_lock
>    0.37%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    0.29%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
>    0.27%  live_migration   qemu-system-x86_64       [.] multifd_send_pages
> 
> zero page
> 
> Migration status: completed
> total time: 1870 ms
> downtime: 258 ms
> setup: 36 ms
> transferred ram: 16998097 kbytes
> throughput: 75927.79 mbps
> remaining ram: 0 kbytes
> total ram: 16777992 kbytes
> duplicate: 222485 pages
> skipped: 0 pages
> normal: 3915115 pages
> normal bytes: 15660460 kbytes
> dirty sync count: 3
> page size: 4 kbytes
> multifd bytes: 16998097 kbytes
> pages-per-second: 2555169
> precopy ram: 13929973 kbytes
> downtime ram: 3068124 kbytes
> 
>    4.66%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.60%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.49%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.39%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.36%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.21%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.20%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.18%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.17%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.07%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.97%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.96%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.89%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.73%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.68%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.44%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.52%  live_migration   libc.so.6                [.] __pthread_mutex_unlock_usercnt
>    2.09%  live_migration   libc.so.6                [.] __pthread_mutex_lock
>    1.03%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    0.97%  multifdsend_3    [kernel.kallsyms]        [k] copy_page
>    0.94%  live_migration   qemu-system-x86_64       [.] multifd_send_pages
>    0.79%  live_migration   qemu-system-x86_64       [.] qemu_mutex_lock_impl
>    0.73%  multifdsend_11   [kernel.kallsyms]        [k] copy_page
>    0.70%  live_migration   qemu-system-x86_64       [.] qemu_mutex_unlock_impl
>    0.45%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    0.41%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
> 
> 16GB guest
> 
> stress --vm 4 --vm-bytes 1G --vm-keep
> 
> precopy
> 
> Don't converge
> 
> upstream
> 
> Migration status: completed
> total time: 31800 ms
> downtime: 308 ms
> setup: 40 ms
> transferred ram: 295540640 kbytes
> throughput: 76230.23 mbps
> remaining ram: 0 kbytes
> total ram: 16777992 kbytes
> duplicate: 3006674 pages
> skipped: 0 pages
> normal: 73686367 pages
> normal bytes: 294745468 kbytes
> dirty sync count: 187
> page size: 4 kbytes
> multifd bytes: 295514209 kbytes
> pages-per-second: 2118000
> precopy ram: 26430 kbytes
> 
>   7.79%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
>    3.86%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.83%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.79%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.72%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.46%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.44%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.38%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.32%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.31%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.22%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.21%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.19%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.07%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.95%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.95%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.77%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    1.78%  live_migration   [kernel.kallsyms]        [k] kvm_set_pfn_dirty
>    1.65%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    0.68%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    0.62%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    0.46%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
>    0.41%  live_migration   [kernel.kallsyms]        [k] __handle_changed_spte
>    0.40%  live_migration   [kernel.kallsyms]        [k] pfn_valid.part.0
>    0.37%  live_migration   qemu-system-x86_64       [.] kvm_log_clear
>    0.29%  CPU 2/KVM        [kernel.kallsyms]        [k] copy_page
>    0.27%  live_migration   [kernel.kallsyms]        [k] clear_dirty_pt_masked
>    0.27%  CPU 1/KVM        [kernel.kallsyms]        [k] copy_page
>    0.26%  live_migration   [kernel.kallsyms]        [k] tdp_iter_next
>    0.25%  CPU 1/KVM        [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
>    0.24%  CPU 1/KVM        [kernel.kallsyms]        [k] mark_page_dirty_in_slot.part.0
>    0.24%  CPU 2/KVM        [kernel.kallsyms]        [k] mark_page_dirty_in_slot.part.0
> 
> Zero page
> 
> Migration status: completed
> total time: 1604 ms
> downtime: 371 ms
> setup: 32 ms
> transferred ram: 20591268 kbytes
> throughput: 107307.14 mbps
> remaining ram: 0 kbytes
> total ram: 16777992 kbytes
> duplicate: 2984825 pages
> skipped: 0 pages
> normal: 2213496 pages
> normal bytes: 8853984 kbytes
> dirty sync count: 4
> page size: 4 kbytes
> multifd bytes: 20591268 kbytes
> pages-per-second: 4659200
> precopy ram: 15722803 kbytes
> downtime ram: 4868465 kbytes
> 
>    3.21%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.92%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.86%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.81%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.80%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.79%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.78%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.73%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.73%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.69%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.62%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.60%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.59%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.58%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.55%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.38%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    1.44%  live_migration   libc.so.6                [.] __pthread_mutex_lock
>    1.41%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    1.37%  live_migration   libc.so.6                [.] __pthread_mutex_unlock_usercnt
>    0.80%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
>    0.78%  CPU 4/KVM        [kernel.kallsyms]        [k] _raw_read_lock
>    0.78%  CPU 2/KVM        [kernel.kallsyms]        [k] _raw_read_lock
>    0.77%  CPU 4/KVM        [kernel.kallsyms]        [k] tdp_mmu_map_handle_target_level
>    0.77%  CPU 2/KVM        [kernel.kallsyms]        [k] tdp_mmu_map_handle_target_level
>    0.76%  CPU 5/KVM        [kernel.kallsyms]        [k] tdp_mmu_map_handle_target_level
>    0.75%  live_migration   qemu-system-x86_64       [.] multifd_send_pages
>    0.74%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    0.73%  CPU 5/KVM        [kernel.kallsyms]        [k] _raw_read_lock
>    0.67%  CPU 0/KVM        [kernel.kallsyms]        [k] copy_page
>    0.62%  CPU 0/KVM        [kernel.kallsyms]        [k] tdp_mmu_map_handle_target_level
>    0.62%  live_migration   qemu-system-x86_64       [.] qemu_mutex_lock_impl
>    0.61%  CPU 0/KVM        [kernel.kallsyms]        [k] _raw_read_lock
>    0.60%  CPU 2/KVM        [kernel.kallsyms]        [k] mark_page_dirty_in_slot.part.0
>    0.58%  CPU 5/KVM        [kernel.kallsyms]        [k] mark_page_dirty_in_slot.part.0
>    0.54%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    0.53%  CPU 4/KVM        [kernel.kallsyms]        [k] mark_page_dirty_in_slot.part.0
>    0.52%  CPU 0/KVM        [kernel.kallsyms]        [k] mark_page_dirty_in_slot.part.0
>    0.49%  live_migration   [kernel.kallsyms]        [k] kvm_set_pfn_dirty
> 
> 1TB guest
> 
> precopy
> 
> Migration status: completed
> total time: 83147 ms
> downtime: 381 ms
> setup: 265 ms
> transferred ram: 19565544 kbytes
> throughput: 1933.88 mbps
> remaining ram: 0 kbytes
> total ram: 1073742600 kbytes
> duplicate: 264135334 pages
> skipped: 0 pages
> normal: 4302604 pages
> normal bytes: 17210416 kbytes
> dirty sync count: 3
> page size: 4 kbytes
> multifd bytes: 0 kbytes
> pages-per-second: 412882
> precopy ram: 19085615 kbytes
> downtime ram: 479929 kbytes
> 
>   43.50%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
>   11.27%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    8.33%  live_migration   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    7.47%  live_migration   qemu-system-x86_64       [.] add_to_iovec
>    4.41%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    3.42%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
>    3.06%  live_migration   qemu-system-x86_64       [.] qemu_put_byte.part.0
>    2.62%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    1.78%  live_migration   qemu-system-x86_64       [.] qemu_put_be32
>    1.43%  live_migration   qemu-system-x86_64       [.] find_next_bit
>    1.13%  live_migration   qemu-system-x86_64       [.] ram_save_iterate
>    1.12%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
>    0.70%  live_migration   qemu-system-x86_64       [.] save_zero_page_to_file.part.0
>    0.51%  live_migration   qemu-system-x86_64       [.] qemu_put_byte
>    0.49%  live_migration   qemu-system-x86_64       [.] save_page_header
>    0.48%  live_migration   qemu-system-x86_64       [.] qemu_put_be64
>    0.40%  live_migration   qemu-system-x86_64       [.] migrate_postcopy_ram
>    0.40%  live_migration   qemu-system-x86_64       [.] runstate_is_running
>    0.35%  live_migration   [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.32%  live_migration   qemu-system-x86_64       [.] get_ptr_rcu_reader
>    0.30%  live_migration   qemu-system-x86_64       [.] qemu_file_rate_limit
>    0.30%  live_migration   qemu-system-x86_64       [.] migrate_use_xbzrle
>    0.27%  live_migration   [kernel.kallsyms]        [k] __memcg_kmem_charge_page
>    0.26%  live_migration   qemu-system-x86_64       [.] migrate_use_compression
>    0.25%  live_migration   qemu-system-x86_64       [.] kvm_log_clear
>    0.25%  live_migration   qemu-system-x86_64       [.] qemu_file_get_error
> 
> upstream
> 
> Migration status: completed
> total time: 72075 ms
> downtime: 345 ms
> setup: 287 ms
> transferred ram: 19601046 kbytes
> throughput: 2236.79 mbps
> remaining ram: 0 kbytes
> total ram: 1073742600 kbytes
> duplicate: 264134669 pages
> normal: 4301611 pages
> normal bytes: 17206444 kbytes
> dirty sync count: 3
> page size: 4 kbytes
> multifd bytes: 17279539 kbytes
> pages-per-second: 2458584
> precopy ram: 2321505 kbytes
> downtime ram: 1 kbytes
> 
>  39.09%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
>   10.85%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    6.92%  live_migration   qemu-system-x86_64       [.] add_to_iovec
>    4.41%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    2.87%  live_migration   qemu-system-x86_64       [.] qemu_put_byte.part.0
>    2.63%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
>    2.54%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    1.70%  live_migration   qemu-system-x86_64       [.] qemu_put_be32
>    1.31%  live_migration   qemu-system-x86_64       [.] find_next_bit
>    1.11%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
>    1.05%  live_migration   qemu-system-x86_64       [.] ram_save_iterate
>    0.80%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.79%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.78%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.78%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.76%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.75%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.75%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.73%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.73%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.72%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.72%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.71%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.71%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.69%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.66%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.65%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.63%  live_migration   qemu-system-x86_64       [.] save_zero_page_to_file.part.0
>    0.53%  live_migration   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.48%  live_migration   qemu-system-x86_64       [.] qemu_put_byte
>    0.44%  live_migration   qemu-system-x86_64       [.] save_page_header
>    0.44%  live_migration   qemu-system-x86_64       [.] qemu_put_be64
>    0.39%  live_migration   qemu-system-x86_64       [.] migrate_postcopy_ram
>    0.36%  live_migration   qemu-system-x86_64       [.] runstate_is_running
>    0.33%  live_migration   qemu-system-x86_64       [.] get_ptr_rcu_reader
>    0.28%  live_migration   [kernel.kallsyms]        [k] __memcg_kmem_charge_page
>    0.27%  live_migration   qemu-system-x86_64       [.] migrate_use_compression
>    0.26%  live_migration   qemu-system-x86_64       [.] qemu_file_rate_limit
>    0.26%  live_migration   qemu-system-x86_64       [.] migrate_use_xbzrle
>    0.24%  live_migration   qemu-system-x86_64       [.] qemu_file_get_error
>    0.21%  live_migration   qemu-system-x86_64       [.] kvm_log_clear
>    0.21%  live_migration   qemu-system-x86_64       [.] ram_transferred_add
>    0.20%  live_migration   [kernel.kallsyms]        [k] try_charge_memcg
>    0.19%  live_migration   qemu-system-x86_64       [.] ram_control_save_page
>    0.18%  live_migration   qemu-system-x86_64       [.] buffer_is_zero
>    0.18%  live_migration   qemu-system-x86_64       [.] cpu_physical_memory_set_dirty_lebitmap
>    0.12%  live_migration   qemu-system-x86_64       [.] qemu_ram_pagesize
>    0.11%  live_migration   [kernel.kallsyms]        [k] sync_regs
>    0.11%  live_migration   libc.so.6                [.] __pthread_mutex_unlock_usercnt
>    0.11%  live_migration   [kernel.kallsyms]        [k] clear_page_erms
>    0.11%  live_migration   [kernel.kallsyms]        [k] kernel_init_free_pages.part.0
>    0.11%  live_migration   qemu-system-x86_64       [.] migrate_background_snapshot
>    0.10%  live_migration   qemu-system-x86_64       [.] migrate_release_ram
>    0.10%  live_migration   [kernel.kallsyms]        [k] pte_alloc_one
>    0.10%  live_migration   libc.so.6                [.] __pthread_mutex_lock
>    0.10%  live_migration   [kernel.kallsyms]        [k] native_irq_return_iret
>    0.08%  live_migration   [kernel.kallsyms]        [k] kvm_clear_dirty_log_protect
>    0.07%  qemu-system-x86  [kernel.kallsyms]        [k] free_pcp_prepare
>    0.06%  qemu-system-x86  [kernel.kallsyms]        [k] __free_pages
>    0.06%  live_migration   [kernel.kallsyms]        [k] tdp_iter_next
>    0.05%  live_migration   qemu-system-x86_64       [.] cpu_physical_memory_sync_dirty_bitmap.con
>    0.05%  live_migration   [kernel.kallsyms]        [k] __list_del_entry_valid
>    0.05%  live_migration   [kernel.kallsyms]        [k] _raw_spin_lock_irqsave
>    0.05%  multifdsend_2    [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
>    0.05%  multifdsend_11   [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
>    0.05%  live_migration   [vdso]                   [.] 0x00000000000006f5
>    0.05%  multifdsend_15   [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
>    0.04%  multifdsend_1    [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
>    0.04%  multifdsend_13   [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
>    0.04%  multifdsend_4    [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
>    0.04%  multifdsend_8    [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
>    0.04%  live_migration   qemu-system-x86_64       [.] multifd_send_pages
>    0.04%  multifdsend_0    [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
>    0.04%  multifdsend_9    [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
>    0.04%  multifdsend_14   [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
>    0.04%  live_migration   [kernel.kallsyms]        [k] kvm_arch_mmu_enable_log_dirty_pt_masked
>    0.04%  live_migration   [kernel.kallsyms]        [k] obj_cgroup_charge_pages
>    0.04%  multifdsend_7    [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
>    0.04%  multifdsend_12   [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
>    0.04%  multifdsend_5    [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
>    0.04%  multifdsend_10   [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath.part.0
>    0.04%  live_migration   [kernel.kallsyms]        [k] _raw_spin_lock
>    0.04%  live_migration   qemu-system-x86_64       [.] qemu_mutex_unlock_impl
> 
> 1TB idle, zero page
> 
> Migration status: completed
> total time: 52966 ms
> downtime: 409 ms
> setup: 273 ms
> transferred ram: 879229325 kbytes
> throughput: 136690.83 mbps
> remaining ram: 0 kbytes
> total ram: 1073742600 kbytes
> duplicate: 262093359 pages
> skipped: 0 pages
> normal: 4266123 pages
> normal bytes: 17064492 kbytes
> dirty sync count: 3
> page size: 4 kbytes
> multifd bytes: 879229317 kbytes
> pages-per-second: 4024470
> precopy ram: 874888589 kbytes
> downtime ram: 4340735 kbytes
> 
>   14.42%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    2.97%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    2.56%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
>    2.50%  live_migration   qemu-system-x86_64       [.] multifd_queue_page
>    2.30%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    1.17%  live_migration   qemu-system-x86_64       [.] find_next_bit
>    1.12%  multifdsend_14   qemu-system-x86_64       [.] buffer_zero_avx512
>    1.09%  live_migration   qemu-system-x86_64       [.] multifd_send_pages
>    1.08%  multifdsend_15   qemu-system-x86_64       [.] buffer_zero_avx512
>    1.07%  multifdsend_11   qemu-system-x86_64       [.] buffer_zero_avx512
>    1.03%  multifdsend_1    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.03%  multifdsend_0    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.03%  multifdsend_7    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.03%  multifdsend_4    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.02%  multifdsend_2    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.02%  multifdsend_10   qemu-system-x86_64       [.] buffer_zero_avx512
>    1.02%  multifdsend_9    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.02%  multifdsend_8    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.01%  multifdsend_6    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.00%  multifdsend_5    qemu-system-x86_64       [.] buffer_zero_avx512
>    0.99%  live_migration   libc.so.6                [.] __pthread_mutex_lock
>    0.98%  multifdsend_13   qemu-system-x86_64       [.] buffer_zero_avx512
>    0.98%  multifdsend_3    qemu-system-x86_64       [.] buffer_zero_avx512
>    0.93%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
>    0.93%  multifdsend_12   qemu-system-x86_64       [.] buffer_zero_avx512
>    0.89%  live_migration   [kernel.kallsyms]        [k] futex_wake
>    0.83%  live_migration   libc.so.6                [.] __pthread_mutex_unlock_usercnt
>    0.70%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.69%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
> 
> 1TB: stress (stress --vm 4 --vm-bytes 512M)
> 
> Wait until the load in the guest reaches 3 before doing the migration
> 
> upstream
> 
> Migration status: completed
> total time: 79912 ms
> downtime: 345 ms
> setup: 300 ms
> transferred ram: 23723877 kbytes
> throughput: 2441.21 mbps
> remaining ram: 0 kbytes
> total ram: 1073742600 kbytes
> duplicate: 263616778 pages
> normal: 5330059 pages
> normal bytes: 21320236 kbytes
> dirty sync count: 4
> page size: 4 kbytes
> multifd bytes: 21406921 kbytes
> pages-per-second: 2301580
> precopy ram: 2316947 kbytes
> downtime ram: 9 kbytes
> 
>   38.87%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
>    9.14%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    5.84%  live_migration   qemu-system-x86_64       [.] add_to_iovec
>    3.80%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    2.41%  live_migration   qemu-system-x86_64       [.] qemu_put_byte.part.0
>    2.14%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
>    2.10%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    1.44%  live_migration   qemu-system-x86_64       [.] qemu_put_be32
>    1.17%  live_migration   qemu-system-x86_64       [.] find_next_bit
>    0.95%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
>    0.91%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.89%  live_migration   qemu-system-x86_64       [.] ram_save_iterate
>    0.88%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.87%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.84%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.84%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.80%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.79%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.79%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.78%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.78%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.78%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.77%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.76%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.75%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.74%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.70%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.66%  live_migration   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.58%  live_migration   qemu-system-x86_64       [.] save_zero_page_to_file.part.0
>    0.45%  live_migration   qemu-system-x86_64       [.] kvm_log_clear
> 
> zero page
> 
> Migration status: completed
> total time: 58953 ms
> downtime: 373 ms
> setup: 348 ms
> transferred ram: 972143021 kbytes
> throughput: 135889.41 mbps
> remaining ram: 0 kbytes
> total ram: 1073742600 kbytes
> duplicate: 261357013 pages
> skipped: 0 pages
> normal: 5293916 pages
> normal bytes: 21175664 kbytes
> dirty sync count: 4
> page size: 4 kbytes
> multifd bytes: 972143012 kbytes
> pages-per-second: 3699692
> precopy ram: 968625243 kbytes
> downtime ram: 3517778 kbytes
> 
>  12.91%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    2.85%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    2.16%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    2.05%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
>    1.17%  live_migration   qemu-system-x86_64       [.] multifd_queue_page
>    1.13%  multifdsend_4    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.12%  multifdsend_1    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.08%  live_migration   qemu-system-x86_64       [.] find_next_bit
>    1.07%  multifdsend_14   qemu-system-x86_64       [.] buffer_zero_avx512
>    1.07%  multifdsend_15   qemu-system-x86_64       [.] buffer_zero_avx512
>    1.06%  multifdsend_2    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.06%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
>    1.06%  live_migration   qemu-system-x86_64       [.] multifd_send_pages
>    1.04%  multifdsend_9    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.04%  multifdsend_0    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.04%  multifdsend_3    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.03%  multifdsend_11   qemu-system-x86_64       [.] buffer_zero_avx512
>    1.01%  multifdsend_5    qemu-system-x86_64       [.] buffer_zero_avx512
>    0.99%  multifdsend_7    qemu-system-x86_64       [.] buffer_zero_avx512
>    0.98%  multifdsend_6    qemu-system-x86_64       [.] buffer_zero_avx512
>    0.98%  multifdsend_8    qemu-system-x86_64       [.] buffer_zero_avx512
>    0.95%  multifdsend_13   qemu-system-x86_64       [.] buffer_zero_avx512
>    0.94%  multifdsend_12   qemu-system-x86_64       [.] buffer_zero_avx512
>    0.92%  multifdsend_10   qemu-system-x86_64       [.] buffer_zero_avx512
>    0.89%  live_migration   libc.so.6                [.] __pthread_mutex_unlock_usercnt
>    0.85%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.84%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.84%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.81%  live_migration   libc.so.6                [.] __pthread_mutex_lock
> 
> 1TB: stress (stress --vm 4 --vm-bytes 1024M)
> 
> upstream
> 
> Migration status: completed
> total time: 79302 ms
> downtime: 315 ms
> setup: 307 ms
> transferred ram: 30307307 kbytes
> throughput: 3142.99 mbps
> remaining ram: 0 kbytes
> total ram: 1073742600 kbytes
> duplicate: 263089198 pages
> skipped: 0 pages
> normal: 6972933 pages
> normal bytes: 27891732 kbytes
> dirty sync count: 7
> page size: 4 kbytes
> multifd bytes: 27994987 kbytes
> pages-per-second: 1875902
> precopy ram: 2312314 kbytes
> downtime ram: 4 kbytes
> 
>   35.46%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
>    9.27%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    6.02%  live_migration   qemu-system-x86_64       [.] add_to_iovec
>    3.68%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    2.64%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
>    2.51%  live_migration   qemu-system-x86_64       [.] qemu_put_byte.part.0
>    2.31%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    1.46%  live_migration   qemu-system-x86_64       [.] qemu_put_be32
>    1.23%  live_migration   qemu-system-x86_64       [.] find_next_bit
>    1.05%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    1.03%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    1.01%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    1.01%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    1.01%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    1.00%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.99%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.99%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.99%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.96%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
>    0.95%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.93%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.91%  live_migration   qemu-system-x86_64       [.] ram_save_iterate
>    0.90%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.87%  live_migration   qemu-system-x86_64       [.] kvm_log_clear
>    0.87%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.82%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.82%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.65%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.58%  live_migration   qemu-system-x86_64       [.] save_zero_page_to_file.part.0
>    0.47%  live_migration   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
> 
> zero_page
> 
> 900GB dirty + idle
> 
> mig_mon mm_dirty -m 10000 -p once
> 
> upstream
> 
> Migration status: completed
> total time: 104587 ms
> downtime: 381 ms
> setup: 311 ms
> transferred ram: 943318066 kbytes
> throughput: 74107.80 mbps
> remaining ram: 0 kbytes
> total ram: 1073742600 kbytes
> duplicate: 33298094 pages
> skipped: 0 pages
> normal: 235142522 pages
> normal bytes: 940570088 kbytes
> dirty sync count: 3
> page size: 4 kbytes
> multifd bytes: 943025391 kbytes
> pages-per-second: 3331126
> precopy ram: 292673 kbytes
> downtime ram: 1 kbytes
> 
>   7.71%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
>    4.55%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.48%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.36%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.36%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.31%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.29%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.27%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.23%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.17%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.06%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.94%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.89%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.59%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.25%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.12%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    2.72%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    1.54%  live_migration   [kernel.kallsyms]        [k] copy_page
>    1.39%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    0.86%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    0.50%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    0.49%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
>    0.26%  multifdsend_7    [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.25%  multifdsend_4    [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.25%  multifdsend_10   [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.25%  multifdsend_9    [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.25%  multifdsend_15   [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.24%  multifdsend_12   [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.23%  multifdsend_5    [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.23%  multifdsend_0    [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.23%  multifdsend_3    [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.21%  multifdsend_14   [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.18%  live_migration   qemu-system-x86_64       [.] find_next_bit
> 
> zero page
> 
> Migration status: completed
> total time: 75601 ms
> downtime: 427 ms
> setup: 269 ms
> transferred ram: 1083999214 kbytes
> throughput: 117879.85 mbps
> remaining ram: 0 kbytes
> total ram: 1073742600 kbytes
> duplicate: 32991750 pages
> skipped: 0 pages
> normal: 232638485 pages
> normal bytes: 930553940 kbytes
> dirty sync count: 3
> page size: 4 kbytes
> multifd bytes: 1083999202 kbytes
> pages-per-second: 3669333
> precopy ram: 1080197079 kbytes
> downtime ram: 3802134 kbytes
> 
>    4.41%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.38%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.37%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.32%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.29%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.29%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.28%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.27%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.16%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.09%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.07%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.07%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.07%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.07%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.07%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.07%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    1.59%  live_migration   libc.so.6                [.] __pthread_mutex_unlock_usercnt
>    1.59%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    1.39%  live_migration   libc.so.6                [.] __pthread_mutex_lock
>    0.80%  live_migration   qemu-system-x86_64       [.] multifd_send_pages
>    0.65%  multifdsend_14   [kernel.kallsyms]        [k] copy_page
>    0.63%  multifdsend_1    [kernel.kallsyms]        [k] copy_page
>    0.58%  live_migration   qemu-system-x86_64       [.] qemu_mutex_lock_impl
>    0.48%  live_migration   qemu-system-x86_64       [.] qemu_mutex_unlock_impl
>    0.40%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    0.29%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    0.26%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
> 
> 4TB idle
> 
> upstream
> 
> Migration status: completed
> total time: 317054 ms
> downtime: 552 ms
> setup: 1045 ms
> transferred ram: 77208692 kbytes
> throughput: 2001.52 mbps
> remaining ram: 0 kbytes
> total ram: 4294968072 kbytes
> duplicate: 1056844269 pages
> skipped: 0 pages
> normal: 16904683 pages
> normal bytes: 67618732 kbytes
> dirty sync count: 3
> page size: 4 kbytes
> multifd bytes: 67919974 kbytes
> pages-per-second: 3477766
> precopy ram: 9288715 kbytes
> downtime ram: 2 kbytes
> 
>  44.27%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
>   10.21%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    6.58%  live_migration   qemu-system-x86_64       [.] add_to_iovec
>    4.25%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    2.70%  live_migration   qemu-system-x86_64       [.] qemu_put_byte.part.0
>    2.43%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    2.34%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
>    1.59%  live_migration   qemu-system-x86_64       [.] qemu_put_be32
>    1.30%  live_migration   qemu-system-x86_64       [.] find_next_bit
>    1.08%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
>    0.98%  live_migration   qemu-system-x86_64       [.] ram_save_iterate
>    0.78%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.74%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.70%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.68%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.67%  live_migration   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.66%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.66%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.64%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.62%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.61%  live_migration   qemu-system-x86_64       [.] save_zero_page_to_file.part.0
>    0.56%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.55%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.54%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.52%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.52%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.52%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.51%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.49%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.45%  live_migration   qemu-system-x86_64       [.] qemu_put_byte
>    0.42%  live_migration   qemu-system-x86_64       [.] save_page_header
>    0.41%  live_migration   qemu-system-x86_64       [.] qemu_put_be64
>    0.35%  live_migration   qemu-system-x86_64       [.] migrate_postcopy_ram
> 
> zero_page
> 
> Migration status: completed
> total time: 215567 ms
> downtime: 500 ms
> setup: 1040 ms
> transferred ram: 3587151463 kbytes
> throughput: 136980.19 mbps
> remaining ram: 0 kbytes
> total ram: 4294968072 kbytes
> duplicate: 1048466740 pages
> skipped: 0 pages
> normal: 16747893 pages
> normal bytes: 66991572 kbytes
> dirty sync count: 3
> page size: 4 kbytes
> multifd bytes: 3587151430 kbytes
> pages-per-second: 4104960
> precopy ram: 3583004863 kbytes
> downtime ram: 4146599 kbytes
> 
>  15.49%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    3.20%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    2.67%  live_migration   qemu-system-x86_64       [.] multifd_queue_page
>    2.33%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    2.19%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
>    1.19%  live_migration   qemu-system-x86_64       [.] find_next_bit
>    1.18%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
>    1.14%  live_migration   qemu-system-x86_64       [.] multifd_send_pages
>    1.02%  multifdsend_10   qemu-system-x86_64       [.] buffer_zero_avx512
>    1.01%  multifdsend_9    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.01%  multifdsend_8    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.00%  multifdsend_5    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.00%  multifdsend_3    qemu-system-x86_64       [.] buffer_zero_avx512
>    1.00%  multifdsend_15   qemu-system-x86_64       [.] buffer_zero_avx512
>    0.99%  multifdsend_2    qemu-system-x86_64       [.] buffer_zero_avx512
>    0.99%  multifdsend_6    qemu-system-x86_64       [.] buffer_zero_avx512
>    0.99%  multifdsend_14   qemu-system-x86_64       [.] buffer_zero_avx512
>    0.99%  multifdsend_0    qemu-system-x86_64       [.] buffer_zero_avx512
>    0.98%  multifdsend_13   qemu-system-x86_64       [.] buffer_zero_avx512
>    0.97%  multifdsend_1    qemu-system-x86_64       [.] buffer_zero_avx512
>    0.97%  multifdsend_7    qemu-system-x86_64       [.] buffer_zero_avx512
>    0.96%  live_migration   [kernel.kallsyms]        [k] futex_wake
>    0.96%  multifdsend_11   qemu-system-x86_64       [.] buffer_zero_avx512
>    0.93%  multifdsend_4    qemu-system-x86_64       [.] buffer_zero_avx512
>    0.88%  multifdsend_12   qemu-system-x86_64       [.] buffer_zero_avx512
>    0.81%  live_migration   [kernel.kallsyms]        [k] send_call_function_single_ipi
>    0.71%  live_migration   qemu-system-x86_64       [.] ram_save_iterate
>    0.63%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
> 
> 4TB dirty + idle
> 
>     mig_mon mm_dirty -m 3900000 -p once
> 
> upstream
> 
> Migration status: completed
> total time: 357581 ms
> downtime: 553 ms
> setup: 1295 ms
> transferred ram: 4080035248 kbytes
> throughput: 93811.30 mbps
> remaining ram: 0 kbytes
> total ram: 4294968072 kbytes
> duplicate: 56507728 pages
> skipped: 0 pages
> normal: 1017239053 pages
> normal bytes: 4068956212 kbytes
> dirty sync count: 3
> page size: 4 kbytes
> multifd bytes: 4079538545 kbytes
> pages-per-second: 3610116
> precopy ram: 496701 kbytes
> downtime ram: 2 kbytes
> 
>    5.07%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
>    4.99%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.99%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.97%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.96%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.95%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.91%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.65%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.56%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.33%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.16%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.83%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.79%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.75%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.73%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.58%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.95%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    0.88%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    0.36%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    0.32%  multifdsend_4    [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.30%  multifdsend_5    [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.30%  multifdsend_2    [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.30%  multifdsend_0    [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.30%  multifdsend_9    [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.30%  multifdsend_7    [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.30%  multifdsend_10   [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.26%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
>    0.22%  multifdsend_8    [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.22%  multifdsend_11   [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.19%  multifdsend_13   [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.19%  multifdsend_3    [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.17%  multifdsend_12   [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.15%  multifdsend_14   [kernel.kallsyms]        [k] tcp_sendmsg_locked
>    0.14%  multifdsend_10   [kernel.kallsyms]        [k] _copy_from_iter
> 
> zero_page
> 
> Migration status: completed
> total time: 317428 ms
> downtime: 744 ms
> setup: 1192 ms
> transferred ram: 4340691359 kbytes
> throughput: 112444.34 mbps
> remaining ram: 0 kbytes
> total ram: 4294968072 kbytes
> duplicate: 55993692 pages
> normal: 1005801180 pages
> normal bytes: 4023204720 kbytes
> dirty sync count: 3
> page size: 4 kbytes
> multifd bytes: 4340691312 kbytes
> pages-per-second: 3417846
> precopy ram: 4336921795 kbytes
> downtime ram: 3769564 kbytes
> 
>   4.38%  multifdsend_5    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.38%  multifdsend_10   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.37%  multifdsend_11   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.34%  multifdsend_3    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.29%  multifdsend_4    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.28%  multifdsend_9    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.27%  multifdsend_12   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.26%  multifdsend_1    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.23%  multifdsend_13   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.18%  multifdsend_6    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    4.18%  multifdsend_2    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.90%  multifdsend_0    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.86%  multifdsend_14   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.84%  multifdsend_7    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.73%  multifdsend_8    [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    3.73%  multifdsend_15   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    1.59%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    1.45%  live_migration   libc.so.6                [.] __pthread_mutex_unlock_usercnt
>    1.28%  live_migration   libc.so.6                [.] __pthread_mutex_lock
>    1.02%  multifdsend_8    [kernel.kallsyms]        [k] copy_page
>    0.96%  multifdsend_15   [kernel.kallsyms]        [k] copy_page
>    0.83%  multifdsend_14   [kernel.kallsyms]        [k] copy_page
>    0.81%  multifdsend_7    [kernel.kallsyms]        [k] copy_page
>    0.75%  multifdsend_0    [kernel.kallsyms]        [k] copy_page
>    0.69%  live_migration   qemu-system-x86_64       [.] multifd_send_pages
>    0.48%  live_migration   qemu-system-x86_64       [.] qemu_mutex_unlock_impl
>    0.48%  live_migration   qemu-system-x86_64       [.] qemu_mutex_lock_impl
> 
> [v5]
> 
> In this version:
> - Rebase to latest
> - Address all comments
> - statistics about zero pages are now right (or at least much better than before)
> - changed how we calculate the amount of transferred ram
> - numbers, because who doesn't like numbers?
> 
> Everything has been checked with a guest launched like the following
> command.  Migration is running through localhost.  Will send numbers
> with real hardware as soon as I get access to the machines that have
> it (I checked with previous versions already, but not this one).
> 
> [removed example]
> 
> Please review, Juan.
> 
> [v4]
> In this version
> - Rebase to latest
> - Address all comments from previous versions
> - code cleanup
> 
> Please review.
> 
> [v2]
> This is a rebase against the latest master.
> 
> And the reason for the resend is to configure git-publish properly and
> hope that this time git-publish sends all the patches.
> 
> Please, review.
> 
> [v1]
> Since Friday version:
> - More cleanups on the code
> - Remove repeated calls to qemu_target_page_size()
> - Establish normal pages and zero pages
> - detect zero pages on the multifd threads
> - send zero pages through the multifd channels.
> - reviews by Richard addressed.
> 
> It passes migration-test, so it should be perfect O:+)
> 
> ToDo for next version:
> - check the version changes
>   I need 6.2 to be out to check for 7.0.
>   This code doesn't exist at all for that reason.
> - Send measurements of the differences
> 
> Please, review.
> 
> [
> 
> Friday version that just created a single writev instead of
> write+writev.
> 
> ]
> 
> Right now, multifd does a write() for the header and a writev() for
> each group of pages.  Simplify it so we send the header as another
> member of the IOV.
> 
> Once there, I got several simplifications:
> * is_zero_range() was used only once, just use its body.
> * same with is_zero_page().
> * Be consistent and use the offset inside the ramblock everywhere.
> * Now that we have the offsets of the ramblock, we can drop the iov.
> * Now that nothing uses iovs except the NOCOMP method, move the iovs
>   from pages to the methods.
> * Now we can use iov's with a single field for zlib/zstd.
> * The send_write() method is the same in all the implementations, so
>   use it directly.
> * Now, we can use a single writev() to write everything (see the
>   sketch below).
> 
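> A minimal sketch of the single-writev idea (illustrative only, this is
> not the actual QEMU code; send_packet(), fd and the page array are
> made-up names for the example):
> 
> #include <stddef.h>
> #include <stdint.h>
> #include <sys/types.h>
> #include <sys/uio.h>
> 
> /* Put the packet header in iov[0] and send header + pages with a
>  * single writev() instead of write() + writev().  A real
>  * implementation also has to handle short writes, EINTR and IOV_MAX. */
> static ssize_t send_packet(int fd, void *header, size_t header_len,
>                            uint8_t **pages, int npages, size_t page_size)
> {
>     struct iovec iov[npages + 1];    /* one extra slot for the header */
> 
>     iov[0].iov_base = header;
>     iov[0].iov_len = header_len;
>     for (int i = 0; i < npages; i++) {
>         iov[i + 1].iov_base = pages[i];
>         iov[i + 1].iov_len = page_size;
>     }
>     return writev(fd, iov, npages + 1);
> }
> 
> The series reserves the extra iovec slot for the header in the same way.
> 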
> ToDo: Move zero page detection to the multifd threads.
> 
> With RAM sizes in the terabytes, the detection of zero pages takes
> too much time on the main thread.
> 
> The last patch of the series removes the detection of zero pages in the
> main thread for multifd.  In the next post of the series, I will add how to
> detect the zero pages and send them on multifd channels.
> 
> Please review.
> 
> Later, Juan.
> 
> Juan Quintela (13):
>   multifd: Document the locking of MultiFD{Send/Recv}Params
>   multifd: Create page_size fields into both MultiFD{Recv,Send}Params
>   multifd: Create page_count fields into both MultiFD{Recv,Send}Params
>   migration: Export ram_transferred_ram()
>   multifd: Count the number of bytes sent correctly
>   migration: Make ram_save_target_page() a pointer
>   multifd: Make flags field thread local
>   multifd: Prepare to send a packet without the mutex held
>   multifd: Add property to enable/disable zero_page
>   migration: Export ram_release_page()
>   multifd: Support for zero pages transmission
>   multifd: Zero pages transmission
>   migration: Use multifd before we check for the zero page
> 
>  migration/migration.h    |   3 +
>  migration/multifd.h      | 118 +++++++++++++++++++++++----------
>  migration/ram.h          |   3 +
>  hw/core/machine.c        |   4 +-
>  migration/migration.c    |  11 +++
>  migration/multifd-zlib.c |  12 ++--
>  migration/multifd-zstd.c |  12 ++--
>  migration/multifd.c      | 140 ++++++++++++++++++++++++++++-----------
>  migration/ram.c          |  48 +++++++++++---
>  migration/trace-events   |   8 +--
>  10 files changed, 258 insertions(+), 101 deletions(-)
> 
> -- 
> 2.35.1
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads
  2022-05-12 13:40 ` [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads Dr. David Alan Gilbert
@ 2022-05-16 10:45   ` Juan Quintela
  0 siblings, 0 replies; 20+ messages in thread
From: Juan Quintela @ 2022-05-16 10:45 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, Eduardo Habkost, Peter Xu,
	Philippe Mathieu-Daudé,
	Yanan Wang, Leonardo Bras, Marcel Apfelbaum

"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Juan Quintela (quintela@redhat.com) wrote:

>> 16GB guest
>>                 Precopy            upstream          zero page
>>                 Time    Downtime   Time    Downtime  Time    Downtime
>> clean idle      1548     93         1359   48         866    167

>                                            866/1359 = 64%


>> dirty idle     16222    220         2092   371       1870    258

>                                            1870/2092 = 89%

>> busy 4GB       don't converge      31000   308       1604    371
>> 
>> In the dirty idle, there is some weirdness in the precopy case, I
>> tried several times and it always took too much time.  It should be
>> faster.
>> 
>> In the busy 4GB case, precopy don't converge (expected) and without
>> zero page, multifd is on the limit, it _almost_ don't convrge, it took
>> 187 iterations to converge.
>> 
>> 1TB
>>                 Precopy            upstream          zero page
>>                 Time    Downtime   Time    Downtime  Time    Downtime
>> clean idle     83174    381        72075   345       52966   273

>                                           52966/72075=74%

>> dirty idle                        104587   381       75601   269

>                                           75601/104587=72%

>> busy 2GB                           79912   345       58953   348
>> 
>> I only tried the clean idle case with 1TB.  Notice that it is already
>> significantively slower.  With 1TB RAM, zero page is clearly superior in all tests.
>> 
>> 4TB
>>                 upstream          zero page
>>                 Time    Downtime  Time    Downtime
>> clean idle      317054  552       215567  500

>                 215567/317054 = 68%

>> dirty idle      357581  553       317428  744

>                 317428/357581 = 89%

>
> The 1TB dirty/idle is a bit of an unusual outlier at 72% time; but even
> so, the 89% on the 16GB/4TB dirty case is a useful improvement - I wasn't
> expecting the dirty case to be as good - I wonder if there's some side
> benefit, like meaning the page is only read by the data threads and not
> also read by the main thread so only in one cache?

That could help it, but I think that it is much simpler than that:

live_migration thread with upstream

>    5.07%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
>    0.95%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    0.88%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    0.36%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    0.26%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable

Almost 8% CPU.

live migration with zero page:

>    1.59%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    1.45%  live_migration   libc.so.6                [.] __pthread_mutex_unlock_usercnt
>    1.28%  live_migration   libc.so.6                [.] __pthread_mutex_lock
>    0.69%  live_migration   qemu-system-x86_64       [.] multifd_send_pages
>    0.48%  live_migration   qemu-system-x86_64       [.] qemu_mutex_unlock_impl
>    0.48%  live_migration   qemu-system-x86_64       [.] qemu_mutex_lock_impl

less than 6% CPU, and remember, we are going way faster, so we are doing
much more work here.  I *think* that it is as much related to the fact
that we are waiting less time for the migration thread.  Remember that
at this point, we are already limited by the network.

I think the zero page case explains it better: we move from upstream:

>  44.27%  live_migration   qemu-system-x86_64       [.] buffer_zero_avx512
>   10.21%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    6.58%  live_migration   qemu-system-x86_64       [.] add_to_iovec
>    4.25%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    2.70%  live_migration   qemu-system-x86_64       [.] qemu_put_byte.part.0
>    2.43%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    2.34%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
>    1.59%  live_migration   qemu-system-x86_64       [.] qemu_put_be32
>    1.30%  live_migration   qemu-system-x86_64       [.] find_next_bit
>    1.08%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
>    0.98%  live_migration   qemu-system-x86_64       [.] ram_save_iterate
>    0.67%  live_migration   [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>    0.61%  live_migration   qemu-system-x86_64       [.] save_zero_page_to_file.part.0
>    0.45%  live_migration   qemu-system-x86_64       [.] qemu_put_byte
>    0.42%  live_migration   qemu-system-x86_64       [.] save_page_header
>    0.41%  live_migration   qemu-system-x86_64       [.] qemu_put_be64
>    0.35%  live_migration   qemu-system-x86_64       [.] migrate_postcopy_ram

More than 80% of the CPU (I am too lazy to do the sum), to the zero
page case with:

>  15.49%  live_migration   qemu-system-x86_64       [.] ram_find_and_save_block.part.0
>    3.20%  live_migration   qemu-system-x86_64       [.] ram_bytes_total_common
>    2.67%  live_migration   qemu-system-x86_64       [.] multifd_queue_page
>    2.33%  live_migration   qemu-system-x86_64       [.] bitmap_test_and_clear_atomic
>    2.19%  live_migration   qemu-system-x86_64       [.] qemu_ram_is_migratable
>    1.19%  live_migration   qemu-system-x86_64       [.] find_next_bit
>    1.18%  live_migration   qemu-system-x86_64       [.] migrate_ignore_shared
>    1.14%  live_migration   qemu-system-x86_64       [.] multifd_send_pages
>    0.96%  live_migration   [kernel.kallsyms]        [k] futex_wake
>    0.81%  live_migration   [kernel.kallsyms]        [k] send_call_function_single_ipi
>    0.71%  live_migration   qemu-system-x86_64       [.] ram_save_iterate

almost 32% (also lazy to do the sum).

> (the 10% improvement on the dirty case is more important to me than the
> more impressive number for the clean case)

Fully agree.  Getting this series to go faster with huge guests (1TB/4TB
guests) was relatively easy.  Being sure that we didn't hurt the smaller
guests was more complicated.  The other added benefit is that we don't
send any RAM page through the migration channel, which makes things
much better because we have way less overhead.
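
To make that concrete, here is a minimal sketch of the approach (the
helper names and signatures are illustrative, simplified from what the
series actually does): each multifd send thread checks the pages queued
for it, and for zero pages only the offset is recorded in the packet,
never the data.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sketch: stand-in for buffer_is_zero()/buffer_zero_avx512() */
static bool page_is_zero(const uint8_t *page, size_t page_size)
{
    return page[0] == 0 && memcmp(page, page + 1, page_size - 1) == 0;
}

/*
 * Run by each send thread on its queued pages: normal pages go into
 * the iovec and get sent, zero pages are only listed by offset.  The
 * migration thread never reads the page contents.
 */
static void split_normal_and_zero(const uint8_t *host,
                                  const uint64_t *offset, int num,
                                  size_t page_size,
                                  uint64_t *normal, uint32_t *normal_num,
                                  uint64_t *zero, uint32_t *zero_num)
{
    *normal_num = 0;
    *zero_num = 0;
    for (int i = 0; i < num; i++) {
        if (page_is_zero(host + offset[i], page_size)) {
            zero[(*zero_num)++] = offset[i];
        } else {
            normal[(*normal_num)++] = offset[i];
        }
    }
}

That is why buffer_zero_avx512 moves from the live_migration thread to
the multifdsend_* threads in the profiles above, and why the main
thread has CPU to spare even though it is pushing more data.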

Later, Juan.



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 01/13] multifd: Document the locking of MultiFD{Send/Recv}Params
  2022-05-10 22:42 ` [PATCH v6 01/13] multifd: Document the locking of MultiFD{Send/Recv}Params Juan Quintela
@ 2022-05-16 13:14   ` Dr. David Alan Gilbert
  2022-05-18  8:40     ` Juan Quintela
  0 siblings, 1 reply; 20+ messages in thread
From: Dr. David Alan Gilbert @ 2022-05-16 13:14 UTC (permalink / raw)
  To: Juan Quintela
  Cc: qemu-devel, Eduardo Habkost, Peter Xu,
	Philippe Mathieu-Daudé,
	Yanan Wang, Leonardo Bras, Marcel Apfelbaum

* Juan Quintela (quintela@redhat.com) wrote:
> Reorder the structures so we can know if the fields are:
> - Read only
> - Their own locking (i.e. sems)
> - Protected by 'mutex'
> - Only for the multifd channel
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  migration/multifd.h | 86 +++++++++++++++++++++++++++------------------
>  1 file changed, 51 insertions(+), 35 deletions(-)
> 
> diff --git a/migration/multifd.h b/migration/multifd.h
> index 7d0effcb03..f1f88c6737 100644
> --- a/migration/multifd.h
> +++ b/migration/multifd.h
> @@ -65,7 +65,9 @@ typedef struct {
>  } MultiFDPages_t;
>  
>  typedef struct {
> -    /* this fields are not changed once the thread is created */
> +    /* Fields are only written at creation/deletion time */
> +    /* No lock required for them, they are read only */
> +
>      /* channel number */
>      uint8_t id;
>      /* channel thread name */
> @@ -74,37 +76,45 @@ typedef struct {
>      QemuThread thread;
>      /* communication channel */
>      QIOChannel *c;
> -    /* sem where to wait for more work */
> -    QemuSemaphore sem;
> -    /* this mutex protects the following parameters */
> -    QemuMutex mutex;
> -    /* is this channel thread running */
> -    bool running;
> -    /* should this thread finish */
> -    bool quit;
>      /* is the yank function registered */
>      bool registered_yank;
> +    /* packet allocated len */
> +    uint32_t packet_len;
> +
> +    /* sem where to wait for more work */
> +    QemuSemaphore sem;
> +    /* syncs main thread and channels */
> +    QemuSemaphore sem_sync;
> +
> +    /* this mutex protects the following parameters */
> +    QemuMutex mutex;
> +    /* is this channel thread running */
> +    bool running;
> +    /* should this thread finish */
> +    bool quit;
> +    /* multifd flags for each packet */
> +    uint32_t flags;
> +    /* global number of generated multifd packets */
> +    uint64_t packet_num;

Is there a way to explain why packet_num, being global, is inside
SendParams?  I understand why num_packets is - because
that's per channel; so why is a global inside the params
(and having two things with almost the same name is very confusing!)

Dave

>      /* thread has work to do */
>      int pending_job;
> -    /* array of pages to sent */
> +    /* array of pages to send.
> +     * The owner of 'pages' depends on the 'pending_job' value:
> +     * pending_job == 0 -> migration_thread can use it.
> +     * pending_job != 0 -> multifd_channel can use it.
> +     */
>      MultiFDPages_t *pages;
> -    /* packet allocated len */
> -    uint32_t packet_len;
> +
> +    /* thread local variables. No locking required */
> +
>      /* pointer to the packet */
>      MultiFDPacket_t *packet;
> -    /* multifd flags for each packet */
> -    uint32_t flags;
>      /* size of the next packet that contains pages */
>      uint32_t next_packet_size;
> -    /* global number of generated multifd packets */
> -    uint64_t packet_num;
> -    /* thread local variables */
>      /* packets sent through this channel */
>      uint64_t num_packets;
>      /* non zero pages sent through this channel */
>      uint64_t total_normal_pages;
> -    /* syncs main thread and channels */
> -    QemuSemaphore sem_sync;
>      /* buffers to send */
>      struct iovec *iov;
>      /* number of iovs used */
> @@ -118,7 +128,9 @@ typedef struct {
>  }  MultiFDSendParams;
>  
>  typedef struct {
> -    /* this fields are not changed once the thread is created */
> +    /* Fields are only written at creation/deletion time */
> +    /* No lock required for them, they are read only */
> +
>      /* channel number */
>      uint8_t id;
>      /* channel thread name */
> @@ -127,31 +139,35 @@ typedef struct {
>      QemuThread thread;
>      /* communication channel */
>      QIOChannel *c;
> +    /* packet allocated len */
> +    uint32_t packet_len;
> +
> +    /* syncs main thread and channels */
> +    QemuSemaphore sem_sync;
> +
>      /* this mutex protects the following parameters */
>      QemuMutex mutex;
>      /* is this channel thread running */
>      bool running;
>      /* should this thread finish */
>      bool quit;
> +    /* multifd flags for each packet */
> +    uint32_t flags;
> +    /* global number of generated multifd packets */
> +    uint64_t packet_num;
> +
> +    /* thread local variables. No locking required */
> +
> +    /* pointer to the packet */
> +    MultiFDPacket_t *packet;
> +    /* size of the next packet that contains pages */
> +    uint32_t next_packet_size;
> +    /* packets sent through this channel */
> +    uint64_t num_packets;
>      /* ramblock host address */
>      uint8_t *host;
> -    /* packet allocated len */
> -    uint32_t packet_len;
> -    /* pointer to the packet */
> -    MultiFDPacket_t *packet;
> -    /* multifd flags for each packet */
> -    uint32_t flags;
> -    /* global number of generated multifd packets */
> -    uint64_t packet_num;
> -    /* thread local variables */
> -    /* size of the next packet that contains pages */
> -    uint32_t next_packet_size;
> -    /* packets sent through this channel */
> -    uint64_t num_packets;
>      /* non zero pages recv through this channel */
>      uint64_t total_normal_pages;
> -    /* syncs main thread and channels */
> -    QemuSemaphore sem_sync;
>      /* buffers to recv */
>      struct iovec *iov;
>      /* Pages that are not zero */
> -- 
> 2.35.1
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 02/13] multifd: Create page_size fields into both MultiFD{Recv,Send}Params
  2022-05-10 22:42 ` [PATCH v6 02/13] multifd: Create page_size fields into both MultiFD{Recv, Send}Params Juan Quintela
@ 2022-05-17  8:44   ` Dr. David Alan Gilbert
  2022-05-18  8:48     ` Juan Quintela
  0 siblings, 1 reply; 20+ messages in thread
From: Dr. David Alan Gilbert @ 2022-05-17  8:44 UTC (permalink / raw)
  To: Juan Quintela, peter.maydell
  Cc: qemu-devel, Eduardo Habkost, Peter Xu,
	Philippe Mathieu-Daudé,
	Yanan Wang, Leonardo Bras, Marcel Apfelbaum

* Juan Quintela (quintela@redhat.com) wrote:
> We were calling qemu_target_page_size() left and right.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>

(Copying in Peter Maydell)
Your problem here is that most of these files are target independent,
so you end up calling the qemu_target_page_size functions, which I guess
you're seeing pop up in some perf trace?
I mean they're trivial functions but I guess you do get the function
call.

I wonder about the following patch instead
(Note I've removed the const on the structure here); I wonder how this
does performance-wise for everyone:


From abc7da46736b18b6138868ccc0b11901169e1dfd Mon Sep 17 00:00:00 2001
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Date: Mon, 16 May 2022 19:54:31 +0100
Subject: [PATCH] target-page: Maintain target_page variable even for
 non-variable
Content-type: text/plain

On architectures that define TARGET_PAGE_BITS_VARY, the 'target_page'
structure gets filled in at run time with the number of bits, and the
TARGET_PAGE_BITS and TARGET_PAGE macros use that rather than being
constant.

On non-variable pagesize systems target_page is not filled in, and we
rely on TARGET_PAGE_SIZE being compile time defined.

The problem is that for source files that are target-independent
they end up calling qemu_target_page_size to read the size, and that
function call is annoying.

Improve this by always filling in 'target_page' even for non-variable
size CPUs, and inlining the functions that previously returned
the macro values (that may have been constant) to return the
values read from target_page.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 include/exec/cpu-all.h     |  4 ++--
 include/exec/page-vary.h   |  2 ++
 include/exec/target_page.h | 11 +++++++++--
 page-vary.c                |  2 --
 softmmu/physmem.c          | 10 ----------
 5 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 5d5290deb5..6a498fa033 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -214,9 +214,9 @@ static inline void stl_phys_notdirty(AddressSpace *as, hwaddr addr, uint32_t val
 
 /* page related stuff */
 
+#include "exec/page-vary.h"
+
 #ifdef TARGET_PAGE_BITS_VARY
-# include "exec/page-vary.h"
-extern const TargetPageBits target_page;
 #ifdef CONFIG_DEBUG_TCG
 #define TARGET_PAGE_BITS   ({ assert(target_page.decided); target_page.bits; })
 #define TARGET_PAGE_MASK   ({ assert(target_page.decided); \
diff --git a/include/exec/page-vary.h b/include/exec/page-vary.h
index ebbe9b169b..31cb9dd9dd 100644
--- a/include/exec/page-vary.h
+++ b/include/exec/page-vary.h
@@ -26,6 +26,8 @@ typedef struct {
     uint64_t mask;
 } TargetPageBits;
 
+extern TargetPageBits target_page;
+
 #ifdef IN_PAGE_VARY
 extern bool set_preferred_target_page_bits_common(int bits);
 extern void finalize_target_page_bits_common(int min);
diff --git a/include/exec/target_page.h b/include/exec/target_page.h
index 96726c36a4..e718b145b3 100644
--- a/include/exec/target_page.h
+++ b/include/exec/target_page.h
@@ -13,9 +13,16 @@
 
 #ifndef EXEC_TARGET_PAGE_H
 #define EXEC_TARGET_PAGE_H
+#include "exec/page-vary.h"
+
+inline int qemu_target_page_bits(void) {
+    return target_page.bits;
+}
+
+inline size_t qemu_target_page_size(void) {
+    return 1 << target_page.bits;
+}
 
-size_t qemu_target_page_size(void);
-int qemu_target_page_bits(void);
 int qemu_target_page_bits_min(void);
 
 #endif
diff --git a/page-vary.c b/page-vary.c
index 343b4adb95..3f81144cda 100644
--- a/page-vary.c
+++ b/page-vary.c
@@ -35,7 +35,5 @@ bool set_preferred_target_page_bits(int bits)
 
 void finalize_target_page_bits(void)
 {
-#ifdef TARGET_PAGE_BITS_VARY
     finalize_target_page_bits_common(TARGET_PAGE_BITS_MIN);
-#endif
 }
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 657841eed0..2117476081 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -3515,16 +3515,6 @@ int cpu_memory_rw_debug(CPUState *cpu, vaddr addr,
  * Allows code that needs to deal with migration bitmaps etc to still be built
  * target independent.
  */
-size_t qemu_target_page_size(void)
-{
-    return TARGET_PAGE_SIZE;
-}
-
-int qemu_target_page_bits(void)
-{
-    return TARGET_PAGE_BITS;
-}
-
 int qemu_target_page_bits_min(void)
 {
     return TARGET_PAGE_BITS_MIN;
-- 
2.36.1

> ---
>  migration/multifd.h      |  4 ++++
>  migration/multifd-zlib.c | 12 +++++-------
>  migration/multifd-zstd.c | 12 +++++-------
>  migration/multifd.c      | 18 ++++++++----------
>  4 files changed, 22 insertions(+), 24 deletions(-)
> 
> diff --git a/migration/multifd.h b/migration/multifd.h
> index f1f88c6737..4de80d9e53 100644
> --- a/migration/multifd.h
> +++ b/migration/multifd.h
> @@ -80,6 +80,8 @@ typedef struct {
>      bool registered_yank;
>      /* packet allocated len */
>      uint32_t packet_len;
> +    /* guest page size */
> +    uint32_t page_size;
>  
>      /* sem where to wait for more work */
>      QemuSemaphore sem;
> @@ -141,6 +143,8 @@ typedef struct {
>      QIOChannel *c;
>      /* packet allocated len */
>      uint32_t packet_len;
> +    /* guest page size */
> +    uint32_t page_size;
>  
>      /* syncs main thread and channels */
>      QemuSemaphore sem_sync;
> diff --git a/migration/multifd-zlib.c b/migration/multifd-zlib.c
> index 3a7ae44485..28349ff2e0 100644
> --- a/migration/multifd-zlib.c
> +++ b/migration/multifd-zlib.c
> @@ -100,7 +100,6 @@ static void zlib_send_cleanup(MultiFDSendParams *p, Error **errp)
>  static int zlib_send_prepare(MultiFDSendParams *p, Error **errp)
>  {
>      struct zlib_data *z = p->data;
> -    size_t page_size = qemu_target_page_size();
>      z_stream *zs = &z->zs;
>      uint32_t out_size = 0;
>      int ret;
> @@ -114,7 +113,7 @@ static int zlib_send_prepare(MultiFDSendParams *p, Error **errp)
>              flush = Z_SYNC_FLUSH;
>          }
>  
> -        zs->avail_in = page_size;
> +        zs->avail_in = p->page_size;
>          zs->next_in = p->pages->block->host + p->normal[i];
>  
>          zs->avail_out = available;
> @@ -220,12 +219,11 @@ static void zlib_recv_cleanup(MultiFDRecvParams *p)
>  static int zlib_recv_pages(MultiFDRecvParams *p, Error **errp)
>  {
>      struct zlib_data *z = p->data;
> -    size_t page_size = qemu_target_page_size();
>      z_stream *zs = &z->zs;
>      uint32_t in_size = p->next_packet_size;
>      /* we measure the change of total_out */
>      uint32_t out_size = zs->total_out;
> -    uint32_t expected_size = p->normal_num * page_size;
> +    uint32_t expected_size = p->normal_num * p->page_size;
>      uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
>      int ret;
>      int i;
> @@ -252,7 +250,7 @@ static int zlib_recv_pages(MultiFDRecvParams *p, Error **errp)
>              flush = Z_SYNC_FLUSH;
>          }
>  
> -        zs->avail_out = page_size;
> +        zs->avail_out = p->page_size;
>          zs->next_out = p->host + p->normal[i];
>  
>          /*
> @@ -266,8 +264,8 @@ static int zlib_recv_pages(MultiFDRecvParams *p, Error **errp)
>          do {
>              ret = inflate(zs, flush);
>          } while (ret == Z_OK && zs->avail_in
> -                             && (zs->total_out - start) < page_size);
> -        if (ret == Z_OK && (zs->total_out - start) < page_size) {
> +                             && (zs->total_out - start) < p->page_size);
> +        if (ret == Z_OK && (zs->total_out - start) < p->page_size) {
>              error_setg(errp, "multifd %u: inflate generated too few output",
>                         p->id);
>              return -1;
> diff --git a/migration/multifd-zstd.c b/migration/multifd-zstd.c
> index d788d309f2..f4a8e1ed1f 100644
> --- a/migration/multifd-zstd.c
> +++ b/migration/multifd-zstd.c
> @@ -113,7 +113,6 @@ static void zstd_send_cleanup(MultiFDSendParams *p, Error **errp)
>  static int zstd_send_prepare(MultiFDSendParams *p, Error **errp)
>  {
>      struct zstd_data *z = p->data;
> -    size_t page_size = qemu_target_page_size();
>      int ret;
>      uint32_t i;
>  
> @@ -128,7 +127,7 @@ static int zstd_send_prepare(MultiFDSendParams *p, Error **errp)
>              flush = ZSTD_e_flush;
>          }
>          z->in.src = p->pages->block->host + p->normal[i];
> -        z->in.size = page_size;
> +        z->in.size = p->page_size;
>          z->in.pos = 0;
>  
>          /*
> @@ -241,8 +240,7 @@ static int zstd_recv_pages(MultiFDRecvParams *p, Error **errp)
>  {
>      uint32_t in_size = p->next_packet_size;
>      uint32_t out_size = 0;
> -    size_t page_size = qemu_target_page_size();
> -    uint32_t expected_size = p->normal_num * page_size;
> +    uint32_t expected_size = p->normal_num * p->page_size;
>      uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
>      struct zstd_data *z = p->data;
>      int ret;
> @@ -265,7 +263,7 @@ static int zstd_recv_pages(MultiFDRecvParams *p, Error **errp)
>  
>      for (i = 0; i < p->normal_num; i++) {
>          z->out.dst = p->host + p->normal[i];
> -        z->out.size = page_size;
> +        z->out.size = p->page_size;
>          z->out.pos = 0;
>  
>          /*
> @@ -279,8 +277,8 @@ static int zstd_recv_pages(MultiFDRecvParams *p, Error **errp)
>          do {
>              ret = ZSTD_decompressStream(z->zds, &z->out, &z->in);
>          } while (ret > 0 && (z->in.size - z->in.pos > 0)
> -                         && (z->out.pos < page_size));
> -        if (ret > 0 && (z->out.pos < page_size)) {
> +                         && (z->out.pos < p->page_size));
> +        if (ret > 0 && (z->out.pos < p->page_size)) {
>              error_setg(errp, "multifd %u: decompressStream buffer too small",
>                         p->id);
>              return -1;
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 9ea4f581e2..f15fed5f1f 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -87,15 +87,14 @@ static void nocomp_send_cleanup(MultiFDSendParams *p, Error **errp)
>  static int nocomp_send_prepare(MultiFDSendParams *p, Error **errp)
>  {
>      MultiFDPages_t *pages = p->pages;
> -    size_t page_size = qemu_target_page_size();
>  
>      for (int i = 0; i < p->normal_num; i++) {
>          p->iov[p->iovs_num].iov_base = pages->block->host + p->normal[i];
> -        p->iov[p->iovs_num].iov_len = page_size;
> +        p->iov[p->iovs_num].iov_len = p->page_size;
>          p->iovs_num++;
>      }
>  
> -    p->next_packet_size = p->normal_num * page_size;
> +    p->next_packet_size = p->normal_num * p->page_size;
>      p->flags |= MULTIFD_FLAG_NOCOMP;
>      return 0;
>  }
> @@ -139,7 +138,6 @@ static void nocomp_recv_cleanup(MultiFDRecvParams *p)
>  static int nocomp_recv_pages(MultiFDRecvParams *p, Error **errp)
>  {
>      uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
> -    size_t page_size = qemu_target_page_size();
>  
>      if (flags != MULTIFD_FLAG_NOCOMP) {
>          error_setg(errp, "multifd %u: flags received %x flags expected %x",
> @@ -148,7 +146,7 @@ static int nocomp_recv_pages(MultiFDRecvParams *p, Error **errp)
>      }
>      for (int i = 0; i < p->normal_num; i++) {
>          p->iov[i].iov_base = p->host + p->normal[i];
> -        p->iov[i].iov_len = page_size;
> +        p->iov[i].iov_len = p->page_size;
>      }
>      return qio_channel_readv_all(p->c, p->iov, p->normal_num, errp);
>  }
> @@ -281,8 +279,7 @@ static void multifd_send_fill_packet(MultiFDSendParams *p)
>  static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
>  {
>      MultiFDPacket_t *packet = p->packet;
> -    size_t page_size = qemu_target_page_size();
> -    uint32_t page_count = MULTIFD_PACKET_SIZE / page_size;
> +    uint32_t page_count = MULTIFD_PACKET_SIZE / p->page_size;
>      RAMBlock *block;
>      int i;
>  
> @@ -344,7 +341,7 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
>      for (i = 0; i < p->normal_num; i++) {
>          uint64_t offset = be64_to_cpu(packet->offset[i]);
>  
> -        if (offset > (block->used_length - page_size)) {
> +        if (offset > (block->used_length - p->page_size)) {
>              error_setg(errp, "multifd: offset too long %" PRIu64
>                         " (max " RAM_ADDR_FMT ")",
>                         offset, block->used_length);
> @@ -433,8 +430,7 @@ static int multifd_send_pages(QEMUFile *f)
>      p->packet_num = multifd_send_state->packet_num++;
>      multifd_send_state->pages = p->pages;
>      p->pages = pages;
> -    transferred = ((uint64_t) pages->num) * qemu_target_page_size()
> -                + p->packet_len;
> +    transferred = ((uint64_t) pages->num) * p->page_size + p->packet_len;
>      qemu_file_update_transfer(f, transferred);
>      ram_counters.multifd_bytes += transferred;
>      ram_counters.transferred += transferred;
> @@ -898,6 +894,7 @@ int multifd_save_setup(Error **errp)
>          /* We need one extra place for the packet header */
>          p->iov = g_new0(struct iovec, page_count + 1);
>          p->normal = g_new0(ram_addr_t, page_count);
> +        p->page_size = qemu_target_page_size();
>          socket_send_channel_create(multifd_new_send_channel_async, p);
>      }
>  
> @@ -1138,6 +1135,7 @@ int multifd_load_setup(Error **errp)
>          p->name = g_strdup_printf("multifdrecv_%d", i);
>          p->iov = g_new0(struct iovec, page_count);
>          p->normal = g_new0(ram_addr_t, page_count);
> +        p->page_size = qemu_target_page_size();
>      }
>  
>      for (i = 0; i < thread_count; i++) {
> -- 
> 2.35.1
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: [PATCH v6 01/13] multifd: Document the locking of MultiFD{Send/Recv}Params
  2022-05-16 13:14   ` Dr. David Alan Gilbert
@ 2022-05-18  8:40     ` Juan Quintela
  0 siblings, 0 replies; 20+ messages in thread
From: Juan Quintela @ 2022-05-18  8:40 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, Eduardo Habkost, Peter Xu,
	Philippe Mathieu-Daudé,
	Yanan Wang, Leonardo Bras, Marcel Apfelbaum

"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> Reorder the structures so we can know if the fields are:
>> - Read only
>> - Their own locking (i.e. sems)
>> - Protected by 'mutex'
>> - Only for the multifd channel
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> ---
>>  migration/multifd.h | 86 +++++++++++++++++++++++++++------------------
>>  1 file changed, 51 insertions(+), 35 deletions(-)
>> 
>> diff --git a/migration/multifd.h b/migration/multifd.h
>> index 7d0effcb03..f1f88c6737 100644
>> --- a/migration/multifd.h
>> +++ b/migration/multifd.h
>> @@ -65,7 +65,9 @@ typedef struct {
>>  } MultiFDPages_t;
>>  
>>  typedef struct {
>> -    /* this fields are not changed once the thread is created */
>> +    /* Fields are only written at creation/deletion time */
>> +    /* No lock required for them, they are read only */
>> +
>>      /* channel number */
>>      uint8_t id;
>>      /* channel thread name */
>> @@ -74,37 +76,45 @@ typedef struct {
>>      QemuThread thread;
>>      /* communication channel */
>>      QIOChannel *c;
>> -    /* sem where to wait for more work */
>> -    QemuSemaphore sem;
>> -    /* this mutex protects the following parameters */
>> -    QemuMutex mutex;
>> -    /* is this channel thread running */
>> -    bool running;
>> -    /* should this thread finish */
>> -    bool quit;
>>      /* is the yank function registered */
>>      bool registered_yank;
>> +    /* packet allocated len */
>> +    uint32_t packet_len;
>> +
>> +    /* sem where to wait for more work */
>> +    QemuSemaphore sem;
>> +    /* syncs main thread and channels */
>> +    QemuSemaphore sem_sync;
>> +
>> +    /* this mutex protects the following parameters */
>> +    QemuMutex mutex;
>> +    /* is this channel thread running */
>> +    bool running;
>> +    /* should this thread finish */
>> +    bool quit;
>> +    /* multifd flags for each packet */
>> +    uint32_t flags;
>> +    /* global number of generated multifd packets */
>> +    uint64_t packet_num;
>
> Is there a way to explain why packet_num, being global, is inside
> SendParams?  I understand why num_packets is - because
> that's per channel; so why is a global inside the params
> (and having two things with almost the same name is very confusing!)

Ok, I will try to improve the documentation (it was already there).

Each packet that we send (independently of which channel we send it
through) has a packet number that is unique across all channels (i.e.
not only within a single channel).  The number is assigned in
multifd_send_pages(), and the "global" value is stored in
multifd_send_state.

This field is _where_ we "transport" it to the real packet.

We have that field in:

- multifd_send_state, from which we copy the current value to
- MultiFDSendParams, from which we copy that value to
- MultiFDPacket_t.

Notice that the only place where we change the value is
multifd_send_state; once we put a value into MultiFDSendParams, it stays
constant until the next packet.
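
To make the flow concrete, here is a trimmed sketch of the two places
involved (only the packet_num handling is shown; everything else is
elided):

    /* migration thread, multifd_send_pages(), with p->mutex held */
    p->packet_num = multifd_send_state->packet_num++;

    /* channel thread, multifd_send_fill_packet(); the value is stable
     * until this channel gets its next job */
    packet->packet_num = cpu_to_be64(p->packet_num);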

So how about:

/* assigned global packet number for this packet */

??

I am open to better names.

Later, Juan.





> Dave
>
>>      /* thread has work to do */
>>      int pending_job;
>> -    /* array of pages to sent */
>> +    /* array of pages to be sent.
>> +     * The owner of 'pages' depends on the 'pending_job' value:
>> +     * pending_job == 0 -> migration_thread can use it.
>> +     * pending_job != 0 -> multifd_channel can use it.
>> +     */
>>      MultiFDPages_t *pages;
>> -    /* packet allocated len */
>> -    uint32_t packet_len;
>> +
>> +    /* thread local variables. No locking required */
>> +
>>      /* pointer to the packet */
>>      MultiFDPacket_t *packet;
>> -    /* multifd flags for each packet */
>> -    uint32_t flags;
>>      /* size of the next packet that contains pages */
>>      uint32_t next_packet_size;
>> -    /* global number of generated multifd packets */
>> -    uint64_t packet_num;
>> -    /* thread local variables */
>>      /* packets sent through this channel */
>>      uint64_t num_packets;
>>      /* non zero pages sent through this channel */
>>      uint64_t total_normal_pages;
>> -    /* syncs main thread and channels */
>> -    QemuSemaphore sem_sync;
>>      /* buffers to send */
>>      struct iovec *iov;
>>      /* number of iovs used */
>> @@ -118,7 +128,9 @@ typedef struct {
>>  }  MultiFDSendParams;
>>  
>>  typedef struct {
>> -    /* this fields are not changed once the thread is created */
>> +    /* Fields are only written at creation/deletion time */
>> +    /* No lock required for them, they are read only */
>> +
>>      /* channel number */
>>      uint8_t id;
>>      /* channel thread name */
>> @@ -127,31 +139,35 @@ typedef struct {
>>      QemuThread thread;
>>      /* communication channel */
>>      QIOChannel *c;
>> +    /* packet allocated len */
>> +    uint32_t packet_len;
>> +
>> +    /* syncs main thread and channels */
>> +    QemuSemaphore sem_sync;
>> +
>>      /* this mutex protects the following parameters */
>>      QemuMutex mutex;
>>      /* is this channel thread running */
>>      bool running;
>>      /* should this thread finish */
>>      bool quit;
>> +    /* multifd flags for each packet */
>> +    uint32_t flags;
>> +    /* global number of generated multifd packets */
>> +    uint64_t packet_num;
>> +
>> +    /* thread local variables. No locking required */
>> +
>> +    /* pointer to the packet */
>> +    MultiFDPacket_t *packet;
>> +    /* size of the next packet that contains pages */
>> +    uint32_t next_packet_size;
>> +    /* packets sent through this channel */
>> +    uint64_t num_packets;
>>      /* ramblock host address */
>>      uint8_t *host;
>> -    /* packet allocated len */
>> -    uint32_t packet_len;
>> -    /* pointer to the packet */
>> -    MultiFDPacket_t *packet;
>> -    /* multifd flags for each packet */
>> -    uint32_t flags;
>> -    /* global number of generated multifd packets */
>> -    uint64_t packet_num;
>> -    /* thread local variables */
>> -    /* size of the next packet that contains pages */
>> -    uint32_t next_packet_size;
>> -    /* packets sent through this channel */
>> -    uint64_t num_packets;
>>      /* non zero pages recv through this channel */
>>      uint64_t total_normal_pages;
>> -    /* syncs main thread and channels */
>> -    QemuSemaphore sem_sync;
>>      /* buffers to recv */
>>      struct iovec *iov;
>>      /* Pages that are not zero */
>> -- 
>> 2.35.1
>> 




* Re: [PATCH v6 02/13] multifd: Create page_size fields into both MultiFD{Recv,Send}Params
  2022-05-17  8:44   ` [PATCH v6 02/13] multifd: Create page_size fields into both MultiFD{Recv,Send}Params Dr. David Alan Gilbert
@ 2022-05-18  8:48     ` Juan Quintela
  0 siblings, 0 replies; 20+ messages in thread
From: Juan Quintela @ 2022-05-18  8:48 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: peter.maydell, qemu-devel, Eduardo Habkost, Peter Xu,
	Philippe Mathieu-Daudé,
	Yanan Wang, Leonardo Bras, Marcel Apfelbaum, Richard Henderson

"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> We were calling qemu_target_page_size() left and right.
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>

[Adding Richard]

> (Copying in Peter Maydell)
> Your problem here is most of these files are target independent
> so you end up calling the qemu_target_page_size functions, which I guess
> you're seeing popup in some perf trace?
> I mean they're trivial functions but I guess you do get the function
> call.

Hi

There are several problems here:

- Richard complained in previous reviews that we were calling
  qemu_target_page_size() inside loops or more than once per function
  (He was right)

- The qemu_target_page_size() name is so long that it basically forced
  me to split the line at every call site.

- All migration code already assumes that the value is constant for the
  current migration, even if in theory it can change.

So I decided to cache the value in the structure and call it a day.  The
same goes for the page_count field.
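
For reference, a minimal sketch of what the cached field buys on the
receive path (the same shape as nocomp_recv_pages() in the patch):

    /* multifd_load_setup(): runs once per channel */
    p->page_size = qemu_target_page_size();

    /* hot path: no helper call, and the lines fit again */
    for (int i = 0; i < p->normal_num; i++) {
        p->iov[i].iov_base = p->host + p->normal[i];
        p->iov[i].iov_len = p->page_size;
    }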

I have never seen that function on a performance profile, so this is
just a taste/aesthetic issue.

I think your patch is still good, but it doesn't cover any of the issues
that I just listed.

Thanks, Juan.


>
> I wonder about the following patch instead
> (Note i've removed the const on the structure here); I wonder how this
> does performance wise for everyone:
>
>
> From abc7da46736b18b6138868ccc0b11901169e1dfd Mon Sep 17 00:00:00 2001
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Date: Mon, 16 May 2022 19:54:31 +0100
> Subject: [PATCH] target-page: Maintain target_page variable even for
>  non-variable
> Content-type: text/plain
>
> On architectures that define TARGET_PAGE_BITS_VARY, the 'target_page'
> structure gets filled in at run time by the number of bits and the
> TARGET_PAGE_BITS and TARGET_PAGE macros use that rather than being
> constant.
>
> On non-variable pagesize systems target_page is not filled in, and we
> rely on TARGET_PAGE_SIZE being compile time defined.
>
> The problem is that target-independent source files end up calling
> qemu_target_page_size() to read the size, and that function call is
> annoying.
>
> Improve this by always filling in 'target_page' even for non-variable
> size CPUs, and inlining the functions that previously returned
> the macro values (that may have been constant) to return the
> values read from target_page.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
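
For reference, a rough sketch of the direction described above (assuming
the existing TargetPageBits/target_page definitions are reused; this is
not the patch body itself):

    /* always populated at startup, even when the page size is fixed */
    extern const TargetPageBits target_page;

    /* the accessor becomes a header inline instead of a call into
     * another translation unit */
    static inline size_t qemu_target_page_size(void)
    {
        return (size_t)1 << target_page.bits;
    }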




end of thread, other threads:[~2022-05-18  9:00 UTC | newest]

Thread overview: 20+ messages
2022-05-10 22:42 [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads Juan Quintela
2022-05-10 22:42 ` [PATCH v6 01/13] multifd: Document the locking of MultiFD{Send/Recv}Params Juan Quintela
2022-05-16 13:14   ` Dr. David Alan Gilbert
2022-05-18  8:40     ` Juan Quintela
2022-05-10 22:42 ` [PATCH v6 02/13] multifd: Create page_size fields into both MultiFD{Recv, Send}Params Juan Quintela
2022-05-17  8:44   ` [PATCH v6 02/13] multifd: Create page_size fields into both MultiFD{Recv,Send}Params Dr. David Alan Gilbert
2022-05-18  8:48     ` Juan Quintela
2022-05-10 22:42 ` [PATCH v6 03/13] multifd: Create page_count fields into both MultiFD{Recv, Send}Params Juan Quintela
2022-05-10 22:42 ` [PATCH v6 04/13] migration: Export ram_transferred_ram() Juan Quintela
2022-05-10 22:42 ` [PATCH v6 05/13] multifd: Count the number of bytes sent correctly Juan Quintela
2022-05-10 22:42 ` [PATCH v6 06/13] migration: Make ram_save_target_page() a pointer Juan Quintela
2022-05-10 22:42 ` [PATCH v6 07/13] multifd: Make flags field thread local Juan Quintela
2022-05-10 22:42 ` [PATCH v6 08/13] multifd: Prepare to send a packet without the mutex held Juan Quintela
2022-05-10 22:42 ` [PATCH v6 09/13] multifd: Add property to enable/disable zero_page Juan Quintela
2022-05-10 22:42 ` [PATCH v6 10/13] migration: Export ram_release_page() Juan Quintela
2022-05-10 22:42 ` [PATCH v6 11/13] multifd: Support for zero pages transmission Juan Quintela
2022-05-10 22:42 ` [PATCH v6 12/13] multifd: Zero " Juan Quintela
2022-05-10 22:42 ` [PATCH v6 13/13] migration: Use multifd before we check for the zero page Juan Quintela
2022-05-12 13:40 ` [PATCH v6 00/13] Migration: Transmit and detect zero pages in the multifd threads Dr. David Alan Gilbert
2022-05-16 10:45   ` Juan Quintela
