On Thu, Mar 26, 2015 at 06:38:18PM +0100, Paolo Bonzini wrote:
> QEMU is currently accessing the dirty bitmaps very liberally,
> which is understandable since the accesses are cheap.  This is
> however not good for squeezing maximum performance out of dataplane,
> and is also not good if the accesses become more expensive---as is
> the case when they use atomic primitives.
>
> Patches 1-2 make acpi-build.h only use public memory APIs.
>
> Patches 3-7 optimize access to the VGA dirty bitmap, by restricting
> it to video RAM only.
>
> Patches 8-15 optimize access to the code and migration bitmaps,
> by tracking them respectively if TCG is enabled and if migration
> is in progress.  Note that the first iteration of migration already
> does not look at the migration bitmap (commit 70c8652, migration:
> do not search dirty pages in bulk stage, 2013-03-26).
>
> Patches 16-21 are Stefan's patches to convert bitmap access to use
> atomic primitives.
>
> While the main purpose of these patches is a working dirty bitmap
> for dataplane (and possibly multithreaded TCG), there's something
> that they are immediately useful for: patch 22 makes the migration
> thread synchronize the bitmap outside the big QEMU lock, thus
> removing the last source of jitter during the RAM copy phase of
> migration.
>
> Please review and test!  (it's available as branch "atomic-dirty"
> on my github repository)  In particular, I suspect that the
> postcopy patches might be good at finding bugs.
>
> Paolo
>
> Paolo Bonzini (16):
>   memory: add memory_region_ram_resize
>   acpi-build: remove dependency from ram_addr.h
>   memory: the only dirty memory flag for users is DIRTY_MEMORY_VGA
>   display: enable DIRTY_MEMORY_VGA tracking explicitly
>   memory: return bitmap from memory_region_is_logging
>   framebuffer: check memory_region_is_logging
>   ui/console: check memory_region_is_logging
>   memory: track DIRTY_MEMORY_CODE in mr->dirty_log_mask
>   memory: return DIRTY_MEMORY_MIGRATION from memory_region_is_logging
>   ram_addr: tweaks to xen_modified_memory
>   exec: simplify notdirty_mem_write
>   exec: use memory_region_is_logging to optimize dirty tracking
>   exec: pass client mask to cpu_physical_memory_set_dirty_range
>   exec: only check relevant bitmaps for cleanliness
>   memory: do not touch code dirty bitmap unless TCG is enabled
>   migration: run bitmap sync outside iothread lock
>
> Stefan Hajnoczi (6):
>   bitmap: add atomic set functions
>   bitmap: add atomic test and clear
>   memory: use atomic ops for setting dirty memory bits
>   migration: move dirty bitmap sync to ram_addr.h
>   memory: replace cpu_physical_memory_reset_dirty() with test-and-clear
>   memory: make cpu_physical_memory_sync_dirty_bitmap() fully atomic
>
>  arch_init.c                  |  56 +++----------------
>  cputlb.c                     |   4 +-
>  exec.c                       | 103 ++++++++++++++++------------------
>  hw/core/loader.c             |   8 +--
>  hw/display/cg3.c             |   1 +
>  hw/display/exynos4210_fimd.c |   7 ++-
>  hw/display/framebuffer.c     |  23 ++++++--
>  hw/display/g364fb.c          |   2 +-
>  hw/display/sm501.c           |   1 +
>  hw/display/tcx.c             |   1 +
>  hw/display/vmware_vga.c      |   2 +-
>  hw/i386/acpi-build.c         |  36 ++++++------
>  hw/virtio/vhost.c            |   3 +-
>  include/exec/memory.h        |  27 +++++++--
>  include/exec/ram_addr.h      | 128 ++++++++++++++++++++++++++++---------------
>  include/hw/loader.h          |   8 ++-
>  include/qemu/bitmap.h        |   4 ++
>  include/qemu/bitops.h        |  14 +++++
>  kvm-all.c                    |   3 +-
>  memory.c                     |  34 ++++++++----
>  ui/console.c                 |  14 +++--
>  util/bitmap.c                |  78 ++++++++++++++++++++++++++
>  xen-hvm.c                    |   3 +-
>  23 files changed, 356 insertions(+), 204 deletions(-)

Modulo the comments that have already been posted:

Reviewed-by: Stefan Hajnoczi
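
For anyone reading along who wants a feel for what "atomic set" and
"atomic test and clear" on a dirty bitmap mean, here is a minimal sketch.
It is not the code from patches 16-21: every name below is hypothetical,
and it uses plain C11 <stdatomic.h> instead of QEMU's own atomic helpers.
The point is only that a writer can set a bit, and a reader can fetch and
clear it, without a lock and without losing a concurrent update to
another bit in the same word.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Number of bits held by one bitmap word. */
#define SKETCH_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: atomically mark page number nr as dirty. */
static void sketch_set_bit_atomic(atomic_ulong *map, size_t nr)
{
    unsigned long mask = 1UL << (nr % SKETCH_BITS_PER_WORD);

    /* fetch_or is a single read-modify-write, so a concurrent update
     * to a neighbouring bit in the same word cannot be lost. */
    atomic_fetch_or(&map[nr / SKETCH_BITS_PER_WORD], mask);
}

/* Hypothetical helper: atomically clear bits [start, start + count)
 * and report whether any of them was set beforehand. */
static bool sketch_test_and_clear_range(atomic_ulong *map,
                                        size_t start, size_t count)
{
    bool dirty = false;

    for (size_t nr = start; nr < start + count; nr++) {
        unsigned long mask = 1UL << (nr % SKETCH_BITS_PER_WORD);
        /* fetch_and returns the old word, so the test and the clear
         * happen as one indivisible step for this bit. */
        unsigned long old =
            atomic_fetch_and(&map[nr / SKETCH_BITS_PER_WORD], ~mask);

        dirty |= (old & mask) != 0;
    }
    return dirty;
}

/* Single-threaded sanity check of the two helpers above. */
static void sketch_self_check(void)
{
    atomic_ulong map[2] = { 0 };

    sketch_set_bit_atomic(map, 3);
    sketch_set_bit_atomic(map, SKETCH_BITS_PER_WORD + 1);

    /* The first word is reported dirty once, then clean. */
    assert(sketch_test_and_clear_range(map, 0, SKETCH_BITS_PER_WORD));
    assert(!sketch_test_and_clear_range(map, 0, SKETCH_BITS_PER_WORD));

    /* The bit in the second word was not touched by the calls above. */
    assert(sketch_test_and_clear_range(map, SKETCH_BITS_PER_WORD, 2));
}
```

The sketch clears one bit at a time to stay short; a real implementation
would presumably handle whole words where the range allows it.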