* [Qemu-devel] [PATCH v5 0/5] virtio-balloon: free page hint reporting support
@ 2018-03-16 10:48 ` Wei Wang
  0 siblings, 0 replies; 42+ messages in thread
From: Wei Wang @ 2018-03-16 10:48 UTC (permalink / raw)
  To: qemu-devel, virtio-dev, mst, quintela, dgilbert
  Cc: pbonzini, wei.w.wang, liliang.opensource, yang.zhang.wz, quan.xu0,
      nilal, riel

This is the device part implementation of a new feature,
VIRTIO_BALLOON_F_FREE_PAGE_HINT, for the virtio-balloon device. The device
receives guest free page hints from the driver and clears the corresponding
bits in the dirty bitmap, so that those free pages are not transferred by
the migration thread to the destination.

- Test Environment
    Host: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
    Guest: 8G RAM, 4 vCPU
    Migration setup: migrate_set_speed 100G, migrate_set_downtime 2 seconds

- Test Results
    - Idle Guest Live Migration Time (results averaged over 10 runs):
        - Optimization vs. Legacy = 261ms vs. 1769ms --> ~86% reduction
    - Guest with Linux Compilation Workload (make bzImage -j4):
        - Live Migration Time (average)
          Optimization vs. Legacy = 1260ms vs. 2634ms --> ~51% reduction
        - Linux Compilation Time
          Optimization vs. Legacy = 4min58s vs. 5min3s
          --> no obvious difference

- Source Code
    - QEMU:  https://github.com/wei-w-wang/qemu-free-page-lm.git
    - Linux: https://github.com/wei-w-wang/linux-free-page-lm.git

ChangeLog:
v4->v5:
    1) migration:
        - bitmap_clear_dirty: update the dirty bitmap and dirty page count
          under the bitmap mutex, as other functions do;
        - qemu_guest_free_page_hint:
            - add comments for this function;
            - check the !block case;
            - check "offset > block->used_length" before proceeding;
            - assign used_len inside the for{} body;
            - update the dirty bitmap and dirty page counter under the
              bitmap mutex;
        - ram_state_reset:
            - rs->free_page_support: && with "migrate_postcopy" instead of
              migration_in_postcopy;
            - clear the ram_bulk_stage flag if free_page_support is true;
    2) balloon:
        - add in-code usage documentation for balloon_free_page_start and
          balloon_free_page_stop;
        - name the optimization thread "balloon_fpo" to meet the
          "less than 14 characters" requirement;
        - virtio_balloon_poll_free_page_hints:
            - run only while runstate_is_running() is true;
            - add a QEMU spin lock to synchronize accesses to the free page
              reporting related fields shared between the migration thread
              and the optimization thread;
        - virtio_balloon_free_page_start: just return if
          runstate_is_running() is false;
        - virtio_balloon_free_page_stop: access the free page reporting
          related fields under a QEMU spin lock;
        - virtio_balloon_device_unrealize/reset: call
          virtio_balloon_free_page_stop if the free page hint feature is
          used;
        - virtio_balloon_set_status: call virtio_balloon_free_page_stop in
          case the guest is stopped by QMP while the optimization is
          running;
v3->v4:
    1) bitmap: add a new API to count 1s starting from an offset of a bitmap
    2) migration:
        - qemu_guest_free_page_hint: calculate
          ram_state->migration_dirty_pages by counting how many bits of free
          pages are truly cleared. If some of the bits were already 0, they
          shouldn't be deducted from ram_state->migration_dirty_pages. This
          wasn't needed in previous versions, since we optimized the bulk
          stage only, where all bits are guaranteed to be set. It's needed
          now because we extended the usage of this optimization to all
          stages except the last stop&copy stage. From the 2nd stage onward,
          some bits of free pages may already be 0.
    3) virtio-balloon:
        - virtio_balloon_free_page_report_status: introduce a new status,
          FREE_PAGE_REPORT_S_EXIT. This status indicates that the
          optimization thread has exited. FREE_PAGE_REPORT_S_STOP means the
          reporting is stopped, but the optimization thread still needs to
          be joined by the migration thread.
v2->v3:
    1) virtio-balloon
        - virtio_balloon_free_page_start: poll the hints using a new thread;
        - use cmd id in [0x80000000, UINT_MAX];
        - virtio_balloon_poll_free_page_hints:
            - stop the optimization only when it has started;
            - don't skip free pages when !poison_val;
        - add poison_val to vmsd to migrate;
        - virtio_balloon_get_features: add the F_PAGE_POISON feature when
          the host has F_FREE_PAGE_HINT;
        - remove the timer patch, which is no longer needed.
    2) migration
        - new API, qemu_guest_free_page_hint;
        - set rs->free_page_support only in the precopy case;
        - use the new balloon APIs.
v1->v2:
    1) virtio-balloon
        - use subsections to save free_page_report_cmd_id;
        - poll the free page vq after sending a cmd id to the driver;
        - change the free page vq size to VIRTQUEUE_MAX_SIZE;
        - virtio_balloon_poll_free_page_hints: handle the corner case where
          the free page block reported by the driver may cross a RAMBlock
          boundary.
    2) migration/ram.c
        - use balloon_free_page_poll to start the optimization

Wei Wang (5):
  bitmap: bitmap_count_one_with_offset
  migration: use bitmap_mutex in migration_bitmap_clear_dirty
  migration: API to clear bits of guest free pages from the dirty bitmap
  virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
  migration: use the free page hint feature from balloon

 balloon.c                                       |  58 +++++--
 hw/virtio/virtio-balloon.c                      | 217 ++++++++++++++++++++++--
 include/hw/virtio/virtio-balloon.h              |  20 ++-
 include/migration/misc.h                        |   2 +
 include/qemu/bitmap.h                           |  13 ++
 include/standard-headers/linux/virtio_balloon.h |   7 +
 include/sysemu/balloon.h                        |  15 +-
 migration/ram.c                                 |  73 +++++++-
 8 files changed, 375 insertions(+), 30 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 42+ messages in thread
* [Qemu-devel] [PATCH v5 1/5] bitmap: bitmap_count_one_with_offset
  2018-03-16 10:48 ` [virtio-dev] " Wei Wang
@ 2018-03-16 10:48 ` Wei Wang
  -1 siblings, 0 replies; 42+ messages in thread
From: Wei Wang @ 2018-03-16 10:48 UTC (permalink / raw)
  To: qemu-devel, virtio-dev, mst, quintela, dgilbert
  Cc: pbonzini, wei.w.wang, liliang.opensource, yang.zhang.wz, quan.xu0,
      nilal, riel

Count the number of 1s in a bitmap starting from an offset.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 include/qemu/bitmap.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
index 509eedd..e3f31f1 100644
--- a/include/qemu/bitmap.h
+++ b/include/qemu/bitmap.h
@@ -228,6 +228,19 @@ static inline long bitmap_count_one(const unsigned long *bitmap, long nbits)
     }
 }
 
+static inline long bitmap_count_one_with_offset(const unsigned long *bitmap,
+                                                long offset, long nbits)
+{
+    long aligned_offset = QEMU_ALIGN_DOWN(offset, BITS_PER_LONG);
+    long redundant_bits = offset - aligned_offset;
+    long bits_to_count = nbits + redundant_bits;
+    const unsigned long *bitmap_start = bitmap +
+                                        aligned_offset / BITS_PER_LONG;
+
+    return bitmap_count_one(bitmap_start, bits_to_count) -
+           bitmap_count_one(bitmap_start, redundant_bits);
+}
+
 void bitmap_set(unsigned long *map, long i, long len);
 void bitmap_set_atomic(unsigned long *map, long i, long len);
 void bitmap_clear(unsigned long *map, long start, long nr);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread
* [Qemu-devel] [PATCH v5 2/5] migration: use bitmap_mutex in migration_bitmap_clear_dirty
  2018-03-16 10:48 ` [virtio-dev] " Wei Wang
@ 2018-03-16 10:48 ` Wei Wang
  -1 siblings, 0 replies; 42+ messages in thread
From: Wei Wang @ 2018-03-16 10:48 UTC (permalink / raw)
  To: qemu-devel, virtio-dev, mst, quintela, dgilbert
  Cc: pbonzini, wei.w.wang, liliang.opensource, yang.zhang.wz, quan.xu0,
      nilal, riel

The bitmap mutex is used to synchronize threads to update the dirty bitmap
and the migration_dirty_pages count. This patch makes
migration_bitmap_clear_dirty update the bitmap and the count under the
mutex.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
---
 migration/ram.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 7266351..38c991d 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -790,11 +790,14 @@ static inline bool migration_bitmap_clear_dirty(RAMState *rs,
 {
     bool ret;
 
+    qemu_mutex_lock(&rs->bitmap_mutex);
     ret = test_and_clear_bit(page, rb->bmap);
 
     if (ret) {
         rs->migration_dirty_pages--;
     }
+    qemu_mutex_unlock(&rs->bitmap_mutex);
+
     return ret;
 }
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread
* [Qemu-devel] [PATCH v5 3/5] migration: API to clear bits of guest free pages from the dirty bitmap
  2018-03-16 10:48 ` [virtio-dev] " Wei Wang
@ 2018-03-16 10:48 ` Wei Wang
  -1 siblings, 0 replies; 42+ messages in thread
From: Wei Wang @ 2018-03-16 10:48 UTC (permalink / raw)
  To: qemu-devel, virtio-dev, mst, quintela, dgilbert
  Cc: pbonzini, wei.w.wang, liliang.opensource, yang.zhang.wz, quan.xu0,
      nilal, riel

This patch adds an API to clear bits corresponding to guest free pages from
the dirty bitmap. Split the free page block if it crosses the QEMU RAMBlock
boundary.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
---
 include/migration/misc.h |  2 ++
 migration/ram.c          | 44 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/include/migration/misc.h b/include/migration/misc.h
index 77fd4f5..fae1acf 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -14,11 +14,13 @@
 #ifndef MIGRATION_MISC_H
 #define MIGRATION_MISC_H
 
+#include "exec/cpu-common.h"
 #include "qemu/notify.h"
 
 /* migration/ram.c */
 
 void ram_mig_init(void);
+void qemu_guest_free_page_hint(void *addr, size_t len);
 
 /* migration/block.c */
 
diff --git a/migration/ram.c b/migration/ram.c
index 38c991d..2e82181 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2193,6 +2193,50 @@ static int ram_init_all(RAMState **rsp)
 }
 
 /*
+ * This function clears bits of the free pages reported by the caller from the
+ * migration dirty bitmap. @addr is the host address corresponding to the
+ * start of the continuous guest free pages, and @len is the total bytes of
+ * those pages.
+ */
+void qemu_guest_free_page_hint(void *addr, size_t len)
+{
+    RAMBlock *block;
+    ram_addr_t offset;
+    size_t used_len, start, npages;
+
+    for (; len > 0; len -= used_len) {
+        block = qemu_ram_block_from_host(addr, false, &offset);
+        if (unlikely(!block)) {
+            return;
+        }
+
+        /*
+         * This handles the case that the RAMBlock is resized after the free
+         * page hint is reported.
+         */
+        if (unlikely(offset > block->used_length)) {
+            return;
+        }
+
+        if (len <= block->used_length - offset) {
+            used_len = len;
+        } else {
+            used_len = block->used_length - offset;
+            addr += used_len;
+        }
+
+        start = offset >> TARGET_PAGE_BITS;
+        npages = used_len >> TARGET_PAGE_BITS;
+
+        qemu_mutex_lock(&ram_state->bitmap_mutex);
+        ram_state->migration_dirty_pages -=
+                      bitmap_count_one_with_offset(block->bmap, start, npages);
+        bitmap_clear(block->bmap, start, npages);
+        qemu_mutex_unlock(&ram_state->bitmap_mutex);
+    }
+}
+
+/*
  * Each of ram_save_setup, ram_save_iterate and ram_save_complete has
  * long-running RCU critical section.  When rcu-reclaims in the code
  * start to become numerous it will be necessary to reduce the
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread
* [Qemu-devel] [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
  2018-03-16 10:48 ` [virtio-dev] " Wei Wang
@ 2018-03-16 10:48 ` Wei Wang
  -1 siblings, 0 replies; 42+ messages in thread
From: Wei Wang @ 2018-03-16 10:48 UTC (permalink / raw)
  To: qemu-devel, virtio-dev, mst, quintela, dgilbert
  Cc: pbonzini, wei.w.wang, liliang.opensource, yang.zhang.wz, quan.xu0,
      nilal, riel

The new feature enables the virtio-balloon device to receive hints of
guest free pages from the free page vq.

balloon_free_page_start - start guest free page hint reporting.
balloon_free_page_stop  - stop guest free page hint reporting.

Note: the balloon will report pages which were free at the time of this
call. As the reporting happens asynchronously, dirty bit logging must be
enabled before this call is made. Guest reporting must be disabled before
the migration dirty bitmap is synchronized.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Liang Li <liang.z.li@intel.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
---
 balloon.c                                       |  58 +++++--
 hw/virtio/virtio-balloon.c                      | 217 ++++++++++++++++++++++--
 include/hw/virtio/virtio-balloon.h              |  20 ++-
 include/standard-headers/linux/virtio_balloon.h |   7 +
 include/sysemu/balloon.h                        |  15 +-
 5 files changed, 288 insertions(+), 29 deletions(-)

diff --git a/balloon.c b/balloon.c
index 6bf0a96..87a0410 100644
--- a/balloon.c
+++ b/balloon.c
@@ -36,6 +36,9 @@
 
 static QEMUBalloonEvent *balloon_event_fn;
 static QEMUBalloonStatus *balloon_stat_fn;
+static QEMUBalloonFreePageSupport *balloon_free_page_support_fn;
+static QEMUBalloonFreePageStart *balloon_free_page_start_fn;
+static QEMUBalloonFreePageStop *balloon_free_page_stop_fn;
 static void *balloon_opaque;
 static bool balloon_inhibited;
 
@@ -64,19 +67,51 @@ static bool have_balloon(Error **errp)
     return true;
 }
 
-int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
-                             QEMUBalloonStatus *stat_func, void *opaque)
+bool balloon_free_page_support(void)
 {
-    if (balloon_event_fn || balloon_stat_fn || balloon_opaque) {
-        /* We're already registered one balloon handler.  How many can
-         * a guest really have?
-         */
-        return -1;
+    return balloon_free_page_support_fn &&
+           balloon_free_page_support_fn(balloon_opaque);
+}
+
+/*
+ * Balloon will report pages which were free at the time of this call. As the
+ * reporting happens asynchronously, dirty bit logging must be enabled before
+ * this call is made.
+ */
+void balloon_free_page_start(void)
+{
+    balloon_free_page_start_fn(balloon_opaque);
+}
+
+/*
+ * Guest reporting must be disabled before the migration dirty bitmap is
+ * synchronized.
+ */
+void balloon_free_page_stop(void)
+{
+    balloon_free_page_stop_fn(balloon_opaque);
+}
+
+void qemu_add_balloon_handler(QEMUBalloonEvent *event_fn,
+                              QEMUBalloonStatus *stat_fn,
+                              QEMUBalloonFreePageSupport *free_page_support_fn,
+                              QEMUBalloonFreePageStart *free_page_start_fn,
+                              QEMUBalloonFreePageStop *free_page_stop_fn,
+                              void *opaque)
+{
+    if (balloon_event_fn || balloon_stat_fn || balloon_free_page_support_fn ||
+        balloon_free_page_start_fn || balloon_free_page_stop_fn ||
+        balloon_opaque) {
+        /* We already registered one balloon handler. */
+        return;
     }
-    balloon_event_fn = event_func;
-    balloon_stat_fn = stat_func;
+
+    balloon_event_fn = event_fn;
+    balloon_stat_fn = stat_fn;
+    balloon_free_page_support_fn = free_page_support_fn;
+    balloon_free_page_start_fn = free_page_start_fn;
+    balloon_free_page_stop_fn = free_page_stop_fn;
     balloon_opaque = opaque;
-    return 0;
 }
 
 void qemu_remove_balloon_handler(void *opaque)
@@ -86,6 +121,9 @@ void qemu_remove_balloon_handler(void *opaque)
     }
     balloon_event_fn = NULL;
     balloon_stat_fn = NULL;
+    balloon_free_page_support_fn = NULL;
+    balloon_free_page_start_fn = NULL;
+    balloon_free_page_stop_fn = NULL;
     balloon_opaque = NULL;
 }
 
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index f456cea..30c7504 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -31,6 +31,7 @@
 
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
+#include "migration/misc.h"
 
 #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
 
@@ -308,6 +309,127 @@ out:
     }
 }
 
+static void *virtio_balloon_poll_free_page_hints(void *opaque)
+{
+    VirtQueueElement *elem;
+    VirtIOBalloon *dev = opaque;
+    VirtQueue *vq = dev->free_page_vq;
+    uint32_t id;
+    size_t size;
+
+    /* The optimization thread runs only when the guest is running. */
+    while (runstate_is_running()) {
+        qemu_spin_lock(&dev->free_page_lock);
+        /*
+         * If the migration thread actively stops the reporting, exit
+         * immediately.
+         */
+        if (dev->free_page_report_status >= FREE_PAGE_REPORT_S_STOP) {
+            qemu_spin_unlock(&dev->free_page_lock);
+            break;
+        }
+
+        elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
+        if (!elem) {
+            qemu_spin_unlock(&dev->free_page_lock);
+            continue;
+        }
+
+        if (elem->out_num) {
+            size = iov_to_buf(elem->out_sg, elem->out_num, 0, &id, sizeof(id));
+            virtqueue_push(vq, elem, size);
+            g_free(elem);
+            if (unlikely(size != sizeof(id))) {
+                warn_report("%s: received an incorrect cmd id", __func__);
+                qemu_spin_unlock(&dev->free_page_lock);
+                break;
+            }
+            if (id == dev->free_page_report_cmd_id) {
+                dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
+            } else if (dev->free_page_report_status ==
+                       FREE_PAGE_REPORT_S_START) {
+                /*
+                 * Stop the optimization only when it has started. This avoids
+                 * a stale stop sign for the previous command.
+                 */
+                dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
+                qemu_spin_unlock(&dev->free_page_lock);
+                break;
+            }
+        }
+
+        if (elem->in_num) {
+            if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START &&
+                !dev->poison_val) {
+                qemu_guest_free_page_hint(elem->in_sg[0].iov_base,
+                                          elem->in_sg[0].iov_len);
+            }
+            virtqueue_push(vq, elem, 0);
+            g_free(elem);
+        }
+        qemu_spin_unlock(&dev->free_page_lock);
+    }
+    return NULL;
+}
+
+static bool virtio_balloon_free_page_support(void *opaque)
+{
+    VirtIOBalloon *s = opaque;
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+    return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT);
+}
+
+static void virtio_balloon_free_page_start(void *opaque)
+{
+    VirtIOBalloon *s = opaque;
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+    if (!runstate_is_running()) {
+        return;
+    }
+
+    if (unlikely(s->free_page_report_cmd_id == UINT_MAX)) {
+        s->free_page_report_cmd_id =
+                       VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN;
+    } else {
+        s->free_page_report_cmd_id++;
+    }
+
+    s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED;
+    virtio_notify_config(vdev);
+    qemu_thread_create(&s->free_page_thread, "balloon_fpo",
+                       virtio_balloon_poll_free_page_hints, s,
+                       QEMU_THREAD_JOINABLE);
+}
+
+static void virtio_balloon_free_page_stop(void *opaque)
+{
+    VirtIOBalloon *s = opaque;
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+    qemu_spin_lock(&s->free_page_lock);
+    switch (s->free_page_report_status) {
+    case FREE_PAGE_REPORT_S_REQUESTED:
+    case FREE_PAGE_REPORT_S_START:
+        /*
+         * The guest hasn't done the reporting, so host sends a notification
+         * to the guest to actively stop the reporting before joining the
+         * optimization thread.
+         */
+        s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
+        virtio_notify_config(vdev);
+        /* fall through */
+    case FREE_PAGE_REPORT_S_STOP:
+        /* The guest has stopped the reporting. Join the optimization thread */
+        qemu_thread_join(&s->free_page_thread);
+        s->free_page_report_status = FREE_PAGE_REPORT_S_EXIT;
+        /* fall through */
+    case FREE_PAGE_REPORT_S_EXIT:
+        /* The optimization thread has gone. No further actions needed. */
+        break;
+    }
+    qemu_spin_unlock(&s->free_page_lock);
+}
+
 static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
 {
     VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
@@ -315,6 +437,15 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
 
     config.num_pages = cpu_to_le32(dev->num_pages);
     config.actual = cpu_to_le32(dev->actual);
+    config.poison_val = cpu_to_le32(dev->poison_val);
+
+    if (dev->free_page_report_status >= FREE_PAGE_REPORT_S_STOP) {
+        config.free_page_report_cmd_id =
+                       cpu_to_le32(VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID);
+    } else {
+        config.free_page_report_cmd_id =
+                       cpu_to_le32(dev->free_page_report_cmd_id);
+    }
 
     trace_virtio_balloon_get_config(config.num_pages, config.actual);
     memcpy(config_data, &config, sizeof(struct virtio_balloon_config));
@@ -368,6 +499,7 @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
                        ((ram_addr_t) dev->actual << VIRTIO_BALLOON_PFN_SHIFT),
                        &error_abort);
     }
+    dev->poison_val = le32_to_cpu(config.poison_val);
     trace_virtio_balloon_set_config(dev->actual, oldactual);
 }
 
@@ -377,6 +509,11 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
     VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
     f |= dev->host_features;
     virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ);
+
+    if (dev->host_features & 1ULL << VIRTIO_BALLOON_F_FREE_PAGE_HINT) {
+        virtio_add_feature(&f, VIRTIO_BALLOON_F_PAGE_POISON);
+    }
+
     return f;
 }
 
@@ -413,6 +550,18 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
     return 0;
 }
 
+static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
+    .name = "virtio-balloon-device/free-page-report",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = virtio_balloon_free_page_support,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
+        VMSTATE_UINT32(poison_val, VirtIOBalloon),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static const VMStateDescription vmstate_virtio_balloon_device = {
     .name = "virtio-balloon-device",
     .version_id = 1,
@@ -423,30 +572,31 @@ static const VMStateDescription vmstate_virtio_balloon_device = {
         VMSTATE_UINT32(actual, VirtIOBalloon),
         VMSTATE_END_OF_LIST()
     },
+    .subsections = (const VMStateDescription * []) {
+        &vmstate_virtio_balloon_free_page_report,
+        NULL
+    }
 };
 
 static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
     VirtIOBalloon *s = VIRTIO_BALLOON(dev);
-    int ret;
 
     virtio_init(vdev, "virtio-balloon", VIRTIO_ID_BALLOON,
                 sizeof(struct virtio_balloon_config));
 
-    ret = qemu_add_balloon_handler(virtio_balloon_to_target,
-                                   virtio_balloon_stat, s);
-
-    if (ret < 0) {
-        error_setg(errp, "Only one balloon device is supported");
-        virtio_cleanup(vdev);
-        return;
-    }
-
     s->ivq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
     s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
     s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats);
-
+    if (virtio_has_feature(s->host_features,
+                           VIRTIO_BALLOON_F_FREE_PAGE_HINT))
{ + s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE, NULL); + s->free_page_report_status = FREE_PAGE_REPORT_S_EXIT; + s->free_page_report_cmd_id = + VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN - 1; + qemu_spin_init(&s->free_page_lock); + } reset_stats(s); } @@ -455,6 +605,9 @@ static void virtio_balloon_device_unrealize(DeviceState *dev, Error **errp) VirtIODevice *vdev = VIRTIO_DEVICE(dev); VirtIOBalloon *s = VIRTIO_BALLOON(dev); + if (virtio_balloon_free_page_support(s)) { + virtio_balloon_free_page_stop(s); + } balloon_stats_destroy_timer(s); qemu_remove_balloon_handler(s); virtio_cleanup(vdev); @@ -464,6 +617,10 @@ static void virtio_balloon_device_reset(VirtIODevice *vdev) { VirtIOBalloon *s = VIRTIO_BALLOON(vdev); + if (virtio_balloon_free_page_support(s)) { + virtio_balloon_free_page_stop(s); + } + if (s->stats_vq_elem != NULL) { virtqueue_unpop(s->svq, s->stats_vq_elem, 0); g_free(s->stats_vq_elem); @@ -475,11 +632,37 @@ static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status) { VirtIOBalloon *s = VIRTIO_BALLOON(vdev); - if (!s->stats_vq_elem && vdev->vm_running && - (status & VIRTIO_CONFIG_S_DRIVER_OK) && virtqueue_rewind(s->svq, 1)) { - /* poll stats queue for the element we have discarded when the VM - * was stopped */ - virtio_balloon_receive_stats(vdev, s->svq); + if (status & VIRTIO_CONFIG_S_DRIVER_OK) { + if (!s->stats_vq_elem && vdev->vm_running && + virtqueue_rewind(s->svq, 1)) { + /* + * Poll stats queue for the element we have discarded when the VM + * was stopped. + */ + virtio_balloon_receive_stats(vdev, s->svq); + } + + if (virtio_balloon_free_page_support(s)) { + qemu_add_balloon_handler(virtio_balloon_to_target, + virtio_balloon_stat, + virtio_balloon_free_page_support, + virtio_balloon_free_page_start, + virtio_balloon_free_page_stop, + s); + /* + * This handles the case that the guest is being stopped (e.g. by + * qmp commands) while the driver is still reporting hints. 
When + * the guest is woken up, it will continue to report hints, which + * are not needed. So when the wakeup notifier invokes the + * set_status callback here, we get the chance to make sure that + * the free page optimization thread is exited via + * virtio_balloon_free_page_stop. + */ + virtio_balloon_free_page_stop(s); + } else { + qemu_add_balloon_handler(virtio_balloon_to_target, + virtio_balloon_stat, NULL, NULL, NULL, s); + } } } @@ -509,6 +692,8 @@ static const VMStateDescription vmstate_virtio_balloon = { static Property virtio_balloon_properties[] = { DEFINE_PROP_BIT("deflate-on-oom", VirtIOBalloon, host_features, VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false), + DEFINE_PROP_BIT("free-page-hint", VirtIOBalloon, host_features, + VIRTIO_BALLOON_F_FREE_PAGE_HINT, false), DEFINE_PROP_END_OF_LIST(), }; diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h index 1ea13bd..cfdba37 100644 --- a/include/hw/virtio/virtio-balloon.h +++ b/include/hw/virtio/virtio-balloon.h @@ -23,6 +23,8 @@ #define VIRTIO_BALLOON(obj) \ OBJECT_CHECK(VirtIOBalloon, (obj), TYPE_VIRTIO_BALLOON) +#define VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN 0x80000000 + typedef struct virtio_balloon_stat VirtIOBalloonStat; typedef struct virtio_balloon_stat_modern { @@ -31,15 +33,31 @@ typedef struct virtio_balloon_stat_modern { uint64_t val; } VirtIOBalloonStatModern; +enum virtio_balloon_free_page_report_status { + FREE_PAGE_REPORT_S_REQUESTED, + FREE_PAGE_REPORT_S_START, + FREE_PAGE_REPORT_S_STOP, + FREE_PAGE_REPORT_S_EXIT, +}; + typedef struct VirtIOBalloon { VirtIODevice parent_obj; - VirtQueue *ivq, *dvq, *svq; + VirtQueue *ivq, *dvq, *svq, *free_page_vq; + uint32_t free_page_report_status; uint32_t num_pages; uint32_t actual; + uint32_t free_page_report_cmd_id; + uint32_t poison_val; uint64_t stats[VIRTIO_BALLOON_S_NR]; VirtQueueElement *stats_vq_elem; size_t stats_vq_offset; QEMUTimer *stats_timer; + QemuThread free_page_thread; + /* + * Lock to synchronize threads 
to access the free page reporting related + * fields (e.g. free_page_report_status). + */ + QemuSpin free_page_lock; int64_t stats_last_update; int64_t stats_poll_interval; uint32_t host_features; diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h index 7b0a41b..f89e80f 100644 --- a/include/standard-headers/linux/virtio_balloon.h +++ b/include/standard-headers/linux/virtio_balloon.h @@ -34,15 +34,22 @@ #define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */ #define VIRTIO_BALLOON_F_STATS_VQ 1 /* Memory Stats virtqueue */ #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */ +#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */ +#define VIRTIO_BALLOON_F_PAGE_POISON 4 /* Guest is using page poisoning */ /* Size of a PFN in the balloon interface. */ #define VIRTIO_BALLOON_PFN_SHIFT 12 +#define VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID 0 struct virtio_balloon_config { /* Number of pages host wants Guest to give up. */ uint32_t num_pages; /* Number of pages we've actually got in balloon. 
*/ uint32_t actual; + /* Free page report command id, readonly by guest */ + uint32_t free_page_report_cmd_id; + /* Stores PAGE_POISON if page poisoning is in use */ + uint32_t poison_val; }; #define VIRTIO_BALLOON_S_SWAP_IN 0 /* Amount of memory swapped in */ diff --git a/include/sysemu/balloon.h b/include/sysemu/balloon.h index 66543ae..6561a08 100644 --- a/include/sysemu/balloon.h +++ b/include/sysemu/balloon.h @@ -18,11 +18,22 @@ typedef void (QEMUBalloonEvent)(void *opaque, ram_addr_t target); typedef void (QEMUBalloonStatus)(void *opaque, BalloonInfo *info); +typedef bool (QEMUBalloonFreePageSupport)(void *opaque); +typedef void (QEMUBalloonFreePageStart)(void *opaque); +typedef void (QEMUBalloonFreePageStop)(void *opaque); -int qemu_add_balloon_handler(QEMUBalloonEvent *event_func, - QEMUBalloonStatus *stat_func, void *opaque); void qemu_remove_balloon_handler(void *opaque); bool qemu_balloon_is_inhibited(void); void qemu_balloon_inhibit(bool state); +bool balloon_free_page_support(void); +void balloon_free_page_start(void); +void balloon_free_page_stop(void); + +void qemu_add_balloon_handler(QEMUBalloonEvent *event_fn, + QEMUBalloonStatus *stat_fn, + QEMUBalloonFreePageSupport *free_page_support_fn, + QEMUBalloonFreePageStart *free_page_start_fn, + QEMUBalloonFreePageStop *free_page_stop_fn, + void *opaque); #endif -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 42+ messages in thread
* [virtio-dev] [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT @ 2018-03-16 10:48 ` Wei Wang 0 siblings, 0 replies; 42+ messages in thread From: Wei Wang @ 2018-03-16 10:48 UTC (permalink / raw) To: qemu-devel, virtio-dev, mst, quintela, dgilbert Cc: pbonzini, wei.w.wang, liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel The new feature enables the virtio-balloon device to receive hints of guest free pages from the free page vq. balloon_free_page_start - start guest free page hint reporting. balloon_free_page_stop - stop guest free page hint reporting. Note: balloon will report pages which were free at the time of this call. As the reporting happens asynchronously, dirty bit logging must be enabled before this call is made. Guest reporting must be disabled before the migration dirty bitmap is synchronized. Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Liang Li <liang.z.li@intel.com> CC: Michael S. Tsirkin <mst@redhat.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> CC: Juan Quintela <quintela@redhat.com> --- balloon.c | 58 +++++-- hw/virtio/virtio-balloon.c | 217 ++++++++++++++++++++++-- include/hw/virtio/virtio-balloon.h | 20 ++- include/standard-headers/linux/virtio_balloon.h | 7 + include/sysemu/balloon.h | 15 +- 5 files changed, 288 insertions(+), 29 deletions(-) diff --git a/balloon.c b/balloon.c index 6bf0a96..87a0410 100644 --- a/balloon.c +++ b/balloon.c @@ -36,6 +36,9 @@ static QEMUBalloonEvent *balloon_event_fn; static QEMUBalloonStatus *balloon_stat_fn; +static QEMUBalloonFreePageSupport *balloon_free_page_support_fn; +static QEMUBalloonFreePageStart *balloon_free_page_start_fn; +static QEMUBalloonFreePageStop *balloon_free_page_stop_fn; static void *balloon_opaque; static bool balloon_inhibited; @@ -64,19 +67,51 @@ static bool have_balloon(Error **errp) return true; } -int qemu_add_balloon_handler(QEMUBalloonEvent *event_func, - QEMUBalloonStatus *stat_func, void *opaque) +bool balloon_free_page_support(void) { 
- if (balloon_event_fn || balloon_stat_fn || balloon_opaque) { - /* We're already registered one balloon handler. How many can - * a guest really have? - */ - return -1; + return balloon_free_page_support_fn && + balloon_free_page_support_fn(balloon_opaque); +} + +/* + * Balloon will report pages which were free at the time of this call. As the + * reporting happens asynchronously, dirty bit logging must be enabled before + * this call is made. + */ +void balloon_free_page_start(void) +{ + balloon_free_page_start_fn(balloon_opaque); +} + +/* + * Guest reporting must be disabled before the migration dirty bitmap is + * synchronized. + */ +void balloon_free_page_stop(void) +{ + balloon_free_page_stop_fn(balloon_opaque); +} + +void qemu_add_balloon_handler(QEMUBalloonEvent *event_fn, + QEMUBalloonStatus *stat_fn, + QEMUBalloonFreePageSupport *free_page_support_fn, + QEMUBalloonFreePageStart *free_page_start_fn, + QEMUBalloonFreePageStop *free_page_stop_fn, + void *opaque) +{ + if (balloon_event_fn || balloon_stat_fn || balloon_free_page_support_fn || + balloon_free_page_start_fn || balloon_free_page_stop_fn || + balloon_opaque) { + /* We already registered one balloon handler. 
*/ + return; } - balloon_event_fn = event_func; - balloon_stat_fn = stat_func; + + balloon_event_fn = event_fn; + balloon_stat_fn = stat_fn; + balloon_free_page_support_fn = free_page_support_fn; + balloon_free_page_start_fn = free_page_start_fn; + balloon_free_page_stop_fn = free_page_stop_fn; balloon_opaque = opaque; - return 0; } void qemu_remove_balloon_handler(void *opaque) @@ -86,6 +121,9 @@ void qemu_remove_balloon_handler(void *opaque) } balloon_event_fn = NULL; balloon_stat_fn = NULL; + balloon_free_page_support_fn = NULL; + balloon_free_page_start_fn = NULL; + balloon_free_page_stop_fn = NULL; balloon_opaque = NULL; } diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c index f456cea..30c7504 100644 --- a/hw/virtio/virtio-balloon.c +++ b/hw/virtio/virtio-balloon.c @@ -31,6 +31,7 @@ #include "hw/virtio/virtio-bus.h" #include "hw/virtio/virtio-access.h" +#include "migration/misc.h" #define BALLOON_PAGE_SIZE (1 << VIRTIO_BALLOON_PFN_SHIFT) @@ -308,6 +309,127 @@ out: } } +static void *virtio_balloon_poll_free_page_hints(void *opaque) +{ + VirtQueueElement *elem; + VirtIOBalloon *dev = opaque; + VirtQueue *vq = dev->free_page_vq; + uint32_t id; + size_t size; + + /* The optimization thread runs only when the guest is running. */ + while (runstate_is_running()) { + qemu_spin_lock(&dev->free_page_lock); + /* + * If the migration thread actively stops the reporting, exit + * immediately. 
+ */ + if (dev->free_page_report_status >= FREE_PAGE_REPORT_S_STOP) { + qemu_spin_unlock(&dev->free_page_lock); + break; + } + + elem = virtqueue_pop(vq, sizeof(VirtQueueElement)); + if (!elem) { + qemu_spin_unlock(&dev->free_page_lock); + continue; + } + + if (elem->out_num) { + size = iov_to_buf(elem->out_sg, elem->out_num, 0, &id, sizeof(id)); + virtqueue_push(vq, elem, size); + g_free(elem); + if (unlikely(size != sizeof(id))) { + warn_report("%s: received an incorrect cmd id", __func__); + qemu_spin_unlock(&dev->free_page_lock); + break; + } + if (id == dev->free_page_report_cmd_id) { + dev->free_page_report_status = FREE_PAGE_REPORT_S_START; + } else if (dev->free_page_report_status == + FREE_PAGE_REPORT_S_START) { + /* + * Stop the optimization only when it has started. This avoids + * a stale stop sign for the previous command. + */ + dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP; + qemu_spin_unlock(&dev->free_page_lock); + break; + } + } + + if (elem->in_num) { + if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START && + !dev->poison_val) { + qemu_guest_free_page_hint(elem->in_sg[0].iov_base, + elem->in_sg[0].iov_len); + } + virtqueue_push(vq, elem, 0); + g_free(elem); + } + qemu_spin_unlock(&dev->free_page_lock); + } + return NULL; +} + +static bool virtio_balloon_free_page_support(void *opaque) +{ + VirtIOBalloon *s = opaque; + VirtIODevice *vdev = VIRTIO_DEVICE(s); + + return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT); +} + +static void virtio_balloon_free_page_start(void *opaque) +{ + VirtIOBalloon *s = opaque; + VirtIODevice *vdev = VIRTIO_DEVICE(s); + + if (!runstate_is_running()) { + return; + } + + if (unlikely(s->free_page_report_cmd_id == UINT_MAX)) { + s->free_page_report_cmd_id = + VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN; + } else { + s->free_page_report_cmd_id++; + } + + s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED; + virtio_notify_config(vdev); + qemu_thread_create(&s->free_page_thread, 
"balloon_fpo", + virtio_balloon_poll_free_page_hints, s, + QEMU_THREAD_JOINABLE); +} + +static void virtio_balloon_free_page_stop(void *opaque) +{ + VirtIOBalloon *s = opaque; + VirtIODevice *vdev = VIRTIO_DEVICE(s); + + qemu_spin_lock(&s->free_page_lock); + switch (s->free_page_report_status) { + case FREE_PAGE_REPORT_S_REQUESTED: + case FREE_PAGE_REPORT_S_START: + /* + * The guest hasn't done the reporting, so host sends a notification + * to the guest to actively stop the reporting before joining the + * optimization thread. + */ + s->free_page_report_status = FREE_PAGE_REPORT_S_STOP; + virtio_notify_config(vdev); + case FREE_PAGE_REPORT_S_STOP: + /* The guest has stopped the reporting. Join the optimization thread */ + qemu_thread_join(&s->free_page_thread); + s->free_page_report_status = FREE_PAGE_REPORT_S_EXIT; + case FREE_PAGE_REPORT_S_EXIT: + /* The optimization thread has gone. No further actions needded. */ + break; + } + qemu_spin_unlock(&s->free_page_lock); +} + static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data) { VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); @@ -315,6 +437,15 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data) config.num_pages = cpu_to_le32(dev->num_pages); config.actual = cpu_to_le32(dev->actual); + config.poison_val = cpu_to_le32(dev->poison_val); + + if (dev->free_page_report_status >= FREE_PAGE_REPORT_S_STOP) { + config.free_page_report_cmd_id = + cpu_to_le32(VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID); + } else { + config.free_page_report_cmd_id = + cpu_to_le32(dev->free_page_report_cmd_id); + } trace_virtio_balloon_get_config(config.num_pages, config.actual); memcpy(config_data, &config, sizeof(struct virtio_balloon_config)); @@ -368,6 +499,7 @@ static void virtio_balloon_set_config(VirtIODevice *vdev, ((ram_addr_t) dev->actual << VIRTIO_BALLOON_PFN_SHIFT), &error_abort); } + dev->poison_val = le32_to_cpu(config.poison_val); trace_virtio_balloon_set_config(dev->actual, 
oldactual); } @@ -377,6 +509,11 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f, VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); f |= dev->host_features; virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ); + + if (dev->host_features & 1ULL << VIRTIO_BALLOON_F_FREE_PAGE_HINT) { + virtio_add_feature(&f, VIRTIO_BALLOON_F_PAGE_POISON); + } + return f; } @@ -413,6 +550,18 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id) return 0; } +static const VMStateDescription vmstate_virtio_balloon_free_page_report = { + .name = "virtio-balloon-device/free-page-report", + .version_id = 1, + .minimum_version_id = 1, + .needed = virtio_balloon_free_page_support, + .fields = (VMStateField[]) { + VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon), + VMSTATE_UINT32(poison_val, VirtIOBalloon), + VMSTATE_END_OF_LIST() + } +}; + static const VMStateDescription vmstate_virtio_balloon_device = { .name = "virtio-balloon-device", .version_id = 1, @@ -423,30 +572,31 @@ static const VMStateDescription vmstate_virtio_balloon_device = { VMSTATE_UINT32(actual, VirtIOBalloon), VMSTATE_END_OF_LIST() }, + .subsections = (const VMStateDescription * []) { + &vmstate_virtio_balloon_free_page_report, + NULL + } }; static void virtio_balloon_device_realize(DeviceState *dev, Error **errp) { VirtIODevice *vdev = VIRTIO_DEVICE(dev); VirtIOBalloon *s = VIRTIO_BALLOON(dev); - int ret; virtio_init(vdev, "virtio-balloon", VIRTIO_ID_BALLOON, sizeof(struct virtio_balloon_config)); - ret = qemu_add_balloon_handler(virtio_balloon_to_target, - virtio_balloon_stat, s); - - if (ret < 0) { - error_setg(errp, "Only one balloon device is supported"); - virtio_cleanup(vdev); - return; - } - s->ivq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output); s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output); s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats); - + if (virtio_has_feature(s->host_features, + VIRTIO_BALLOON_F_FREE_PAGE_HINT)) 
{ + s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE, NULL); + s->free_page_report_status = FREE_PAGE_REPORT_S_EXIT; + s->free_page_report_cmd_id = + VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN - 1; + qemu_spin_init(&s->free_page_lock); + } reset_stats(s); } @@ -455,6 +605,9 @@ static void virtio_balloon_device_unrealize(DeviceState *dev, Error **errp) VirtIODevice *vdev = VIRTIO_DEVICE(dev); VirtIOBalloon *s = VIRTIO_BALLOON(dev); + if (virtio_balloon_free_page_support(s)) { + virtio_balloon_free_page_stop(s); + } balloon_stats_destroy_timer(s); qemu_remove_balloon_handler(s); virtio_cleanup(vdev); @@ -464,6 +617,10 @@ static void virtio_balloon_device_reset(VirtIODevice *vdev) { VirtIOBalloon *s = VIRTIO_BALLOON(vdev); + if (virtio_balloon_free_page_support(s)) { + virtio_balloon_free_page_stop(s); + } + if (s->stats_vq_elem != NULL) { virtqueue_unpop(s->svq, s->stats_vq_elem, 0); g_free(s->stats_vq_elem); @@ -475,11 +632,37 @@ static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status) { VirtIOBalloon *s = VIRTIO_BALLOON(vdev); - if (!s->stats_vq_elem && vdev->vm_running && - (status & VIRTIO_CONFIG_S_DRIVER_OK) && virtqueue_rewind(s->svq, 1)) { - /* poll stats queue for the element we have discarded when the VM - * was stopped */ - virtio_balloon_receive_stats(vdev, s->svq); + if (status & VIRTIO_CONFIG_S_DRIVER_OK) { + if (!s->stats_vq_elem && vdev->vm_running && + virtqueue_rewind(s->svq, 1)) { + /* + * Poll stats queue for the element we have discarded when the VM + * was stopped. + */ + virtio_balloon_receive_stats(vdev, s->svq); + } + + if (virtio_balloon_free_page_support(s)) { + qemu_add_balloon_handler(virtio_balloon_to_target, + virtio_balloon_stat, + virtio_balloon_free_page_support, + virtio_balloon_free_page_start, + virtio_balloon_free_page_stop, + s); + /* + * This handles the case that the guest is being stopped (e.g. by + * qmp commands) while the driver is still reporting hints. 
When + * the guest is woken up, it will continue to report hints, which + * are not needed. So when the wakeup notifier invokes the + * set_status callback here, we get the chance to make sure that + * the free page optimization thread is exited via + * virtio_balloon_free_page_stop. + */ + virtio_balloon_free_page_stop(s); + } else { + qemu_add_balloon_handler(virtio_balloon_to_target, + virtio_balloon_stat, NULL, NULL, NULL, s); + } } } @@ -509,6 +692,8 @@ static const VMStateDescription vmstate_virtio_balloon = { static Property virtio_balloon_properties[] = { DEFINE_PROP_BIT("deflate-on-oom", VirtIOBalloon, host_features, VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false), + DEFINE_PROP_BIT("free-page-hint", VirtIOBalloon, host_features, + VIRTIO_BALLOON_F_FREE_PAGE_HINT, false), DEFINE_PROP_END_OF_LIST(), }; diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h index 1ea13bd..cfdba37 100644 --- a/include/hw/virtio/virtio-balloon.h +++ b/include/hw/virtio/virtio-balloon.h @@ -23,6 +23,8 @@ #define VIRTIO_BALLOON(obj) \ OBJECT_CHECK(VirtIOBalloon, (obj), TYPE_VIRTIO_BALLOON) +#define VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN 0x80000000 + typedef struct virtio_balloon_stat VirtIOBalloonStat; typedef struct virtio_balloon_stat_modern { @@ -31,15 +33,31 @@ typedef struct virtio_balloon_stat_modern { uint64_t val; } VirtIOBalloonStatModern; +enum virtio_balloon_free_page_report_status { + FREE_PAGE_REPORT_S_REQUESTED, + FREE_PAGE_REPORT_S_START, + FREE_PAGE_REPORT_S_STOP, + FREE_PAGE_REPORT_S_EXIT, +}; + typedef struct VirtIOBalloon { VirtIODevice parent_obj; - VirtQueue *ivq, *dvq, *svq; + VirtQueue *ivq, *dvq, *svq, *free_page_vq; + uint32_t free_page_report_status; uint32_t num_pages; uint32_t actual; + uint32_t free_page_report_cmd_id; + uint32_t poison_val; uint64_t stats[VIRTIO_BALLOON_S_NR]; VirtQueueElement *stats_vq_elem; size_t stats_vq_offset; QEMUTimer *stats_timer; + QemuThread free_page_thread; + /* + * Lock to synchronize threads 
to access the free page reporting related + * fields (e.g. free_page_report_status). + */ + QemuSpin free_page_lock; int64_t stats_last_update; int64_t stats_poll_interval; uint32_t host_features; diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h index 7b0a41b..f89e80f 100644 --- a/include/standard-headers/linux/virtio_balloon.h +++ b/include/standard-headers/linux/virtio_balloon.h @@ -34,15 +34,22 @@ #define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */ #define VIRTIO_BALLOON_F_STATS_VQ 1 /* Memory Stats virtqueue */ #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */ +#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */ +#define VIRTIO_BALLOON_F_PAGE_POISON 4 /* Guest is using page poisoning */ /* Size of a PFN in the balloon interface. */ #define VIRTIO_BALLOON_PFN_SHIFT 12 +#define VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID 0 struct virtio_balloon_config { /* Number of pages host wants Guest to give up. */ uint32_t num_pages; /* Number of pages we've actually got in balloon. 
*/ uint32_t actual; + /* Free page report command id, readonly by guest */ + uint32_t free_page_report_cmd_id; + /* Stores PAGE_POISON if page poisoning is in use */ + uint32_t poison_val; }; #define VIRTIO_BALLOON_S_SWAP_IN 0 /* Amount of memory swapped in */ diff --git a/include/sysemu/balloon.h b/include/sysemu/balloon.h index 66543ae..6561a08 100644 --- a/include/sysemu/balloon.h +++ b/include/sysemu/balloon.h @@ -18,11 +18,22 @@ typedef void (QEMUBalloonEvent)(void *opaque, ram_addr_t target); typedef void (QEMUBalloonStatus)(void *opaque, BalloonInfo *info); +typedef bool (QEMUBalloonFreePageSupport)(void *opaque); +typedef void (QEMUBalloonFreePageStart)(void *opaque); +typedef void (QEMUBalloonFreePageStop)(void *opaque); -int qemu_add_balloon_handler(QEMUBalloonEvent *event_func, - QEMUBalloonStatus *stat_func, void *opaque); void qemu_remove_balloon_handler(void *opaque); bool qemu_balloon_is_inhibited(void); void qemu_balloon_inhibit(bool state); +bool balloon_free_page_support(void); +void balloon_free_page_start(void); +void balloon_free_page_stop(void); + +void qemu_add_balloon_handler(QEMUBalloonEvent *event_fn, + QEMUBalloonStatus *stat_fn, + QEMUBalloonFreePageSupport *free_page_support_fn, + QEMUBalloonFreePageStart *free_page_start_fn, + QEMUBalloonFreePageStop *free_page_stop_fn, + void *opaque); #endif -- 1.8.3.1 --------------------------------------------------------------------- To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org ^ permalink raw reply related [flat|nested] 42+ messages in thread
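For reference, the command-id scheme the patch introduces can be modeled in isolation. The sketch below is illustrative only, not QEMU code; the constants mirror VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN and VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID from the diff: valid command ids occupy [0x80000000, UINT32_MAX], id 0 is reserved as the stop id, and incrementing past UINT32_MAX wraps back to the minimum rather than overflowing into the reserved range.

```c
#include <limits.h>
#include <stdint.h>

/* Stand-alone model of the free-page-report command-id scheme from the
 * patch (illustrative only, not the QEMU implementation). Valid ids live
 * in [0x80000000, UINT32_MAX]; id 0 is reserved to tell the guest to stop. */
#define CMD_ID_MIN  0x80000000u  /* VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN */
#define CMD_ID_STOP 0u           /* VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID */

uint32_t next_cmd_id(uint32_t id)
{
    /* Wrapping to CMD_ID_MIN instead of letting the counter overflow to 0
     * keeps every live command id out of the reserved stop-id range. */
    return (id == UINT32_MAX) ? CMD_ID_MIN : id + 1;
}

int cmd_id_is_valid(uint32_t id)
{
    /* Any id below CMD_ID_MIN (including CMD_ID_STOP) is not a live
     * command id. */
    return id >= CMD_ID_MIN;
}
```

Initializing the counter to CMD_ID_MIN - 1 in realize, as the patch does, makes the first start request hand out exactly CMD_ID_MIN.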
* Re: [Qemu-devel] [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT 2018-03-16 10:48 ` [virtio-dev] " Wei Wang @ 2018-03-16 15:16 ` Michael S. Tsirkin -1 siblings, 0 replies; 42+ messages in thread From: Michael S. Tsirkin @ 2018-03-16 15:16 UTC (permalink / raw) To: Wei Wang Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini, liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel On Fri, Mar 16, 2018 at 06:48:28PM +0800, Wei Wang wrote: > The new feature enables the virtio-balloon device to receive hints of > guest free pages from the free page vq. > > balloon_free_page_start - start guest free page hint reporting. > balloon_free_page_stop - stop guest free page hint reporting. > > Note: balloon will report pages which were free at the time > of this call. As the reporting happens asynchronously, dirty bit logging > must be enabled before this call is made. Guest reporting must be > disabled before the migration dirty bitmap is synchronized. > > Signed-off-by: Wei Wang <wei.w.wang@intel.com> > Signed-off-by: Liang Li <liang.z.li@intel.com> > CC: Michael S. Tsirkin <mst@redhat.com> > CC: Dr. 
David Alan Gilbert <dgilbert@redhat.com> > CC: Juan Quintela <quintela@redhat.com> > --- > balloon.c | 58 +++++-- > hw/virtio/virtio-balloon.c | 217 ++++++++++++++++++++++-- > include/hw/virtio/virtio-balloon.h | 20 ++- > include/standard-headers/linux/virtio_balloon.h | 7 + > include/sysemu/balloon.h | 15 +- > 5 files changed, 288 insertions(+), 29 deletions(-) > > diff --git a/balloon.c b/balloon.c > index 6bf0a96..87a0410 100644 > --- a/balloon.c > +++ b/balloon.c > @@ -36,6 +36,9 @@ > > static QEMUBalloonEvent *balloon_event_fn; > static QEMUBalloonStatus *balloon_stat_fn; > +static QEMUBalloonFreePageSupport *balloon_free_page_support_fn; > +static QEMUBalloonFreePageStart *balloon_free_page_start_fn; > +static QEMUBalloonFreePageStop *balloon_free_page_stop_fn; > static void *balloon_opaque; > static bool balloon_inhibited; > > @@ -64,19 +67,51 @@ static bool have_balloon(Error **errp) > return true; > } > > -int qemu_add_balloon_handler(QEMUBalloonEvent *event_func, > - QEMUBalloonStatus *stat_func, void *opaque) > +bool balloon_free_page_support(void) > { > - if (balloon_event_fn || balloon_stat_fn || balloon_opaque) { > - /* We're already registered one balloon handler. How many can > - * a guest really have? > - */ > - return -1; > + return balloon_free_page_support_fn && > + balloon_free_page_support_fn(balloon_opaque); > +} > + > +/* > + * Balloon will report pages which were free at the time of this call. As the > + * reporting happens asynchronously, dirty bit logging must be enabled before > + * this call is made. > + */ > +void balloon_free_page_start(void) > +{ > + balloon_free_page_start_fn(balloon_opaque); > +} > + > +/* > + * Guest reporting must be disabled before the migration dirty bitmap is > + * synchronized. 
* [virtio-dev] Re: [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT @ 2018-03-16 15:16 ` Michael S. Tsirkin 0 siblings, 0 replies; 42+ messages in thread From: Michael S. Tsirkin @ 2018-03-16 15:16 UTC (permalink / raw) To: Wei Wang Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini, liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel On Fri, Mar 16, 2018 at 06:48:28PM +0800, Wei Wang wrote: > The new feature enables the virtio-balloon device to receive hints of > guest free pages from the free page vq. > > balloon_free_page_start - start guest free page hint reporting. > balloon_free_page_stop - stop guest free page hint reporting. > > Note: balloon will report pages which were free at the time > of this call. As the reporting happens asynchronously, dirty bit logging > must be enabled before this call is made. Guest reporting must be > disabled before the migration dirty bitmap is synchronized. > > Signed-off-by: Wei Wang <wei.w.wang@intel.com> > Signed-off-by: Liang Li <liang.z.li@intel.com> > CC: Michael S. Tsirkin <mst@redhat.com> > CC: Dr. 
David Alan Gilbert <dgilbert@redhat.com> > CC: Juan Quintela <quintela@redhat.com> > --- > balloon.c | 58 +++++-- > hw/virtio/virtio-balloon.c | 217 ++++++++++++++++++++++-- > include/hw/virtio/virtio-balloon.h | 20 ++- > include/standard-headers/linux/virtio_balloon.h | 7 + > include/sysemu/balloon.h | 15 +- > 5 files changed, 288 insertions(+), 29 deletions(-) > > diff --git a/balloon.c b/balloon.c > index 6bf0a96..87a0410 100644 > --- a/balloon.c > +++ b/balloon.c > @@ -36,6 +36,9 @@ > > static QEMUBalloonEvent *balloon_event_fn; > static QEMUBalloonStatus *balloon_stat_fn; > +static QEMUBalloonFreePageSupport *balloon_free_page_support_fn; > +static QEMUBalloonFreePageStart *balloon_free_page_start_fn; > +static QEMUBalloonFreePageStop *balloon_free_page_stop_fn; > static void *balloon_opaque; > static bool balloon_inhibited; > > @@ -64,19 +67,51 @@ static bool have_balloon(Error **errp) > return true; > } > > -int qemu_add_balloon_handler(QEMUBalloonEvent *event_func, > - QEMUBalloonStatus *stat_func, void *opaque) > +bool balloon_free_page_support(void) > { > - if (balloon_event_fn || balloon_stat_fn || balloon_opaque) { > - /* We're already registered one balloon handler. How many can > - * a guest really have? > - */ > - return -1; > + return balloon_free_page_support_fn && > + balloon_free_page_support_fn(balloon_opaque); > +} > + > +/* > + * Balloon will report pages which were free at the time of this call. As the > + * reporting happens asynchronously, dirty bit logging must be enabled before > + * this call is made. > + */ > +void balloon_free_page_start(void) > +{ > + balloon_free_page_start_fn(balloon_opaque); > +} > + > +/* > + * Guest reporting must be disabled before the migration dirty bitmap is > + * synchronized. 
> + */ > +void balloon_free_page_stop(void) > +{ > + balloon_free_page_stop_fn(balloon_opaque); > +} > + > +void qemu_add_balloon_handler(QEMUBalloonEvent *event_fn, > + QEMUBalloonStatus *stat_fn, > + QEMUBalloonFreePageSupport *free_page_support_fn, > + QEMUBalloonFreePageStart *free_page_start_fn, > + QEMUBalloonFreePageStop *free_page_stop_fn, > + void *opaque) > +{ > + if (balloon_event_fn || balloon_stat_fn || balloon_free_page_support_fn || > + balloon_free_page_start_fn || balloon_free_page_stop_fn || > + balloon_opaque) { > + /* We already registered one balloon handler. */ > + return; > } > - balloon_event_fn = event_func; > - balloon_stat_fn = stat_func; > + > + balloon_event_fn = event_fn; > + balloon_stat_fn = stat_fn; > + balloon_free_page_support_fn = free_page_support_fn; > + balloon_free_page_start_fn = free_page_start_fn; > + balloon_free_page_stop_fn = free_page_stop_fn; > balloon_opaque = opaque; > - return 0; > } > > void qemu_remove_balloon_handler(void *opaque) > @@ -86,6 +121,9 @@ void qemu_remove_balloon_handler(void *opaque) > } > balloon_event_fn = NULL; > balloon_stat_fn = NULL; > + balloon_free_page_support_fn = NULL; > + balloon_free_page_start_fn = NULL; > + balloon_free_page_stop_fn = NULL; > balloon_opaque = NULL; > } > > diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c > index f456cea..30c7504 100644 > --- a/hw/virtio/virtio-balloon.c > +++ b/hw/virtio/virtio-balloon.c > @@ -31,6 +31,7 @@ > > #include "hw/virtio/virtio-bus.h" > #include "hw/virtio/virtio-access.h" > +#include "migration/misc.h" > > #define BALLOON_PAGE_SIZE (1 << VIRTIO_BALLOON_PFN_SHIFT) > > @@ -308,6 +309,127 @@ out: > } > } > > +static void *virtio_balloon_poll_free_page_hints(void *opaque) > +{ > + VirtQueueElement *elem; > + VirtIOBalloon *dev = opaque; > + VirtQueue *vq = dev->free_page_vq; > + uint32_t id; > + size_t size; > + > + /* The optimization thread runs only when the guest is running. 
*/ > + while (runstate_is_running()) { Note that this check is not guaranteed to be correct when checked like this outside BQL. I think you are better off relying on status callback to synchronise with the backend thread. > + qemu_spin_lock(&dev->free_page_lock); > + /* > + * If the migration thread actively stops the reporting, exit > + * immediately. > + */ > + if (dev->free_page_report_status >= FREE_PAGE_REPORT_S_STOP) { > + qemu_spin_unlock(&dev->free_page_lock); > + break; > + } > + > + elem = virtqueue_pop(vq, sizeof(VirtQueueElement)); > + if (!elem) { > + qemu_spin_unlock(&dev->free_page_lock); > + continue; > + } > + > + if (elem->out_num) { > + size = iov_to_buf(elem->out_sg, elem->out_num, 0, &id, sizeof(id)); > + virtqueue_push(vq, elem, size); > + g_free(elem); > + if (unlikely(size != sizeof(id))) { > + warn_report("%s: received an incorrect cmd id", __func__); virtio_error? > + qemu_spin_unlock(&dev->free_page_lock); > + break; > + } > + if (id == dev->free_page_report_cmd_id) { id is LE above, but used in native endian here. > + dev->free_page_report_status = FREE_PAGE_REPORT_S_START; > + } else if (dev->free_page_report_status == > + FREE_PAGE_REPORT_S_START) { > + /* > + * Stop the optimization only when it has started. This avoids > + * a stale stop sign for the previous command. > + */ > + dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP; > + qemu_spin_unlock(&dev->free_page_lock); > + break; > + } And else? What if it does not match and status is not start? Don't you need to skip in elem decoding? > + } > + > + if (elem->in_num) { > + if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START && > + !dev->poison_val) { poison generally disables everything? Add a TODO to handle it in the future pls. 
> + qemu_guest_free_page_hint(elem->in_sg[0].iov_base, > + elem->in_sg[0].iov_len); > + } > + virtqueue_push(vq, elem, 0); > + g_free(elem); > + } > + qemu_spin_unlock(&dev->free_page_lock); > + } > + return NULL; > +} > + > +static bool virtio_balloon_free_page_support(void *opaque) > +{ > + VirtIOBalloon *s = opaque; > + VirtIODevice *vdev = VIRTIO_DEVICE(s); > + > + return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT); > +} > + > +static void virtio_balloon_free_page_start(void *opaque) > +{ > + VirtIOBalloon *s = opaque; > + VirtIODevice *vdev = VIRTIO_DEVICE(s); > + > + if (!runstate_is_running()) { > + return; > + } > + > + if (unlikely(s->free_page_report_cmd_id == UINT_MAX)) { Don't bother with these annotations for non data path things. > + s->free_page_report_cmd_id = > + VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN; > + } else { > + s->free_page_report_cmd_id++; > + } > + > + s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED; > + virtio_notify_config(vdev); > + qemu_thread_create(&s->free_page_thread, "balloon_fpo", > + virtio_balloon_poll_free_page_hints, s, > + QEMU_THREAD_JOINABLE); > +} > + > +static void virtio_balloon_free_page_stop(void *opaque) > +{ > + VirtIOBalloon *s = opaque; > + VirtIODevice *vdev = VIRTIO_DEVICE(s); > + > + qemu_spin_lock(&s->free_page_lock); > + switch (s->free_page_report_status) { > + case FREE_PAGE_REPORT_S_REQUESTED: > + case FREE_PAGE_REPORT_S_START: > + /* > + * The guest hasn't done the reporting, so host sends a notification > + * to the guest to actively stop the reporting before joining the > + * optimization thread. > + */ > + s->free_page_report_status = FREE_PAGE_REPORT_S_STOP; > + virtio_notify_config(vdev); > + case FREE_PAGE_REPORT_S_STOP: > + /* The guest has stopped the reporting. 
Join the optimization thread */ > + qemu_thread_join(&s->free_page_thread); > + s->free_page_report_status = FREE_PAGE_REPORT_S_EXIT; > + case FREE_PAGE_REPORT_S_EXIT: > + /* The optimization thread has gone. No further actions needed. */ > + break; > + } > + qemu_spin_unlock(&s->free_page_lock); > +} > + > static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data) > { > VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); > @@ -315,6 +437,15 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data) > > config.num_pages = cpu_to_le32(dev->num_pages); > config.actual = cpu_to_le32(dev->actual); > + config.poison_val = cpu_to_le32(dev->poison_val); > + > + if (dev->free_page_report_status >= FREE_PAGE_REPORT_S_STOP) { > + config.free_page_report_cmd_id = > + cpu_to_le32(VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID); > + } else { > + config.free_page_report_cmd_id = > + cpu_to_le32(dev->free_page_report_cmd_id); > + } > > trace_virtio_balloon_get_config(config.num_pages, config.actual); > memcpy(config_data, &config, sizeof(struct virtio_balloon_config)); > @@ -368,6 +499,7 @@ static void virtio_balloon_set_config(VirtIODevice *vdev, > ((ram_addr_t) dev->actual << VIRTIO_BALLOON_PFN_SHIFT), > &error_abort); > } > + dev->poison_val = le32_to_cpu(config.poison_val); We probably should not assume this value is correct if guest didn't ack the appropriate feature. 
> trace_virtio_balloon_set_config(dev->actual, oldactual); > } > > @@ -377,6 +509,11 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f, > VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); > f |= dev->host_features; > virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ); > + > + if (dev->host_features & 1ULL << VIRTIO_BALLOON_F_FREE_PAGE_HINT) { > + virtio_add_feature(&f, VIRTIO_BALLOON_F_PAGE_POISON); > + } > + > return f; > } > > @@ -413,6 +550,18 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id) > return 0; > } > > +static const VMStateDescription vmstate_virtio_balloon_free_page_report = { > + .name = "virtio-balloon-device/free-page-report", > + .version_id = 1, > + .minimum_version_id = 1, > + .needed = virtio_balloon_free_page_support, > + .fields = (VMStateField[]) { > + VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon), > + VMSTATE_UINT32(poison_val, VirtIOBalloon), > + VMSTATE_END_OF_LIST() > + } > +}; > + > static const VMStateDescription vmstate_virtio_balloon_device = { > .name = "virtio-balloon-device", > .version_id = 1, > @@ -423,30 +572,31 @@ static const VMStateDescription vmstate_virtio_balloon_device = { > VMSTATE_UINT32(actual, VirtIOBalloon), > VMSTATE_END_OF_LIST() > }, > + .subsections = (const VMStateDescription * []) { > + &vmstate_virtio_balloon_free_page_report, > + NULL > + } > }; > > static void virtio_balloon_device_realize(DeviceState *dev, Error **errp) > { > VirtIODevice *vdev = VIRTIO_DEVICE(dev); > VirtIOBalloon *s = VIRTIO_BALLOON(dev); > - int ret; > > virtio_init(vdev, "virtio-balloon", VIRTIO_ID_BALLOON, > sizeof(struct virtio_balloon_config)); > > - ret = qemu_add_balloon_handler(virtio_balloon_to_target, > - virtio_balloon_stat, s); > - > - if (ret < 0) { > - error_setg(errp, "Only one balloon device is supported"); > - virtio_cleanup(vdev); > - return; > - } > - > s->ivq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output); > s->dvq = virtio_add_queue(vdev, 128, 
virtio_balloon_handle_output); > s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats); > - > + if (virtio_has_feature(s->host_features, > + VIRTIO_BALLOON_F_FREE_PAGE_HINT)) { > + s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE, NULL); > + s->free_page_report_status = FREE_PAGE_REPORT_S_EXIT; > + s->free_page_report_cmd_id = > + VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN - 1; > + qemu_spin_init(&s->free_page_lock); > + } > reset_stats(s); > } > > @@ -455,6 +605,9 @@ static void virtio_balloon_device_unrealize(DeviceState *dev, Error **errp) > VirtIODevice *vdev = VIRTIO_DEVICE(dev); > VirtIOBalloon *s = VIRTIO_BALLOON(dev); > > + if (virtio_balloon_free_page_support(s)) { > + virtio_balloon_free_page_stop(s); > + } > balloon_stats_destroy_timer(s); > qemu_remove_balloon_handler(s); > virtio_cleanup(vdev); > @@ -464,6 +617,10 @@ static void virtio_balloon_device_reset(VirtIODevice *vdev) > { > VirtIOBalloon *s = VIRTIO_BALLOON(vdev); > > + if (virtio_balloon_free_page_support(s)) { > + virtio_balloon_free_page_stop(s); > + } > + > if (s->stats_vq_elem != NULL) { > virtqueue_unpop(s->svq, s->stats_vq_elem, 0); > g_free(s->stats_vq_elem); > @@ -475,11 +632,37 @@ static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status) > { > VirtIOBalloon *s = VIRTIO_BALLOON(vdev); > > - if (!s->stats_vq_elem && vdev->vm_running && > - (status & VIRTIO_CONFIG_S_DRIVER_OK) && virtqueue_rewind(s->svq, 1)) { > - /* poll stats queue for the element we have discarded when the VM > - * was stopped */ > - virtio_balloon_receive_stats(vdev, s->svq); > + if (status & VIRTIO_CONFIG_S_DRIVER_OK) { > + if (!s->stats_vq_elem && vdev->vm_running && > + virtqueue_rewind(s->svq, 1)) { > + /* > + * Poll stats queue for the element we have discarded when the VM > + * was stopped. 
> + */ > + virtio_balloon_receive_stats(vdev, s->svq); > + } > + > + if (virtio_balloon_free_page_support(s)) { > + qemu_add_balloon_handler(virtio_balloon_to_target, > + virtio_balloon_stat, > + virtio_balloon_free_page_support, > + virtio_balloon_free_page_start, > + virtio_balloon_free_page_stop, > + s); > + /* > + * This handles the case that the guest is being stopped (e.g. by > + * qmp commands) while the driver is still reporting hints. When > + * the guest is woken up, it will continue to report hints, which > + * are not needed. So when the wakeup notifier invokes the > + * set_status callback here, we get the chance to make sure that > + * the free page optimization thread is exited via > + * virtio_balloon_free_page_stop. > + */ > + virtio_balloon_free_page_stop(s); I don't understand the logic here at all. So if status is set to DRIVER_OK, then we stop reporting? > + } else { > + qemu_add_balloon_handler(virtio_balloon_to_target, > + virtio_balloon_stat, NULL, NULL, NULL, s); > + } > } > } > > @@ -509,6 +692,8 @@ static const VMStateDescription vmstate_virtio_balloon = { > static Property virtio_balloon_properties[] = { > DEFINE_PROP_BIT("deflate-on-oom", VirtIOBalloon, host_features, > VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false), > + DEFINE_PROP_BIT("free-page-hint", VirtIOBalloon, host_features, > + VIRTIO_BALLOON_F_FREE_PAGE_HINT, false), > DEFINE_PROP_END_OF_LIST(), > }; > > diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h > index 1ea13bd..cfdba37 100644 > --- a/include/hw/virtio/virtio-balloon.h > +++ b/include/hw/virtio/virtio-balloon.h > @@ -23,6 +23,8 @@ > #define VIRTIO_BALLOON(obj) \ > OBJECT_CHECK(VirtIOBalloon, (obj), TYPE_VIRTIO_BALLOON) > > +#define VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN 0x80000000 > + > typedef struct virtio_balloon_stat VirtIOBalloonStat; > > typedef struct virtio_balloon_stat_modern { > @@ -31,15 +33,31 @@ typedef struct virtio_balloon_stat_modern { > uint64_t val; > } 
VirtIOBalloonStatModern; > > +enum virtio_balloon_free_page_report_status { > + FREE_PAGE_REPORT_S_REQUESTED, > + FREE_PAGE_REPORT_S_START, > + FREE_PAGE_REPORT_S_STOP, > + FREE_PAGE_REPORT_S_EXIT, > +}; > + > typedef struct VirtIOBalloon { > VirtIODevice parent_obj; > - VirtQueue *ivq, *dvq, *svq; > + VirtQueue *ivq, *dvq, *svq, *free_page_vq; > + uint32_t free_page_report_status; > uint32_t num_pages; > uint32_t actual; > + uint32_t free_page_report_cmd_id; > + uint32_t poison_val; > uint64_t stats[VIRTIO_BALLOON_S_NR]; > VirtQueueElement *stats_vq_elem; > size_t stats_vq_offset; > QEMUTimer *stats_timer; > + QemuThread free_page_thread; > + /* > + * Lock to synchronize threads to access the free page reporting related > + * fields (e.g. free_page_report_status). > + */ > + QemuSpin free_page_lock; > int64_t stats_last_update; > int64_t stats_poll_interval; > uint32_t host_features; > diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h > index 7b0a41b..f89e80f 100644 > --- a/include/standard-headers/linux/virtio_balloon.h > +++ b/include/standard-headers/linux/virtio_balloon.h > @@ -34,15 +34,22 @@ > #define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */ > #define VIRTIO_BALLOON_F_STATS_VQ 1 /* Memory Stats virtqueue */ > #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */ > +#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */ > +#define VIRTIO_BALLOON_F_PAGE_POISON 4 /* Guest is using page poisoning */ > > /* Size of a PFN in the balloon interface. */ > #define VIRTIO_BALLOON_PFN_SHIFT 12 > > +#define VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID 0 > struct virtio_balloon_config { > /* Number of pages host wants Guest to give up. */ > uint32_t num_pages; > /* Number of pages we've actually got in balloon. 
*/ > uint32_t actual; > + /* Free page report command id, readonly by guest */ > + uint32_t free_page_report_cmd_id; > + /* Stores PAGE_POISON if page poisoning is in use */ > + uint32_t poison_val; > }; > > #define VIRTIO_BALLOON_S_SWAP_IN 0 /* Amount of memory swapped in */ > diff --git a/include/sysemu/balloon.h b/include/sysemu/balloon.h > index 66543ae..6561a08 100644 > --- a/include/sysemu/balloon.h > +++ b/include/sysemu/balloon.h > @@ -18,11 +18,22 @@ > > typedef void (QEMUBalloonEvent)(void *opaque, ram_addr_t target); > typedef void (QEMUBalloonStatus)(void *opaque, BalloonInfo *info); > +typedef bool (QEMUBalloonFreePageSupport)(void *opaque); > +typedef void (QEMUBalloonFreePageStart)(void *opaque); > +typedef void (QEMUBalloonFreePageStop)(void *opaque); > > -int qemu_add_balloon_handler(QEMUBalloonEvent *event_func, > - QEMUBalloonStatus *stat_func, void *opaque); > void qemu_remove_balloon_handler(void *opaque); > bool qemu_balloon_is_inhibited(void); > void qemu_balloon_inhibit(bool state); > +bool balloon_free_page_support(void); > +void balloon_free_page_start(void); > +void balloon_free_page_stop(void); > + > +void qemu_add_balloon_handler(QEMUBalloonEvent *event_fn, > + QEMUBalloonStatus *stat_fn, > + QEMUBalloonFreePageSupport *free_page_support_fn, > + QEMUBalloonFreePageStart *free_page_start_fn, > + QEMUBalloonFreePageStop *free_page_stop_fn, > + void *opaque); > > #endif > -- > 1.8.3.1 --------------------------------------------------------------------- To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [Qemu-devel] [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT 2018-03-16 15:16 ` [virtio-dev] " Michael S. Tsirkin @ 2018-03-18 10:36 ` Wei Wang -1 siblings, 0 replies; 42+ messages in thread From: Wei Wang @ 2018-03-18 10:36 UTC (permalink / raw) To: Michael S. Tsirkin Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini, liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel On 03/16/2018 11:16 PM, Michael S. Tsirkin wrote: > On Fri, Mar 16, 2018 at 06:48:28PM +0800, Wei Wang wrote: >> The new feature enables the virtio-balloon device to receive hints of >> guest free pages from the free page vq. >> >> balloon_free_page_start - start guest free page hint reporting. >> balloon_free_page_stop - stop guest free page hint reporting. >> >> Note: balloon will report pages which were free at the time >> of this call. As the reporting happens asynchronously, dirty bit logging >> must be enabled before this call is made. Guest reporting must be >> disabled before the migration dirty bitmap is synchronized. >> >> Signed-off-by: Wei Wang <wei.w.wang@intel.com> >> Signed-off-by: Liang Li <liang.z.li@intel.com> >> CC: Michael S. Tsirkin <mst@redhat.com> >> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> >> CC: Juan Quintela <quintela@redhat.com> >> --- >> balloon.c | 58 +++++-- >> hw/virtio/virtio-balloon.c | 217 ++++++++++++++++++++++-- >> include/hw/virtio/virtio-balloon.h | 20 ++- >> include/standard-headers/linux/virtio_balloon.h | 7 + >> include/sysemu/balloon.h | 15 +- >> 5 files changed, 288 insertions(+), 29 deletions(-) >> >> diff --git a/balloon.c b/balloon.c >> index 6bf0a96..87a0410 100644 >> --- a/balloon.c >> +++ b/balloon.c >> @@ -36,6 +36,9 @@ >> >> >> +static void *virtio_balloon_poll_free_page_hints(void *opaque) >> +{ >> + VirtQueueElement *elem; >> + VirtIOBalloon *dev = opaque; >> + VirtQueue *vq = dev->free_page_vq; >> + uint32_t id; >> + size_t size; >> + >> + /* The optimization thread runs only when the guest is running. 
*/ >> + while (runstate_is_running()) { > Note that this check is not guaranteed to be correct > when checked like this outside BQL. > > I think you are better off relying on status > callback to synchronise with the backend thread. > It's actually OK here; I think we don't need the guarantee. The device is just the consumer of the vq, so essentially it has no dependency (i.e. won't block or cause errors) on the guest state. For example: 1) the device executes "while (runstate_is_running())", finds that the guest is running, and proceeds; 2) the guest is stopped immediately after the "while (runstate_is_running())" check; 3) the device side execution reaches virtqueue_pop() and finds "elem == NULL", since the driver (the provider) has stopped adding hints. It then goes back to "while (runstate_is_running())", finds that the guest is stopped, and exits. Essentially, this runstate check is just an optimization for the case where the driver has stopped providing hints while the device side optimization thread is still polling the empty vq (i.e. effort in vain). It would probably be better to check the runstate under "elem == NULL":

while (1) {
    ...
    elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
    if (!elem) {
        qemu_spin_unlock(&dev->free_page_lock);
        if (runstate_is_running())
            continue;
        else
            break;
    }
    ...
}

>> + dev->free_page_report_status = FREE_PAGE_REPORT_S_START; >> + } else if (dev->free_page_report_status == >> + FREE_PAGE_REPORT_S_START) { >> + /* >> + * Stop the optimization only when it has started. This avoids >> + * a stale stop sign for the previous command. >> + */ >> + dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP; >> + qemu_spin_unlock(&dev->free_page_lock); >> + break; >> + } > And else? What if it does not match and status is not start? > Don't you need to skip in elem decoding? No, actually we don't need an "else". Please see the code inside if (elem->in_num) below.
If the status isn't set to START, qemu_guest_free_page_hint will not be called to decode the elem. > >> + } >> + >> + if (elem->in_num) { >> + if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START && >> + !dev->poison_val) { > poison generally disables everything? Add a TODO to handle > it in the future pls. OK, will add a TODO in the commit. @@ -368,6 +499,7 @@ static void virtio_balloon_set_config(VirtIODevice *vdev, ((ram_addr_t) dev->actual << VIRTIO_BALLOON_PFN_SHIFT), &error_abort); } + dev->poison_val = le32_to_cpu(config.poison_val); > We probably should not assume this value is correct if guest didn't > ack the appropriate feature. OK, I'll add a comment to explain that. @@ -475,11 +632,37 @@ static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status) { VirtIOBalloon *s = VIRTIO_BALLOON(vdev); - if (!s->stats_vq_elem && vdev->vm_running && - (status & VIRTIO_CONFIG_S_DRIVER_OK) && virtqueue_rewind(s->svq, 1)) { - /* poll stats queue for the element we have discarded when the VM - * was stopped */ - virtio_balloon_receive_stats(vdev, s->svq); + if (status & VIRTIO_CONFIG_S_DRIVER_OK) { + if (!s->stats_vq_elem && vdev->vm_running && + virtqueue_rewind(s->svq, 1)) { + /* + * Poll stats queue for the element we have discarded when the VM + * was stopped. + */ + virtio_balloon_receive_stats(vdev, s->svq); + } + + if (virtio_balloon_free_page_support(s)) { + qemu_add_balloon_handler(virtio_balloon_to_target, + virtio_balloon_stat, + virtio_balloon_free_page_support, + virtio_balloon_free_page_start, + virtio_balloon_free_page_stop, + s); + /* + * This handles the case that the guest is being stopped (e.g. by + * qmp commands) while the driver is still reporting hints. When + * the guest is woken up, it will continue to report hints, which + * are not needed. 
So when the wakeup notifier invokes the + * set_status callback here, we get the chance to make sure that + * the free page optimization thread is exited via + * virtio_balloon_free_page_stop. + */ + virtio_balloon_free_page_stop(s); > I don't understand the logic here at all. > So if status is set to DRIVER_OK, then we stop reporting? > I thought about this more. It's not necessary to call the stop() function here, because we have already made the optimization thread exit when the guest is stopped. When the guest is woken up, it will continue to report, but there will be no consumer of the vq (the optimization thread has exited). There is no functional issue here, since the driver will just drop the hint when the vq is full. From the optimization point of view, I think we could consider replacing the above stop() function with virtio_notify_config(vdev), which will stop the guest's unnecessary reporting. Best, Wei ^ permalink raw reply [flat|nested] 42+ messages in thread
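The status machine debated in this exchange (set START on a matching command id, set STOP only if a round had actually started, and decode elements only in the START state with no page poison) can be exercised in isolation. The following is an illustrative sketch only, not the QEMU code: the balloon_model struct and the helper names handle_hint_id and should_decode are hypothetical stand-ins, written to mirror the logic quoted from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Status values mirroring the patch's FREE_PAGE_REPORT_S_* constants. */
enum {
    FREE_PAGE_REPORT_S_STOP,
    FREE_PAGE_REPORT_S_START,
};

/* Hypothetical reduced device state, for illustration only. */
struct balloon_model {
    uint32_t free_page_report_cmd_id;  /* id of the current command */
    int free_page_report_status;
    uint32_t poison_val;
};

/*
 * Handle a hint element carrying an id: a matching id starts the round;
 * a non-matching id stops it, but only if a round had started, which
 * avoids acting on "a stale stop sign for the previous command".
 */
void handle_hint_id(struct balloon_model *dev, uint32_t id)
{
    if (id == dev->free_page_report_cmd_id) {
        dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
    } else if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) {
        dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
    }
    /* No "else" branch is needed: decoding is gated by should_decode(). */
}

/* Elem payloads are decoded only in the START state with no page poison. */
bool should_decode(const struct balloon_model *dev)
{
    return dev->free_page_report_status == FREE_PAGE_REPORT_S_START &&
           dev->poison_val == 0;
}
```

This makes concrete why the reviewer's suggested "else" is unnecessary: a stale id received outside the START state changes nothing, and the decode gate alone skips the element.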
* Re: [Qemu-devel] [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT 2018-03-18 10:36 ` [virtio-dev] " Wei Wang @ 2018-03-19 4:24 ` Michael S. Tsirkin -1 siblings, 0 replies; 42+ messages in thread From: Michael S. Tsirkin @ 2018-03-19 4:24 UTC (permalink / raw) To: Wei Wang Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini, liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel On Sun, Mar 18, 2018 at 06:36:20PM +0800, Wei Wang wrote: > On 03/16/2018 11:16 PM, Michael S. Tsirkin wrote: > > On Fri, Mar 16, 2018 at 06:48:28PM +0800, Wei Wang wrote: > > > The new feature enables the virtio-balloon device to receive hints of > > > guest free pages from the free page vq. > > > > > > balloon_free_page_start - start guest free page hint reporting. > > > balloon_free_page_stop - stop guest free page hint reporting. > > > > > > Note: balloon will report pages which were free at the time > > > of this call. As the reporting happens asynchronously, dirty bit logging > > > must be enabled before this call is made. Guest reporting must be > > > disabled before the migration dirty bitmap is synchronized. > > > > > > Signed-off-by: Wei Wang <wei.w.wang@intel.com> > > > Signed-off-by: Liang Li <liang.z.li@intel.com> > > > CC: Michael S. Tsirkin <mst@redhat.com> > > > CC: Dr. 
David Alan Gilbert <dgilbert@redhat.com> > > > CC: Juan Quintela <quintela@redhat.com> > > > --- > > > balloon.c | 58 +++++-- > > > hw/virtio/virtio-balloon.c | 217 ++++++++++++++++++++++-- > > > include/hw/virtio/virtio-balloon.h | 20 ++- > > > include/standard-headers/linux/virtio_balloon.h | 7 + > > > include/sysemu/balloon.h | 15 +- > > > 5 files changed, 288 insertions(+), 29 deletions(-) > > > > > > diff --git a/balloon.c b/balloon.c > > > index 6bf0a96..87a0410 100644 > > > --- a/balloon.c > > > +++ b/balloon.c > > > @@ -36,6 +36,9 @@ > > > > > > +static void *virtio_balloon_poll_free_page_hints(void *opaque) > > > +{ > > > + VirtQueueElement *elem; > > > + VirtIOBalloon *dev = opaque; > > > + VirtQueue *vq = dev->free_page_vq; > > > + uint32_t id; > > > + size_t size; > > > + > > > + /* The optimization thread runs only when the guest is running. */ > > > + while (runstate_is_running()) { > > Note that this check is not guaranteed to be correct > > when checked like this outside BQL. > > > > I think you are better off relying on status > > callback to synchronise with the backend thread. > > > > It's actually OK here, I think we don't need the guarantee. The device is > just the consumer of the vq, essentially it doesn't have a dependency (i.e. > won't block or cause errors) on the guest state. > For example: > 1) when the device executes "while (runstate_is_running())" and finds that > the guest is running, it proceeds; > 2) the guest is stopped immediately right after the "while > (runstate_is_running())" check; > 3) the device side execution reaches virtqueue_pop(), and finds > "elem==NULL", since the driver (provider) stops adding hints. Then it > continues by going back to "while (runstate_is_running())", and now it finds > the guest is stopped, and then exits. OTOH it seems that if the thread stops, nothing will wake it up when the vm is restarted. Such a behaviour change across vmstop/vmstart is unexpected. 
> Essentially, this runstate check is just an optimization to the case that > the driver is stopped to provide hints while the device side optimization > thread is still polling the empty vq (i.e. effort in vain). Probably, it > would be better to check the runstate under "elem==NULL": > > while (1) { > ... > elem = virtqueue_pop(vq, sizeof(VirtQueueElement)); > if (!elem) { > qemu_spin_unlock(&dev->free_page_lock); > if (runstate_is_running()) > continue; > else > break; > } > ... > } > > > > > + dev->free_page_report_status = FREE_PAGE_REPORT_S_START; > > > + } else if (dev->free_page_report_status == > > > + FREE_PAGE_REPORT_S_START) { > > > + /* > > > + * Stop the optimization only when it has started. This avoids > > > + * a stale stop sign for the previous command. > > > + */ > > > + dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP; > > > + qemu_spin_unlock(&dev->free_page_lock); > > > + break; > > > + } > > And else? What if it does not match and status is not start? > > Don't you need to skip in elem decoding? > > No, actually we don't need "else". Please see the code inside if > (elem->in_num) below. If the status isn't set to START, > qemu_guest_free_page_hint will not be called to decode the elem. > > > > > > > + } > > > + > > > + if (elem->in_num) { > > > + if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START && > > > + !dev->poison_val) { > > poison generally disables everything? Add a TODO to handle > > it in he future pls. > > > OK, will add TODO in the commit. > > > > @@ -368,6 +499,7 @@ static void virtio_balloon_set_config(VirtIODevice *vdev, > ((ram_addr_t) dev->actual << VIRTIO_BALLOON_PFN_SHIFT), > &error_abort); > } > + dev->poison_val = le32_to_cpu(config.poison_val); > > > We probably should not assume this value is correct if guest didn't > > ack the appropriate feature. > > > OK, I'll add a comment to explain that. 
> > > @@ -475,11 +632,37 @@ static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status) > { > VirtIOBalloon *s = VIRTIO_BALLOON(vdev); > - if (!s->stats_vq_elem && vdev->vm_running && > - (status & VIRTIO_CONFIG_S_DRIVER_OK) && virtqueue_rewind(s->svq, 1)) { > - /* poll stats queue for the element we have discarded when the VM > - * was stopped */ > - virtio_balloon_receive_stats(vdev, s->svq); > + if (status & VIRTIO_CONFIG_S_DRIVER_OK) { > + if (!s->stats_vq_elem && vdev->vm_running && > + virtqueue_rewind(s->svq, 1)) { > + /* > + * Poll stats queue for the element we have discarded when the VM > + * was stopped. > + */ > + virtio_balloon_receive_stats(vdev, s->svq); > + } > + > + if (virtio_balloon_free_page_support(s)) { > + qemu_add_balloon_handler(virtio_balloon_to_target, > + virtio_balloon_stat, > + virtio_balloon_free_page_support, > + virtio_balloon_free_page_start, > + virtio_balloon_free_page_stop, > + s); > + /* > + * This handles the case that the guest is being stopped (e.g. by > + * qmp commands) while the driver is still reporting hints. When > + * the guest is woken up, it will continue to report hints, which > + * are not needed. So when the wakeup notifier invokes the > + * set_status callback here, we get the chance to make sure that > + * the free page optimization thread is exited via > + * virtio_balloon_free_page_stop. > + */ > + virtio_balloon_free_page_stop(s); > > > I don't understand the logic here at all. > > So if status is set to DRIVER_OK, then we stop reporting? > > > > I thought about this more. It's not necessary to call the stop() function > here, because we have already made the optimization thread exit when the > guest is stopped. When the guest is woken up, it will continue to report, > and there is no consumer of the vq (the optimization thread has exited). > There is no functional issue here, since the driver will just drop the hint > when the vq is full. 
From the optimization point of view, I think we can > consider to replace the above sop() function with > virtio_notify_config(vdev), which will stop the guest's unnecessary > reporting. I do not understand why we want to increment the counter on vm stop though. It does make sense to stop the thread but why not resume where we left off when vm is resumed? > > Best, > Wei In short the code will be easier to reason about if there's some symmetry in how we start/stop things. If we need the thread to run together with the VCPU then starting/stopping it from status seems like a reasonable choice. -- MST ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT 2018-03-19 4:24 ` [virtio-dev] " Michael S. Tsirkin @ 2018-03-19 9:01 ` Wei Wang -1 siblings, 0 replies; 42+ messages in thread From: Wei Wang @ 2018-03-19 9:01 UTC (permalink / raw) To: Michael S. Tsirkin Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini, liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel On 03/19/2018 12:24 PM, Michael S. Tsirkin wrote: > On Sun, Mar 18, 2018 at 06:36:20PM +0800, Wei Wang wrote: >> On 03/16/2018 11:16 PM, Michael S. Tsirkin wrote: >>> On Fri, Mar 16, 2018 at 06:48:28PM +0800, Wei Wang wrote: > > OTOH it seems that if thread stops nothing will wake it up > whem vm is restarted. Such bahaviour change across vmstop/vmstart > is unexpected. > I do not understand why we want to increment the counter > on vm stop though. It does make sense to stop the thread > but why not resume where we left off when vm is resumed? > I'm not sure which counter we incremented. But it becomes clear with a high level view of how it works (it is symmetric, actually). Basically, we start the optimization when each round starts and stop it at the end of each round (i.e. before we do the bitmap sync), as shown below:

1) 1st Round starts --> free_page_start
2) 1st Round in progress..
3) 1st Round ends --> free_page_stop
4) 2nd Round starts --> free_page_start
5) 2nd Round in progress..
6) 2nd Round ends --> free_page_stop
......

For example, suppose the VM is stopped during 2). virtio_balloon_poll_free_page_hints finds the vq empty (i.e. elem == NULL) and the runstate stopped, so the optimization thread exits immediately. That is, this optimization thread is gone forever (the optimization we can do for this round is done). 
We won't know when the VM will be woken up:

A) If the VM is woken up very soon, while the migration thread is still in progress of 2), then in 4) a new optimization thread (not the same one as for the first round) will be created and will start the optimization for the 2nd round as usual. (If you have questions about 3) in this case: free_page_stop will do nothing other than return, since the optimization thread has already exited.)

B) If the VM is woken up after the whole migration has ended, there is no point in resuming the optimization anyway.

I think this would be the simple design for the first release of this optimization. There are possibilities to improve case A) above by continuing the optimization for the 1st Round while it is still in progress, but I think adding that complexity for this rare case wouldn't be worthwhile (at least for now). What do you think? Best, Wei ^ permalink raw reply [flat|nested] 42+ messages in thread
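The per-round lifecycle described above (a fresh thread per round, the thread exiting on its own when the VM stops, and free_page_stop degrading to a no-op once the thread is gone) can be reduced to a toy model. This is a sketch for illustration only: the round_model struct and these helpers are hypothetical stand-ins, not the QEMU implementation, which uses a real thread polling a virtqueue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reduced state: the "optimization thread" becomes a flag. */
struct round_model {
    bool thread_running;  /* optimization thread alive for this round */
    bool vm_running;      /* guest runstate */
};

/* free_page_start: a migration round begins; spawn a fresh thread,
 * but only while the guest is running. */
void free_page_start(struct round_model *m)
{
    if (m->vm_running) {
        m->thread_running = true;
    }
}

/* The polling thread exits on its own once the vq is empty and the VM
 * is stopped (the "elem == NULL && !runstate_is_running()" case). */
void poll_observe_vmstop(struct round_model *m)
{
    if (!m->vm_running) {
        m->thread_running = false;
    }
}

/* free_page_stop: a round ends; harmless no-op if the thread already
 * exited, as described in the mail. */
void free_page_stop(struct round_model *m)
{
    m->thread_running = false;
}
```

The model makes case A) above easy to follow: a vmstop during round 1 ends that round's thread, the round-1 stop call is a no-op, and the round-2 start call simply spawns a new thread once the guest is running again.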
* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT 2018-03-19 9:01 ` Wei Wang @ 2018-03-19 22:55 ` Michael S. Tsirkin -1 siblings, 0 replies; 42+ messages in thread From: Michael S. Tsirkin @ 2018-03-19 22:55 UTC (permalink / raw) To: Wei Wang Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini, liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel On Mon, Mar 19, 2018 at 05:01:38PM +0800, Wei Wang wrote: > On 03/19/2018 12:24 PM, Michael S. Tsirkin wrote: > > On Sun, Mar 18, 2018 at 06:36:20PM +0800, Wei Wang wrote: > > > On 03/16/2018 11:16 PM, Michael S. Tsirkin wrote: > > > > On Fri, Mar 16, 2018 at 06:48:28PM +0800, Wei Wang wrote: > > > > OTOH it seems that if thread stops nothing will wake it up > > whem vm is restarted. Such bahaviour change across vmstop/vmstart > > is unexpected. > > I do not understand why we want to increment the counter > > on vm stop though. It does make sense to stop the thread > > but why not resume where we left off when vm is resumed? > > > > I'm not sure which counter we incremented. But it would be clear if we have > a high level view of how it works (it is symmetric actually). Basically, we > start the optimization when each round starts and stop it at the end of each > round (i.e. before we do the bitmap sync), as shown below: > > 1) 1st Round starts --> free_page_start > 2) 1st Round in progress.. > 3) 1st Round ends --> free_page_stop > 4) 2nd Round starts --> free_page_start > 5) 2nd Round in progress.. > 6) 2nd Round ends --> free_page_stop > ...... > > For example, in 2), the VM is stopped. virtio_balloon_poll_free_page_hints > finds the vq is empty (i.e. elem == NULL) and the runstate is stopped, the > optimization thread exits immediately. That is, this optimization thread is > gone forever (the optimization we can do for this round is done). 
We won't > know when would the VM be woken up: > A) If the VM is woken up very soon when the migration thread is still in > progress of 2), then in 4) a new optimization thread (not the same one for > the first round) will be created and start the optimization for the 2nd > round as usual (If you have questions about 3) in this case, that > free_page_stop will do nothing than just return, since the optimization > thread has exited) ; > B) If the VM is woken up after the whole migration has ended, there is still > no point in resuming the optimization. > > I think this would be the simple design for the first release of this > optimization. There are possibilities to improve case A) above by continuing > optimization for the 1st Round as it is still in progress, but I think > adding that complexity for this rare case wouldn't be worthwhile (at least > for now). What would you think? > > > Best, > Wei In my opinion this just makes the patch very messy. E.g. attempts to attach a debugger to the guest will call vmstop and then behaviour changes. This is a recipe for heisenbugs, which are then extremely painful to debug. It is not really hard to make things symmetrical: e.g. if you stop on vmstop then you should start on vmstart, etc. And stopping the thread should not involve a bunch of state changes; just stop it and that's it. -- MST
* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT 2018-03-19 22:55 ` Michael S. Tsirkin @ 2018-03-20 2:16 ` Wei Wang -1 siblings, 0 replies; 42+ messages in thread From: Wei Wang @ 2018-03-20 2:16 UTC (permalink / raw) To: Michael S. Tsirkin Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini, liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel On 03/20/2018 06:55 AM, Michael S. Tsirkin wrote: > On Mon, Mar 19, 2018 at 05:01:38PM +0800, Wei Wang wrote: >> On 03/19/2018 12:24 PM, Michael S. Tsirkin wrote: >>> On Sun, Mar 18, 2018 at 06:36:20PM +0800, Wei Wang wrote: >>>> On 03/16/2018 11:16 PM, Michael S. Tsirkin wrote: >>>>> On Fri, Mar 16, 2018 at 06:48:28PM +0800, Wei Wang wrote: >>> OTOH it seems that if thread stops nothing will wake it up >>> whem vm is restarted. Such bahaviour change across vmstop/vmstart >>> is unexpected. >>> I do not understand why we want to increment the counter >>> on vm stop though. It does make sense to stop the thread >>> but why not resume where we left off when vm is resumed? >>> >> I'm not sure which counter we incremented. But it would be clear if we have >> a high level view of how it works (it is symmetric actually). Basically, we >> start the optimization when each round starts and stop it at the end of each >> round (i.e. before we do the bitmap sync), as shown below: >> >> 1) 1st Round starts --> free_page_start >> 2) 1st Round in progress.. >> 3) 1st Round ends --> free_page_stop >> 4) 2nd Round starts --> free_page_start >> 5) 2nd Round in progress.. >> 6) 2nd Round ends --> free_page_stop >> ...... >> >> For example, in 2), the VM is stopped. virtio_balloon_poll_free_page_hints >> finds the vq is empty (i.e. elem == NULL) and the runstate is stopped, the >> optimization thread exits immediately. That is, this optimization thread is >> gone forever (the optimization we can do for this round is done). 
We won't >> know when would the VM be woken up: >> A) If the VM is woken up very soon when the migration thread is still in >> progress of 2), then in 4) a new optimization thread (not the same one for >> the first round) will be created and start the optimization for the 2nd >> round as usual (If you have questions about 3) in this case, that >> free_page_stop will do nothing than just return, since the optimization >> thread has exited) ; >> B) If the VM is woken up after the whole migration has ended, there is still >> no point in resuming the optimization. >> >> I think this would be the simple design for the first release of this >> optimization. There are possibilities to improve case A) above by continuing >> optimization for the 1st Round as it is still in progress, but I think >> adding that complexity for this rare case wouldn't be worthwhile (at least >> for now). What would you think? >> >> >> Best, >> Wei > In my opinion this just makes the patch very messy. > > E.g. attempts to attach a debugger to the guest will call vmstop and > then behaviour changes. This is a recipe for heisenbugs which are then > extremely painful to debug. > > It is not really hard to make things symmetrical: > e.g. if you stop on vmstop then you should start on vmstart, etc. > And stopping thread should not involve a bunch of state > changes, just stop it and that's it. > "stop it" - do you mean to 1) make the thread exit (i.e. make virtio_balloon_poll_free_page_hints exit its while loop and return NULL); or 2) keep the thread in the while loop but have it yield (e.g. sleep(1) or block on a mutex)? (Or please let me know if you meant a different way of stopping the thread.) If we go with 1), then we would not be able to resume the thread. If we go with 2), then there is no guarantee that the thread will continue to run immediately when the vm is woken up (the scheduling delay is not predictable).
If the thread cannot run immediately when the VM is woken up, it will be effectively the same as 1). As for heisenbugs, I think we can recommend that developers (of this feature) use debugging methods that don't stop the VM (e.g. printing). Best, Wei
* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT 2018-03-20 2:16 ` Wei Wang @ 2018-03-20 2:59 ` Michael S. Tsirkin -1 siblings, 0 replies; 42+ messages in thread From: Michael S. Tsirkin @ 2018-03-20 2:59 UTC (permalink / raw) To: Wei Wang Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini, liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel On Tue, Mar 20, 2018 at 10:16:09AM +0800, Wei Wang wrote: > On 03/20/2018 06:55 AM, Michael S. Tsirkin wrote: > > On Mon, Mar 19, 2018 at 05:01:38PM +0800, Wei Wang wrote: > > > On 03/19/2018 12:24 PM, Michael S. Tsirkin wrote: > > > > On Sun, Mar 18, 2018 at 06:36:20PM +0800, Wei Wang wrote: > > > > > On 03/16/2018 11:16 PM, Michael S. Tsirkin wrote: > > > > > > On Fri, Mar 16, 2018 at 06:48:28PM +0800, Wei Wang wrote: > > > > OTOH it seems that if thread stops nothing will wake it up > > > > whem vm is restarted. Such bahaviour change across vmstop/vmstart > > > > is unexpected. > > > > I do not understand why we want to increment the counter > > > > on vm stop though. It does make sense to stop the thread > > > > but why not resume where we left off when vm is resumed? > > > > > > > I'm not sure which counter we incremented. But it would be clear if we have > > > a high level view of how it works (it is symmetric actually). Basically, we > > > start the optimization when each round starts and stop it at the end of each > > > round (i.e. before we do the bitmap sync), as shown below: > > > > > > 1) 1st Round starts --> free_page_start > > > 2) 1st Round in progress.. > > > 3) 1st Round ends --> free_page_stop > > > 4) 2nd Round starts --> free_page_start > > > 5) 2nd Round in progress.. > > > 6) 2nd Round ends --> free_page_stop > > > ...... > > > > > > For example, in 2), the VM is stopped. virtio_balloon_poll_free_page_hints > > > finds the vq is empty (i.e. elem == NULL) and the runstate is stopped, the > > > optimization thread exits immediately. 
That is, this optimization thread is > > > gone forever (the optimization we can do for this round is done). We won't > > > know when would the VM be woken up: > > > A) If the VM is woken up very soon when the migration thread is still in > > > progress of 2), then in 4) a new optimization thread (not the same one for > > > the first round) will be created and start the optimization for the 2nd > > > round as usual (If you have questions about 3) in this case, that > > > free_page_stop will do nothing than just return, since the optimization > > > thread has exited) ; > > > B) If the VM is woken up after the whole migration has ended, there is still > > > no point in resuming the optimization. > > > > > > I think this would be the simple design for the first release of this > > > optimization. There are possibilities to improve case A) above by continuing > > > optimization for the 1st Round as it is still in progress, but I think > > > adding that complexity for this rare case wouldn't be worthwhile (at least > > > for now). What would you think? > > > > > > > > > Best, > > > Wei > > In my opinion this just makes the patch very messy. > > > > E.g. attempts to attach a debugger to the guest will call vmstop and > > then behaviour changes. This is a receipe for heisenbugs which are then > > extremely painful to debug. > > > > It is not really hard to make things symmetrical: > > e.g. if you stop on vmstop then you should start on vmstart, etc. > > And stopping thread should not involve a bunch of state > > changes, just stop it and that's it. > > > > "stop it" - do you mean to > 1) make the thread exit (i.e.make virtio_balloon_poll_free_page_hints exit > the while loop and return NULL); or > 2) keep the thread staying in the while loop but yield running (e.g. > sleep(1) or block on a mutex)? (or please let me know if you suggested a > different implementation about stopping the thread) I would say it makes more sense to make it block on something. 
BTW I still think you are engaging in premature optimization here. What you are doing here is a "data plane for balloon". I would make the feature work first by processing this in a BH. Creating threads immediately opens up questions of isolation, cgroups etc. > If we go with 1), then we would not be able to resume the thread. > If we go with 2), then there is no guarantee to make the thread continue to > run immediately (there will be a scheduling delay which is not predictable) > when the vm is woken up. If the thread cannot run immediately when the > VM is woken up, it will be effectively the same as 1). > > In terms of heisenbugs, I think we can recommend developers (of this > feature) to use methods that don't stop the VM (e.g. print). > > Best, > Wei Sorry, this does not make sense to me. The reality is that for most people migration works just fine. These patches implement a niche optimization. The #1 priority should be to break no existing flows. -- MST
* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT 2018-03-20 2:59 ` Michael S. Tsirkin @ 2018-03-20 3:18 ` Wei Wang -1 siblings, 0 replies; 42+ messages in thread From: Wei Wang @ 2018-03-20 3:18 UTC (permalink / raw) To: Michael S. Tsirkin Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini, liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel On 03/20/2018 10:59 AM, Michael S. Tsirkin wrote: > On Tue, Mar 20, 2018 at 10:16:09AM +0800, Wei Wang wrote: >> On 03/20/2018 06:55 AM, Michael S. Tsirkin wrote: >>> On Mon, Mar 19, 2018 at 05:01:38PM +0800, Wei Wang wrote: >>>> On 03/19/2018 12:24 PM, Michael S. Tsirkin wrote: >>>>> On Sun, Mar 18, 2018 at 06:36:20PM +0800, Wei Wang wrote: >>>>>> On 03/16/2018 11:16 PM, Michael S. Tsirkin wrote: >>>>>>> On Fri, Mar 16, 2018 at 06:48:28PM +0800, Wei Wang wrote: >>>>> OTOH it seems that if thread stops nothing will wake it up >>>>> whem vm is restarted. Such bahaviour change across vmstop/vmstart >>>>> is unexpected. >>>>> I do not understand why we want to increment the counter >>>>> on vm stop though. It does make sense to stop the thread >>>>> but why not resume where we left off when vm is resumed? >>>>> >>>> I'm not sure which counter we incremented. But it would be clear if we have >>>> a high level view of how it works (it is symmetric actually). Basically, we >>>> start the optimization when each round starts and stop it at the end of each >>>> round (i.e. before we do the bitmap sync), as shown below: >>>> >>>> 1) 1st Round starts --> free_page_start >>>> 2) 1st Round in progress.. >>>> 3) 1st Round ends --> free_page_stop >>>> 4) 2nd Round starts --> free_page_start >>>> 5) 2nd Round in progress.. >>>> 6) 2nd Round ends --> free_page_stop >>>> ...... >>>> >>>> For example, in 2), the VM is stopped. virtio_balloon_poll_free_page_hints >>>> finds the vq is empty (i.e. elem == NULL) and the runstate is stopped, the >>>> optimization thread exits immediately. 
That is, this optimization thread is >>>> gone forever (the optimization we can do for this round is done). We won't >>>> know when would the VM be woken up: >>>> A) If the VM is woken up very soon when the migration thread is still in >>>> progress of 2), then in 4) a new optimization thread (not the same one for >>>> the first round) will be created and start the optimization for the 2nd >>>> round as usual (If you have questions about 3) in this case, that >>>> free_page_stop will do nothing than just return, since the optimization >>>> thread has exited) ; >>>> B) If the VM is woken up after the whole migration has ended, there is still >>>> no point in resuming the optimization. >>>> >>>> I think this would be the simple design for the first release of this >>>> optimization. There are possibilities to improve case A) above by continuing >>>> optimization for the 1st Round as it is still in progress, but I think >>>> adding that complexity for this rare case wouldn't be worthwhile (at least >>>> for now). What would you think? >>>> >>>> >>>> Best, >>>> Wei >>> In my opinion this just makes the patch very messy. >>> >>> E.g. attempts to attach a debugger to the guest will call vmstop and >>> then behaviour changes. This is a receipe for heisenbugs which are then >>> extremely painful to debug. >>> >>> It is not really hard to make things symmetrical: >>> e.g. if you stop on vmstop then you should start on vmstart, etc. >>> And stopping thread should not involve a bunch of state >>> changes, just stop it and that's it. >>> >> "stop it" - do you mean to >> 1) make the thread exit (i.e.make virtio_balloon_poll_free_page_hints exit >> the while loop and return NULL); or >> 2) keep the thread staying in the while loop but yield running (e.g. >> sleep(1) or block on a mutex)? (or please let me know if you suggested a >> different implementation about stopping the thread) > I would say it makes more sense to make it block on something. 
> > BTW I still think you are engaging in premature optimization here. > What you are doing here is a "data plane for balloon". > I would make the feature work first by processing this in a BH. > Creating threads immediately opens up questions of isolation, > cgroups etc. > Could you please share more about how creating threads affects isolation and cgroups? And how does a BH solve it? Thanks. Best, Wei
* Re: [virtio-dev] Re: [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT @ 2018-03-20 3:18 ` Wei Wang 0 siblings, 0 replies; 42+ messages in thread From: Wei Wang @ 2018-03-20 3:18 UTC (permalink / raw) To: Michael S. Tsirkin Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini, liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel On 03/20/2018 10:59 AM, Michael S. Tsirkin wrote: > On Tue, Mar 20, 2018 at 10:16:09AM +0800, Wei Wang wrote: >> On 03/20/2018 06:55 AM, Michael S. Tsirkin wrote: >>> On Mon, Mar 19, 2018 at 05:01:38PM +0800, Wei Wang wrote: >>>> On 03/19/2018 12:24 PM, Michael S. Tsirkin wrote: >>>>> On Sun, Mar 18, 2018 at 06:36:20PM +0800, Wei Wang wrote: >>>>>> On 03/16/2018 11:16 PM, Michael S. Tsirkin wrote: >>>>>>> On Fri, Mar 16, 2018 at 06:48:28PM +0800, Wei Wang wrote: >>>>> OTOH it seems that if thread stops nothing will wake it up >>>>> whem vm is restarted. Such bahaviour change across vmstop/vmstart >>>>> is unexpected. >>>>> I do not understand why we want to increment the counter >>>>> on vm stop though. It does make sense to stop the thread >>>>> but why not resume where we left off when vm is resumed? >>>>> >>>> I'm not sure which counter we incremented. But it would be clear if we have >>>> a high level view of how it works (it is symmetric actually). Basically, we >>>> start the optimization when each round starts and stop it at the end of each >>>> round (i.e. before we do the bitmap sync), as shown below: >>>> >>>> 1) 1st Round starts --> free_page_start >>>> 2) 1st Round in progress.. >>>> 3) 1st Round ends --> free_page_stop >>>> 4) 2nd Round starts --> free_page_start >>>> 5) 2nd Round in progress.. >>>> 6) 2nd Round ends --> free_page_stop >>>> ...... >>>> >>>> For example, in 2), the VM is stopped. virtio_balloon_poll_free_page_hints >>>> finds the vq is empty (i.e. elem == NULL) and the runstate is stopped, the >>>> optimization thread exits immediately. 
That is, this optimization thread is >>>> gone forever (the optimization we can do for this round is done). We won't >>>> know when would the VM be woken up: >>>> A) If the VM is woken up very soon when the migration thread is still in >>>> progress of 2), then in 4) a new optimization thread (not the same one for >>>> the first round) will be created and start the optimization for the 2nd >>>> round as usual (If you have questions about 3) in this case, that >>>> free_page_stop will do nothing than just return, since the optimization >>>> thread has exited) ; >>>> B) If the VM is woken up after the whole migration has ended, there is still >>>> no point in resuming the optimization. >>>> >>>> I think this would be the simple design for the first release of this >>>> optimization. There are possibilities to improve case A) above by continuing >>>> optimization for the 1st Round as it is still in progress, but I think >>>> adding that complexity for this rare case wouldn't be worthwhile (at least >>>> for now). What would you think? >>>> >>>> >>>> Best, >>>> Wei >>> In my opinion this just makes the patch very messy. >>> >>> E.g. attempts to attach a debugger to the guest will call vmstop and >>> then behaviour changes. This is a receipe for heisenbugs which are then >>> extremely painful to debug. >>> >>> It is not really hard to make things symmetrical: >>> e.g. if you stop on vmstop then you should start on vmstart, etc. >>> And stopping thread should not involve a bunch of state >>> changes, just stop it and that's it. >>> >> "stop it" - do you mean to >> 1) make the thread exit (i.e.make virtio_balloon_poll_free_page_hints exit >> the while loop and return NULL); or >> 2) keep the thread staying in the while loop but yield running (e.g. >> sleep(1) or block on a mutex)? (or please let me know if you suggested a >> different implementation about stopping the thread) > I would say it makes more sense to make it block on something. 
> > BTW I still think you are engaging in premature optimization here. > What you are doing here is a "data plane for balloon". > I would make the feature work first by processing this in a BH. > Creating threads immediately opens up questions of isolation, > cgroups etc. > Could you please share more about how creating threads affects isolation and cgroups? And how does a BH solve it? Thanks. Best, Wei --------------------------------------------------------------------- To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org ^ permalink raw reply [flat|nested] 42+ messages in thread
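[Editor's note: the two interpretations of "stop it" discussed above can be made concrete with a small sketch. This is an illustrative pthread model of option 2 — parking the hint-polling thread on a condition variable so vmstop/vmstart stay symmetric — not the actual QEMU implementation; all names and the round-completion placeholder are invented:]

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative state shared between the migration path and the
 * hint-polling thread; not the real virtio-balloon fields. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool hinting_enabled;
static bool shutting_down;

/* free_page_stop(): park the thread instead of letting it exit. */
static void free_page_stop(void)
{
    pthread_mutex_lock(&lock);
    hinting_enabled = false;
    pthread_mutex_unlock(&lock);
}

/* free_page_start(): wake the parked thread for the next round. */
static void free_page_start(void)
{
    pthread_mutex_lock(&lock);
    hinting_enabled = true;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);
}

/* Option 2: the thread blocks between rounds rather than returning, so
 * restarting a round is just a signal, with no thread churn. */
static void *poll_free_page_hints(void *opaque)
{
    (void)opaque;
    pthread_mutex_lock(&lock);
    while (!shutting_down) {
        if (!hinting_enabled) {
            pthread_cond_wait(&cond, &lock);
            continue;
        }
        pthread_mutex_unlock(&lock);
        /* ... drain the vq and clear dirty-bitmap bits here ... */
        pthread_mutex_lock(&lock);
        hinting_enabled = false;   /* placeholder: round finished */
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}
```

[With this shape, free_page_start/free_page_stop are symmetric across vmstop/vmstart, which addresses the heisenbug concern raised above.]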
* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT 2018-03-20 3:18 ` Wei Wang @ 2018-03-20 3:24 ` Michael S. Tsirkin -1 siblings, 0 replies; 42+ messages in thread From: Michael S. Tsirkin @ 2018-03-20 3:24 UTC (permalink / raw) To: Wei Wang Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini, liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel On Tue, Mar 20, 2018 at 11:18:23AM +0800, Wei Wang wrote: > On 03/20/2018 10:59 AM, Michael S. Tsirkin wrote: > > On Tue, Mar 20, 2018 at 10:16:09AM +0800, Wei Wang wrote: > > > On 03/20/2018 06:55 AM, Michael S. Tsirkin wrote: > > > > On Mon, Mar 19, 2018 at 05:01:38PM +0800, Wei Wang wrote: > > > > > On 03/19/2018 12:24 PM, Michael S. Tsirkin wrote: > > > > > > On Sun, Mar 18, 2018 at 06:36:20PM +0800, Wei Wang wrote: > > > > > > > On 03/16/2018 11:16 PM, Michael S. Tsirkin wrote: > > > > > > > > On Fri, Mar 16, 2018 at 06:48:28PM +0800, Wei Wang wrote: > > > > > > OTOH it seems that if thread stops nothing will wake it up > > > > > > whem vm is restarted. Such bahaviour change across vmstop/vmstart > > > > > > is unexpected. > > > > > > I do not understand why we want to increment the counter > > > > > > on vm stop though. It does make sense to stop the thread > > > > > > but why not resume where we left off when vm is resumed? > > > > > > > > > > > I'm not sure which counter we incremented. But it would be clear if we have > > > > > a high level view of how it works (it is symmetric actually). Basically, we > > > > > start the optimization when each round starts and stop it at the end of each > > > > > round (i.e. before we do the bitmap sync), as shown below: > > > > > > > > > > 1) 1st Round starts --> free_page_start > > > > > 2) 1st Round in progress.. > > > > > 3) 1st Round ends --> free_page_stop > > > > > 4) 2nd Round starts --> free_page_start > > > > > 5) 2nd Round in progress.. > > > > > 6) 2nd Round ends --> free_page_stop > > > > > ...... 
> > > > > > > > > > For example, in 2), the VM is stopped. virtio_balloon_poll_free_page_hints > > > > > finds the vq is empty (i.e. elem == NULL) and the runstate is stopped, the > > > > > optimization thread exits immediately. That is, this optimization thread is > > > > > gone forever (the optimization we can do for this round is done). We won't > > > > > know when would the VM be woken up: > > > > > A) If the VM is woken up very soon when the migration thread is still in > > > > > progress of 2), then in 4) a new optimization thread (not the same one for > > > > > the first round) will be created and start the optimization for the 2nd > > > > > round as usual (If you have questions about 3) in this case, that > > > > > free_page_stop will do nothing than just return, since the optimization > > > > > thread has exited) ; > > > > > B) If the VM is woken up after the whole migration has ended, there is still > > > > > no point in resuming the optimization. > > > > > > > > > > I think this would be the simple design for the first release of this > > > > > optimization. There are possibilities to improve case A) above by continuing > > > > > optimization for the 1st Round as it is still in progress, but I think > > > > > adding that complexity for this rare case wouldn't be worthwhile (at least > > > > > for now). What would you think? > > > > > > > > > > > > > > > Best, > > > > > Wei > > > > In my opinion this just makes the patch very messy. > > > > > > > > E.g. attempts to attach a debugger to the guest will call vmstop and > > > > then behaviour changes. This is a receipe for heisenbugs which are then > > > > extremely painful to debug. > > > > > > > > It is not really hard to make things symmetrical: > > > > e.g. if you stop on vmstop then you should start on vmstart, etc. > > > > And stopping thread should not involve a bunch of state > > > > changes, just stop it and that's it. 
> > > > > > > "stop it" - do you mean to > > > 1) make the thread exit (i.e.make virtio_balloon_poll_free_page_hints exit > > > the while loop and return NULL); or > > > 2) keep the thread staying in the while loop but yield running (e.g. > > > sleep(1) or block on a mutex)? (or please let me know if you suggested a > > > different implementation about stopping the thread) > > I would say it makes more sense to make it block on something. > > > > BTW I still think you are engaging in premature optimization here. > > What you are doing here is a "data plane for balloon". > > I would make the feature work first by processing this in a BH. > > Creating threads immediately opens up questions of isolation, > > cgroups etc. > > > > Could you please share more about how creating threads affects isolation and > cgroup? When threads are coming and going, they are hard to control. Consider the rich API libvirt exposes for controlling the io threads: https://libvirt.org/formatdomain.html#elementsIOThreadsAllocation > and how does BH solve it? Thanks. > Best, > Wei It's late at night so I don't remember whether it's the emulator or the io thread that runs the BH, but the point is it's already controlled by libvirt. -- MST ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT 2018-03-20 3:24 ` Michael S. Tsirkin @ 2018-03-22 3:13 ` Wei Wang -1 siblings, 0 replies; 42+ messages in thread From: Wei Wang @ 2018-03-22 3:13 UTC (permalink / raw) To: Michael S. Tsirkin Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini, liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel On 03/20/2018 11:24 AM, Michael S. Tsirkin wrote: > On Tue, Mar 20, 2018 at 11:18:23AM +0800, Wei Wang wrote: >> On 03/20/2018 10:59 AM, Michael S. Tsirkin wrote: >>> On Tue, Mar 20, 2018 at 10:16:09AM +0800, Wei Wang wrote: >>>> On 03/20/2018 06:55 AM, Michael S. Tsirkin wrote: >>>>> On Mon, Mar 19, 2018 at 05:01:38PM +0800, Wei Wang wrote: >>>>>> On 03/19/2018 12:24 PM, Michael S. Tsirkin wrote: >>>>>>> On Sun, Mar 18, 2018 at 06:36:20PM +0800, Wei Wang wrote: >>>>>>>> On 03/16/2018 11:16 PM, Michael S. Tsirkin wrote: >>>>>>>>> On Fri, Mar 16, 2018 at 06:48:28PM +0800, Wei Wang wrote: >>>>>>> OTOH it seems that if thread stops nothing will wake it up >>>>>>> whem vm is restarted. Such bahaviour change across vmstop/vmstart >>>>>>> is unexpected. >>>>>>> I do not understand why we want to increment the counter >>>>>>> on vm stop though. It does make sense to stop the thread >>>>>>> but why not resume where we left off when vm is resumed? >>>>>>> >>>>>> I'm not sure which counter we incremented. But it would be clear if we have >>>>>> a high level view of how it works (it is symmetric actually). Basically, we >>>>>> start the optimization when each round starts and stop it at the end of each >>>>>> round (i.e. before we do the bitmap sync), as shown below: >>>>>> >>>>>> 1) 1st Round starts --> free_page_start >>>>>> 2) 1st Round in progress.. >>>>>> 3) 1st Round ends --> free_page_stop >>>>>> 4) 2nd Round starts --> free_page_start >>>>>> 5) 2nd Round in progress.. >>>>>> 6) 2nd Round ends --> free_page_stop >>>>>> ...... 
>>>>>> >>>>>> For example, in 2), the VM is stopped. virtio_balloon_poll_free_page_hints >>>>>> finds the vq is empty (i.e. elem == NULL) and the runstate is stopped, the >>>>>> optimization thread exits immediately. That is, this optimization thread is >>>>>> gone forever (the optimization we can do for this round is done). We won't >>>>>> know when would the VM be woken up: >>>>>> A) If the VM is woken up very soon when the migration thread is still in >>>>>> progress of 2), then in 4) a new optimization thread (not the same one for >>>>>> the first round) will be created and start the optimization for the 2nd >>>>>> round as usual (If you have questions about 3) in this case, that >>>>>> free_page_stop will do nothing than just return, since the optimization >>>>>> thread has exited) ; >>>>>> B) If the VM is woken up after the whole migration has ended, there is still >>>>>> no point in resuming the optimization. >>>>>> >>>>>> I think this would be the simple design for the first release of this >>>>>> optimization. There are possibilities to improve case A) above by continuing >>>>>> optimization for the 1st Round as it is still in progress, but I think >>>>>> adding that complexity for this rare case wouldn't be worthwhile (at least >>>>>> for now). What would you think? >>>>>> >>>>>> >>>>>> Best, >>>>>> Wei >>>>> In my opinion this just makes the patch very messy. >>>>> >>>>> E.g. attempts to attach a debugger to the guest will call vmstop and >>>>> then behaviour changes. This is a receipe for heisenbugs which are then >>>>> extremely painful to debug. >>>>> >>>>> It is not really hard to make things symmetrical: >>>>> e.g. if you stop on vmstop then you should start on vmstart, etc. >>>>> And stopping thread should not involve a bunch of state >>>>> changes, just stop it and that's it. 
>>>>> >>>> "stop it" - do you mean to >>>> 1) make the thread exit (i.e.make virtio_balloon_poll_free_page_hints exit >>>> the while loop and return NULL); or >>>> 2) keep the thread staying in the while loop but yield running (e.g. >>>> sleep(1) or block on a mutex)? (or please let me know if you suggested a >>>> different implementation about stopping the thread) >>> I would say it makes more sense to make it block on something. >>> >>> BTW I still think you are engaging in premature optimization here. >>> What you are doing here is a "data plane for balloon". >>> I would make the feature work first by processing this in a BH. >>> Creating threads immediately opens up questions of isolation, >>> cgroups etc. >>> >> Could you please share more about how creating threads affects isolation and >> cgroup? > When threads are coming and going, they are hard to control. > > Consider the rich API libvirt exposes for controlling the io threads: > > https://libvirt.org/formatdomain.html#elementsIOThreadsAllocation > > >> and how does BH solve it? Thanks. >> Best, >> Wei > It's late at night so I don't remember whether it's the emulator > or the io thread that runs the BH, but the point is it's > already controlled by libvirt. > OK. I've tried to implement it this way: create an iothread via the qemu cmdline option: --device virtio-balloon,free-page-hint=true,iothread=iothread10, and schedule a BH to run in the iothread context when free_page_start() is called. I think the drawback of this method is that the iothread exists during the whole QEMU lifetime, which wastes CPU cycles (e.g. when live migration isn't in progress). The method in v5 is like migration thread, which comes only when the migration request is received and goes when migration is done. Any thought? Best, Wei ^ permalink raw reply [flat|nested] 42+ messages in thread
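[Editor's note: the BH-in-iothread experiment Wei describes can be illustrated with a self-contained pthread analogue of the pattern — a long-lived service thread that sleeps until work is handed to it. QEMU itself would schedule a BH on the iothread's AioContext (e.g. via aio_bh_schedule_oneshot()); the names below are invented for the sketch:]

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*bh_func)(void *opaque);

/* Toy stand-in for an iothread that can run a one-shot bottom half. */
struct bh_ctx {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bh_func pending;   /* NULL when no BH is scheduled */
    void *opaque;
    bool stop;
};

/* The long-lived service thread: sleeps until a BH is scheduled, runs it
 * outside the lock, and drains any pending BH before exiting. */
static void *iothread_fn(void *arg)
{
    struct bh_ctx *ctx = arg;
    pthread_mutex_lock(&ctx->lock);
    while (!ctx->stop || ctx->pending) {
        if (ctx->pending) {
            bh_func f = ctx->pending;
            void *opaque = ctx->opaque;
            ctx->pending = NULL;
            pthread_mutex_unlock(&ctx->lock);
            f(opaque);               /* run the bottom half */
            pthread_mutex_lock(&ctx->lock);
        } else {
            pthread_cond_wait(&ctx->cond, &ctx->lock);
        }
    }
    pthread_mutex_unlock(&ctx->lock);
    return NULL;
}

/* Analogue of scheduling a one-shot BH: hand work to the iothread. */
static void bh_schedule_oneshot(struct bh_ctx *ctx, bh_func f, void *opaque)
{
    pthread_mutex_lock(&ctx->lock);
    ctx->pending = f;
    ctx->opaque = opaque;
    pthread_cond_signal(&ctx->cond);
    pthread_mutex_unlock(&ctx->lock);
}

/* Example BH body: would poll free page hints in the balloon case. */
static void sample_bh(void *opaque)
{
    *(int *)opaque = 1;
}
```

[The trade-off Wei raises is visible here: the service thread exists for the context's whole lifetime, whereas the v5 approach creates and destroys a thread per migration.]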
* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT 2018-03-22 3:13 ` Wei Wang @ 2018-03-26 10:58 ` Wei Wang -1 siblings, 0 replies; 42+ messages in thread From: Wei Wang @ 2018-03-26 10:58 UTC (permalink / raw) To: Michael S. Tsirkin Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini, liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel On 03/22/2018 11:13 AM, Wei Wang wrote: > > OK. I've tried to implement it this way: create an iothread via the > qemu cmdline option: --device > virtio-balloon,free-page-hint=true,iothread=iothread10, and schedule a > BH to run in the iothread context when free_page_start() is called. > > I think the drawback of this method is that the iothread exists during > the whole QEMU lifetime, which wastes CPU cycles (e.g. when live > migration isn't in progress). The method in v5 is like migration > thread, which comes only when the migration request is received and > goes when migration is done. Any thought? > Hi Michael, Would it be acceptable to go with the thread creation method for now? I would prefer that method for the above reasons. If you are very confident about the iothread method, I can also send out the patches with that implementation. Hope we could finish this feature soon. Thanks. Best, Wei ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT 2018-03-20 3:24 ` Michael S. Tsirkin (?) (?) @ 2018-03-26 11:09 ` Daniel P. Berrangé 2018-03-26 14:54 ` [virtio-dev] " Wang, Wei W -1 siblings, 1 reply; 42+ messages in thread From: Daniel P. Berrangé @ 2018-03-26 11:09 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Wei Wang, yang.zhang.wz, virtio-dev, quan.xu0, quintela, liliang.opensource, qemu-devel, dgilbert, pbonzini, nilal On Tue, Mar 20, 2018 at 05:24:39AM +0200, Michael S. Tsirkin wrote: > On Tue, Mar 20, 2018 at 11:18:23AM +0800, Wei Wang wrote: > > On 03/20/2018 10:59 AM, Michael S. Tsirkin wrote: > > > On Tue, Mar 20, 2018 at 10:16:09AM +0800, Wei Wang wrote: > > > > On 03/20/2018 06:55 AM, Michael S. Tsirkin wrote: > > > > > On Mon, Mar 19, 2018 at 05:01:38PM +0800, Wei Wang wrote: > > > > > > On 03/19/2018 12:24 PM, Michael S. Tsirkin wrote: > > > > > > > On Sun, Mar 18, 2018 at 06:36:20PM +0800, Wei Wang wrote: > > > > > > > > On 03/16/2018 11:16 PM, Michael S. Tsirkin wrote: > > > > > > > > > On Fri, Mar 16, 2018 at 06:48:28PM +0800, Wei Wang wrote: > > > > > > > OTOH it seems that if thread stops nothing will wake it up > > > > > > > whem vm is restarted. Such bahaviour change across vmstop/vmstart > > > > > > > is unexpected. > > > > > > > I do not understand why we want to increment the counter > > > > > > > on vm stop though. It does make sense to stop the thread > > > > > > > but why not resume where we left off when vm is resumed? > > > > > > > > > > > > > I'm not sure which counter we incremented. But it would be clear if we have > > > > > > a high level view of how it works (it is symmetric actually). Basically, we > > > > > > start the optimization when each round starts and stop it at the end of each > > > > > > round (i.e. before we do the bitmap sync), as shown below: > > > > > > > > > > > > 1) 1st Round starts --> free_page_start > > > > > > 2) 1st Round in progress.. 
> > > > > > 3) 1st Round ends --> free_page_stop > > > > > > 4) 2nd Round starts --> free_page_start > > > > > > 5) 2nd Round in progress.. > > > > > > 6) 2nd Round ends --> free_page_stop > > > > > > ...... > > > > > > > > > > > > For example, in 2), the VM is stopped. virtio_balloon_poll_free_page_hints > > > > > > finds the vq is empty (i.e. elem == NULL) and the runstate is stopped, the > > > > > > optimization thread exits immediately. That is, this optimization thread is > > > > > > gone forever (the optimization we can do for this round is done). We won't > > > > > > know when would the VM be woken up: > > > > > > A) If the VM is woken up very soon when the migration thread is still in > > > > > > progress of 2), then in 4) a new optimization thread (not the same one for > > > > > > the first round) will be created and start the optimization for the 2nd > > > > > > round as usual (If you have questions about 3) in this case, that > > > > > > free_page_stop will do nothing than just return, since the optimization > > > > > > thread has exited) ; > > > > > > B) If the VM is woken up after the whole migration has ended, there is still > > > > > > no point in resuming the optimization. > > > > > > > > > > > > I think this would be the simple design for the first release of this > > > > > > optimization. There are possibilities to improve case A) above by continuing > > > > > > optimization for the 1st Round as it is still in progress, but I think > > > > > > adding that complexity for this rare case wouldn't be worthwhile (at least > > > > > > for now). What would you think? > > > > > > > > > > > > > > > > > > Best, > > > > > > Wei > > > > > In my opinion this just makes the patch very messy. > > > > > > > > > > E.g. attempts to attach a debugger to the guest will call vmstop and > > > > > then behaviour changes. This is a receipe for heisenbugs which are then > > > > > extremely painful to debug. 
> > > > > > > > > > It is not really hard to make things symmetrical: > > > > > e.g. if you stop on vmstop then you should start on vmstart, etc. > > > > > And stopping thread should not involve a bunch of state > > > > > changes, just stop it and that's it. > > > > > > > > > "stop it" - do you mean to > > > > 1) make the thread exit (i.e.make virtio_balloon_poll_free_page_hints exit > > > > the while loop and return NULL); or > > > > 2) keep the thread staying in the while loop but yield running (e.g. > > > > sleep(1) or block on a mutex)? (or please let me know if you suggested a > > > > different implementation about stopping the thread) > > > I would say it makes more sense to make it block on something. > > > > > > BTW I still think you are engaging in premature optimization here. > > > What you are doing here is a "data plane for balloon". > > > I would make the feature work first by processing this in a BH. > > > Creating threads immediately opens up questions of isolation, > > > cgroups etc. > > > > > > > Could you please share more about how creating threads affects isolation and > > cgroup? > > When threads are coming and going, they are hard to control. > > Consider the rich API libvirt exposes for controlling the io threads: > > https://libvirt.org/formatdomain.html#elementsIOThreadsAllocation > > > > and how does BH solve it? Thanks. > > Best, > > Wei > > It's late at night so I don't remember whether it's the emulator > or the io thread that runs the BH, but the point is it's > already controlled by libvirt. As far as libvirt is concerned there are three sets of threads it provides control over - vCPUs - each VCPU in KVM has a thread. Libvirt provides per-thread tunable control - IOThreads - each named I/O thread can be associated with one or more devices. Libvirt provides per-thread tunable control. - Emulator - any other QEMU thread which isn't an vCPU thread or IO thread gets called an emulator thread by libvirt. 
There is no per-thread tunable control - we can set tunables for the entire set of emulator threads at once. So, if this balloon driver thread needs to support tuning controls separately from other general purpose QEMU threads, then it would ideally use iothread infrastructure. I don't particularly understand what this code is doing, but please consider whether NUMA has any impact on the work done in this thread. Specifically when the guest has multiple virtual NUMA nodes, each associated with a specific host NUMA node. If there is any memory-intensive work being done here, then it might need to be executed on the correct host NUMA node according to the memory region being touched. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 42+ messages in thread
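[Editor's note: for concreteness, the per-iothread tunables Daniel describes look roughly like this in libvirt domain XML. This is a sketch following the formatdomain documentation linked above; the CPU numbers and disk path are illustrative only:]

```xml
<domain type='kvm'>
  <iothreads>2</iothreads>
  <cputune>
    <!-- Pin iothread 1 to host CPU 4 and iothread 2 to host CPU 5 -->
    <iothreadpin iothread='1' cpuset='4'/>
    <iothreadpin iothread='2' cpuset='5'/>
  </cputune>
  <devices>
    <!-- A device served by a named iothread -->
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' iothread='1'/>
      <source file='/var/lib/libvirt/images/guest.img'/>
      <target dev='vda' bus='virtio'/>
    </disk>
  </devices>
</domain>
```

[Threads QEMU creates on its own, like the v5 optimization thread, fall into the "emulator" bucket and get no such per-thread control.]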
* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
  2018-03-26 11:09   ` [Qemu-devel] " Daniel P. Berrangé
@ 2018-03-26 14:54     ` Wang, Wei W
  0 siblings, 0 replies; 42+ messages in thread
From: Wang, Wei W @ 2018-03-26 14:54 UTC (permalink / raw)
To: Daniel P. Berrangé, Michael S. Tsirkin
Cc: yang.zhang.wz, virtio-dev, quan.xu0, quintela, liliang.opensource,
	qemu-devel, dgilbert, pbonzini, nilal

On Monday, March 26, 2018 7:09 PM, Daniel P. Berrangé wrote:
>
> As far as libvirt is concerned there are three sets of threads it provides
> control over
>
> - vCPUs - each vCPU in KVM has a thread. Libvirt provides per-thread
>   tunable control
>
> - IOThreads - each named I/O thread can be associated with one or more
>   devices. Libvirt provides per-thread tunable control.
>
> - Emulator - any other QEMU thread which isn't a vCPU thread or IO thread
>   gets called an emulator thread by libvirt. There is no per-thread
>   tunable control - we can set tunables for the entire set of emulator
>   threads at once.
>

Hi Daniel,

Thanks for sharing the details, they are very helpful. I still have a question:

There is no fundamental difference between an iothread and our optimization
thread (it is similar to the migration thread, which is created when migration
begins and terminated when migration is done) - both of them are pthreads and
each has a name. Could we also add similar per-thread tunable control in
libvirt for such threads?

For example, in QEMU we could add a new migration qmp command,
migrate_enable_free_page_optimization (just like other commands such as
migrate_set_speed 10G); this command would create the optimization thread.
In this way, creation of the thread is under libvirt's control, and libvirt
can then support tuning the thread (e.g. pin it to any pCPU), right?

> So, if this balloon driver thread needs to support tuning controls separately
> from other general purpose QEMU threads, then it would ideally use
> iothread infrastructure.
>
> I don't particularly understand what this code is doing, but please consider
> whether NUMA has any impact on the work done in this thread. Specifically
> when the guest has multiple virtual NUMA nodes, each associated with a
> specific host NUMA node. If there is any memory intensive work being done
> here, then it might need to be executed on the correct host NUMA node
> according to the memory region being touched.
>

I think it would not be significantly impacted by NUMA, because this
optimization thread doesn't access the guest memory much except for the
virtqueue (even with an iothread, we may still not know which pCPU to pin to
match the virtqueue in the vNUMA case). Essentially, it gets the free page
address and length, then clears bits from the migration dirty bitmap, which
is allocated by QEMU itself.

So, I think adding the tunable support would be nicer, but I'm not sure
whether it is required.

Best,
Wei

^ permalink raw reply	[flat|nested] 42+ messages in thread
* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
  2018-03-26 14:54     ` [virtio-dev] " Wang, Wei W
  (?)
@ 2018-03-26 15:04       ` Daniel P. Berrangé
  2018-03-26 15:35         ` [virtio-dev] " Wang, Wei W
  -1 siblings, 1 reply; 42+ messages in thread
From: Daniel P. Berrangé @ 2018-03-26 15:04 UTC (permalink / raw)
To: Wang, Wei W
Cc: Michael S. Tsirkin, yang.zhang.wz, virtio-dev, quan.xu0, quintela,
	liliang.opensource, qemu-devel, dgilbert, pbonzini, nilal

On Mon, Mar 26, 2018 at 02:54:45PM +0000, Wang, Wei W wrote:
> On Monday, March 26, 2018 7:09 PM, Daniel P. Berrangé wrote:
> >
> > As far as libvirt is concerned there are three sets of threads it provides
> > control over
> >
> > - vCPUs - each vCPU in KVM has a thread. Libvirt provides per-thread
> >   tunable control
> >
> > - IOThreads - each named I/O thread can be associated with one or more
> >   devices. Libvirt provides per-thread tunable control.
> >
> > - Emulator - any other QEMU thread which isn't a vCPU thread or IO thread
> >   gets called an emulator thread by libvirt. There is no per-thread
> >   tunable control - we can set tunables for the entire set of emulator
> >   threads at once.
> >
>
> Hi Daniel,
> Thanks for sharing the details, they are very helpful. I still have a question:
>
> There is no fundamental difference between iothread and our optimization
> thread (it is similar to the migration thread, which is created when
> migration begins and terminated when migration is done) - both of them are
> pthreads and each has a name. Could we also add the similar per-thread
> tunable control in libvirt for such threads?
>
> For example, in QEMU we can add a new migration qmp command,
> migrate_enable_free_page_optimization (just like other commands
> migrate_set_speed 10G), this command will create the optimization thread.
> In this way, creation of the thread is in libvirt's control, and libvirt
> can then support tuning the thread (e.g. pin it to any pCPU), right?

Anything is possible from a technical level, but from a design POV we would
rather not start exposing tunables for arbitrary device-type-specific
threads, as they are inherently non-portable and may not even exist in QEMU
long term, since they are just an artifact of the current QEMU
implementation.

Regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 42+ messages in thread
* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
  2018-03-26 15:04       ` Daniel P. Berrangé
@ 2018-03-26 15:35         ` Wang, Wei W
  0 siblings, 0 replies; 42+ messages in thread
From: Wang, Wei W @ 2018-03-26 15:35 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: Michael S. Tsirkin, yang.zhang.wz, virtio-dev, quan.xu0, quintela,
	liliang.opensource, qemu-devel, dgilbert, pbonzini, nilal

On Monday, March 26, 2018 11:04 PM, Daniel P. Berrangé wrote:
> On Mon, Mar 26, 2018 at 02:54:45PM +0000, Wang, Wei W wrote:
> > On Monday, March 26, 2018 7:09 PM, Daniel P. Berrangé wrote:
> > >
> > > As far as libvirt is concerned there are three sets of threads it
> > > provides control over
> > >
> > > - vCPUs - each vCPU in KVM has a thread. Libvirt provides per-thread
> > >   tunable control
> > >
> > > - IOThreads - each named I/O thread can be associated with one or more
> > >   devices. Libvirt provides per-thread tunable control.
> > >
> > > - Emulator - any other QEMU thread which isn't a vCPU thread or IO thread
> > >   gets called an emulator thread by libvirt. There is no per-thread
> > >   tunable control - we can set tunables for the entire set of emulator
> > >   threads at once.
> > >
> >
> > Hi Daniel,
> > Thanks for sharing the details, they are very helpful. I still have a question:
> >
> > There is no fundamental difference between iothread and our
> > optimization thread (it is similar to the migration thread, which is
> > created when migration begins and terminated when migration is done) -
> > both of them are pthreads and each has a name. Could we also add the
> > similar per-thread tunable control in libvirt for such threads?
> >
> > For example, in QEMU we can add a new migration qmp command,
> > migrate_enable_free_page_optimization (just like other commands
> > migrate_set_speed 10G), this command will create the optimization thread.
> > In this way, creation of the thread is in libvirt's control, and
> > libvirt can then support tuning the thread (e.g. pin it to any pCPU), right?
>
> Anything is possible from a technical level, but from a design POV we would
> rather not start exposing tunables for arbitrary device type specific threads
> as they are inherently non-portable and may not even exist in QEMU long
> term as it is just an artifact of the current QEMU implementation.
>

OK. My actual concern with the iothread is that it exists for the whole QEMU
lifecycle. A block device using it is fine, since block device accesses
happen frequently during the QEMU lifecycle. For usages like live migration,
which may happen only once per day, keeping this iothread running would be a
waste of CPU cycles.

Best,
Wei

^ permalink raw reply	[flat|nested] 42+ messages in thread
* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
  2018-03-20  3:24             ` Michael S. Tsirkin
@ 2018-03-30  9:52               ` Wei Wang
  -1 siblings, 0 replies; 42+ messages in thread
From: Wei Wang @ 2018-03-30 9:52 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini,
	liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel

On 03/20/2018 11:24 AM, Michael S. Tsirkin wrote:
> On Tue, Mar 20, 2018 at 11:18:23AM +0800, Wei Wang wrote:
>> On 03/20/2018 10:59 AM, Michael S. Tsirkin wrote:
>>> On Tue, Mar 20, 2018 at 10:16:09AM +0800, Wei Wang wrote:
>>>> On 03/20/2018 06:55 AM, Michael S. Tsirkin wrote:
>>>>> On Mon, Mar 19, 2018 at 05:01:38PM +0800, Wei Wang wrote:
>>>>>> On 03/19/2018 12:24 PM, Michael S. Tsirkin wrote:
>>>>>>> On Sun, Mar 18, 2018 at 06:36:20PM +0800, Wei Wang wrote:
>>>>>>>> On 03/16/2018 11:16 PM, Michael S. Tsirkin wrote:
>>>>>>>>> On Fri, Mar 16, 2018 at 06:48:28PM +0800, Wei Wang wrote:
>>>>>>> OTOH it seems that if thread stops nothing will wake it up
>>>>>>> when vm is restarted. Such behaviour change across vmstop/vmstart
>>>>>>> is unexpected.
>>>>>>> I do not understand why we want to increment the counter
>>>>>>> on vm stop though. It does make sense to stop the thread
>>>>>>> but why not resume where we left off when vm is resumed?
>>>>>>>
>>>>>> I'm not sure which counter we incremented. But it would be clear if we have
>>>>>> a high level view of how it works (it is symmetric actually). Basically, we
>>>>>> start the optimization when each round starts and stop it at the end of each
>>>>>> round (i.e. before we do the bitmap sync), as shown below:
>>>>>>
>>>>>> 1) 1st Round starts --> free_page_start
>>>>>> 2) 1st Round in progress..
>>>>>> 3) 1st Round ends --> free_page_stop
>>>>>> 4) 2nd Round starts --> free_page_start
>>>>>> 5) 2nd Round in progress..
>>>>>> 6) 2nd Round ends --> free_page_stop
>>>>>> ......
>>>>>>
>>>>>> For example, in 2), the VM is stopped. virtio_balloon_poll_free_page_hints
>>>>>> finds the vq is empty (i.e. elem == NULL) and the runstate is stopped, the
>>>>>> optimization thread exits immediately. That is, this optimization thread is
>>>>>> gone forever (the optimization we can do for this round is done). We won't
>>>>>> know when would the VM be woken up:
>>>>>> A) If the VM is woken up very soon when the migration thread is still in
>>>>>> progress of 2), then in 4) a new optimization thread (not the same one for
>>>>>> the first round) will be created and start the optimization for the 2nd
>>>>>> round as usual (If you have questions about 3) in this case, that
>>>>>> free_page_stop will do nothing other than return, since the optimization
>>>>>> thread has exited);
>>>>>> B) If the VM is woken up after the whole migration has ended, there is still
>>>>>> no point in resuming the optimization.
>>>>>>
>>>>>> I think this would be the simple design for the first release of this
>>>>>> optimization. There are possibilities to improve case A) above by continuing
>>>>>> optimization for the 1st Round as it is still in progress, but I think
>>>>>> adding that complexity for this rare case wouldn't be worthwhile (at least
>>>>>> for now). What would you think?
>>>>>>
>>>>>>
>>>>>> Best,
>>>>>> Wei
>>>>> In my opinion this just makes the patch very messy.
>>>>>
>>>>> E.g. attempts to attach a debugger to the guest will call vmstop and
>>>>> then behaviour changes. This is a recipe for heisenbugs which are then
>>>>> extremely painful to debug.
>>>>>
>>>>> It is not really hard to make things symmetrical:
>>>>> e.g. if you stop on vmstop then you should start on vmstart, etc.
>>>>> And stopping thread should not involve a bunch of state
>>>>> changes, just stop it and that's it.
>>>>>
>>>> "stop it" - do you mean to
>>>> 1) make the thread exit (i.e. make virtio_balloon_poll_free_page_hints exit
>>>> the while loop and return NULL); or
>>>> 2) keep the thread staying in the while loop but yield running (e.g.
>>>> sleep(1) or block on a mutex)? (or please let me know if you suggested a
>>>> different implementation about stopping the thread)
>>> I would say it makes more sense to make it block on something.
>>>
>>> BTW I still think you are engaging in premature optimization here.
>>> What you are doing here is a "data plane for balloon".
>>> I would make the feature work first by processing this in a BH.
>>> Creating threads immediately opens up questions of isolation,
>>> cgroups etc.
>>>
>> Could you please share more about how creating threads affects isolation and
>> cgroup?
> When threads are coming and going, they are hard to control.
>
> Consider the rich API libvirt exposes for controlling the io threads:
>
> https://libvirt.org/formatdomain.html#elementsIOThreadsAllocation
>
>
>> and how does BH solve it? Thanks.
>> Best,
>> Wei
> It's late at night so I don't remember whether it's the emulator
> or the io thread that runs the BH, but the point is it's
> already controlled by libvirt.
>

Thanks all for reviewing the patches. I've changed to use an iothread in the
new version. Please check whether that is good enough.

Best,
Wei

^ permalink raw reply	[flat|nested] 42+ messages in thread
* [Qemu-devel] [PATCH v5 5/5] migration: use the free page hint feature from balloon
  2018-03-16 10:48 ` [virtio-dev] " Wei Wang
@ 2018-03-16 10:48   ` Wei Wang
  -1 siblings, 0 replies; 42+ messages in thread
From: Wei Wang @ 2018-03-16 10:48 UTC (permalink / raw)
To: qemu-devel, virtio-dev, mst, quintela, dgilbert
Cc: pbonzini, wei.w.wang, liliang.opensource, yang.zhang.wz, quan.xu0,
	nilal, riel

Start the free page optimization after the migration bitmap is
synchronized. This can't be used in the stop&copy phase since the guest
is paused. Make sure the guest reporting has stopped before
synchronizing the migration dirty bitmap. Currently, the optimization is
added to precopy only.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
---
 migration/ram.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/migration/ram.c b/migration/ram.c
index 2e82181..8589a51 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -51,6 +51,7 @@
 #include "qemu/rcu_queue.h"
 #include "migration/colo.h"
 #include "migration/block.h"
+#include "sysemu/balloon.h"

 /***********************************************************/
 /* ram save/restore */
@@ -208,6 +209,8 @@ struct RAMState {
     uint32_t last_version;
     /* We are in the first round */
     bool ram_bulk_stage;
+    /* The free pages optimization feature is supported */
+    bool free_page_support;
     /* How many times we have dirty too many pages */
     int dirty_rate_high_cnt;
     /* these variables are used for bitmap sync */
@@ -836,6 +839,10 @@ static void migration_bitmap_sync(RAMState *rs)
     int64_t end_time;
     uint64_t bytes_xfer_now;

+    if (rs->free_page_support) {
+        balloon_free_page_stop();
+    }
+
     ram_counters.dirty_sync_count++;

     if (!rs->time_last_bitmap_sync) {
@@ -902,6 +909,10 @@ static void migration_bitmap_sync(RAMState *rs)
     if (migrate_use_events()) {
         qapi_event_send_migration_pass(ram_counters.dirty_sync_count, NULL);
     }
+
+    if (rs->free_page_support) {
+        balloon_free_page_start();
+    }
 }

 /**
@@ -1658,7 +1669,17 @@ static void ram_state_reset(RAMState *rs)
     rs->last_sent_block = NULL;
     rs->last_page = 0;
     rs->last_version = ram_list.version;
-    rs->ram_bulk_stage = true;
+    rs->free_page_support = balloon_free_page_support() &&
+                            !migrate_postcopy();
+    if (rs->free_page_support) {
+        /*
+         * When the free page optimization is used, not all the pages are
+         * treated as dirty pages (via migration_bitmap_find_dirty) that need
+         * to be sent. So disable ram_bulk_stage in this case.
+         */
+        rs->ram_bulk_stage = false;
+    } else {
+        rs->ram_bulk_stage = true;
+    }
 }

 #define MAX_WAIT 50 /* ms, half buffered_file limit */
@@ -2364,6 +2385,9 @@ out:

     ret = qemu_file_get_error(f);
     if (ret < 0) {
+        if (rs->free_page_support) {
+            balloon_free_page_stop();
+        }
         return ret;
     }
--
1.8.3.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread
end of thread, other threads: [~2018-03-30 9:49 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-16 10:48 [Qemu-devel] [PATCH v5 0/5] virtio-balloon: free page hint reporting support Wei Wang
2018-03-16 10:48 ` [virtio-dev] " Wei Wang
2018-03-16 10:48 ` [Qemu-devel] [PATCH v5 1/5] bitmap: bitmap_count_one_with_offset Wei Wang
2018-03-16 10:48 ` [virtio-dev] " Wei Wang
2018-03-16 10:48 ` [Qemu-devel] [PATCH v5 2/5] migration: use bitmap_mutex in migration_bitmap_clear_dirty Wei Wang
2018-03-16 10:48 ` [virtio-dev] " Wei Wang
2018-03-16 10:48 ` [Qemu-devel] [PATCH v5 3/5] migration: API to clear bits of guest free pages from the dirty bitmap Wei Wang
2018-03-16 10:48 ` [virtio-dev] " Wei Wang
2018-03-16 10:48 ` [Qemu-devel] [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT Wei Wang
2018-03-16 10:48 ` [virtio-dev] " Wei Wang
2018-03-16 15:16 ` [Qemu-devel] " Michael S. Tsirkin
2018-03-16 15:16 ` [virtio-dev] " Michael S. Tsirkin
2018-03-18 10:36 ` [Qemu-devel] " Wei Wang
2018-03-18 10:36 ` [virtio-dev] " Wei Wang
2018-03-19  4:24 ` [Qemu-devel] " Michael S. Tsirkin
2018-03-19  4:24 ` [virtio-dev] " Michael S. Tsirkin
2018-03-19  9:01 ` [Qemu-devel] " Wei Wang
2018-03-19  9:01 ` Wei Wang
2018-03-19 22:55 ` [Qemu-devel] " Michael S. Tsirkin
2018-03-19 22:55 ` Michael S. Tsirkin
2018-03-20  2:16 ` [Qemu-devel] " Wei Wang
2018-03-20  2:16 ` Wei Wang
2018-03-20  2:59 ` [Qemu-devel] " Michael S. Tsirkin
2018-03-20  2:59 ` Michael S. Tsirkin
2018-03-20  3:18 ` [Qemu-devel] " Wei Wang
2018-03-20  3:18 ` Wei Wang
2018-03-20  3:24 ` [Qemu-devel] " Michael S. Tsirkin
2018-03-20  3:24 ` Michael S. Tsirkin
2018-03-22  3:13 ` [Qemu-devel] " Wei Wang
2018-03-22  3:13 ` Wei Wang
2018-03-26 10:58 ` [Qemu-devel] " Wei Wang
2018-03-26 10:58 ` Wei Wang
2018-03-26 11:09 ` [Qemu-devel] " Daniel P. Berrangé
2018-03-26 14:54 ` Wang, Wei W
2018-03-26 14:54 ` [virtio-dev] " Wang, Wei W
2018-03-26 15:04 ` Daniel P. Berrangé
2018-03-26 15:35 ` Wang, Wei W
2018-03-26 15:35 ` [virtio-dev] " Wang, Wei W
2018-03-30  9:52 ` Wei Wang
2018-03-30  9:52 ` Wei Wang
2018-03-16 10:48 ` [Qemu-devel] [PATCH v5 5/5] migration: use the free page hint feature from balloon Wei Wang
2018-03-16 10:48 ` [virtio-dev] " Wei Wang