* [Qemu-devel] [PATCH v2 0/3] virtio-balloon: free page hint reporting support
@ 2018-02-06 11:08 ` Wei Wang
  0 siblings, 0 replies; 40+ messages in thread
From: Wei Wang @ 2018-02-06 11:08 UTC (permalink / raw)
  To: qemu-devel, virtio-dev, mst, quintela, dgilbert
  Cc: pbonzini, wei.w.wang, liliang.opensource, yang.zhang.wz,
	quan.xu0, nilal, riel

This is the device-side implementation of a new feature,
VIRTIO_BALLOON_F_FREE_PAGE_HINT, for the virtio-balloon device. The device
receives guest free page hints from the driver and clears the
corresponding bits in the dirty bitmap, so that those free pages are
not transferred to the destination by the migration thread.

Please see the driver patch link for test results:
https://lkml.org/lkml/2018/2/4/60
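
In outline, the hint handling amounts to clearing a range of dirty-bitmap
bits. The sketch below is a simplified, self-contained model (the names
dirty_bitmap, skip_free_pages, PAGE_BITS are illustrative, and one byte
per page is used for clarity; the real code operates on RAMBlock->bmap
with TARGET_PAGE_BITS), not the QEMU implementation itself:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_BITS 12             /* 4 KiB pages, like TARGET_PAGE_BITS */
#define NPAGES    64

static uint8_t dirty_bitmap[NPAGES];  /* one byte per page for clarity */
static long migration_dirty_pages;

/* A free page hint (offset, len) clears the corresponding dirty bits,
 * so the migration thread skips those pages. */
static void skip_free_pages(uint64_t offset, size_t len)
{
    long start = offset >> PAGE_BITS;
    long nr = len >> PAGE_BITS;

    memset(&dirty_bitmap[start], 0, nr);  /* bitmap_clear() analogue */
    migration_dirty_pages -= nr;
}
```

A hint covering 0x4000..0x7000 would clear pages 4, 5 and 6 and reduce
the remaining dirty-page count by three.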

ChangeLog:
v1->v2: 
    1) virtio-balloon
        - use subsections to save free_page_report_cmd_id;
        - poll the free page vq after sending a cmd id to the driver;
        - change the free page vq size to VIRTQUEUE_MAX_SIZE;
        - virtio_balloon_poll_free_page_hints: handle the corner case
          that the free page block reported from the driver may cross
          the RAMBlock boundary.
    2) migration/ram.c
        - use balloon_free_page_poll to start the optimization
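
The RAMBlock-boundary corner case mentioned above can be sketched as
follows: a hinted free range may cross a block boundary, so it is split
into per-block chunks before any bitmap bits are cleared. This is a
hedged, self-contained illustration under assumed names (Block,
lookup_block, clear_range stand in for qemu_ram_block_from_host() and
friends), not the patch's code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uintptr_t host_base;
    size_t used_length;
} Block;

/* Two adjacent blocks standing in for the host-address lookup */
static Block blocks[2] = {
    { 0x100000, 0x8000 },
    { 0x108000, 0x8000 },
};

static Block *lookup_block(uintptr_t host, size_t *offset)
{
    for (int i = 0; i < 2; i++) {
        if (host >= blocks[i].host_base &&
            host < blocks[i].host_base + blocks[i].used_length) {
            *offset = host - blocks[i].host_base;
            return &blocks[i];
        }
    }
    return NULL;
}

static size_t cleared[2];       /* bytes cleared per block, for testing */

static void clear_range(Block *b, size_t offset, size_t len)
{
    cleared[b - blocks] += len; /* stands in for the bitmap clearing */
}

/* Walk [host, host + total_len), clamping each chunk to its block */
static void skip_hinted_range(uintptr_t host, size_t total_len)
{
    while (total_len > 0) {
        size_t offset, len;
        Block *b = lookup_block(host, &offset);

        len = total_len;
        if (offset + len > b->used_length) {
            len = b->used_length - offset;  /* clamp to this block */
        }
        clear_range(b, offset, len);
        host += len;
        total_len -= len;
    }
}
```

A range starting halfway into the first block and spilling into the
second is handled as two chunks, one per block.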

Wei Wang (3):
  virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
  migration: use the free page reporting feature from balloon
  virtio-balloon: add a timer to limit the free page report waiting time

 balloon.c                                       |  39 ++--
 hw/virtio/virtio-balloon.c                      | 227 ++++++++++++++++++++++--
 hw/virtio/virtio-pci.c                          |   3 +
 include/hw/virtio/virtio-balloon.h              |  15 +-
 include/migration/misc.h                        |   3 +
 include/standard-headers/linux/virtio_balloon.h |   7 +
 include/sysemu/balloon.h                        |  12 +-
 migration/ram.c                                 |  34 +++-
 8 files changed, 307 insertions(+), 33 deletions(-)

-- 
1.8.3.1



* [Qemu-devel] [PATCH v2 1/3] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
  2018-02-06 11:08 ` [virtio-dev] " Wei Wang
@ 2018-02-06 11:08   ` Wei Wang
  -1 siblings, 0 replies; 40+ messages in thread
From: Wei Wang @ 2018-02-06 11:08 UTC (permalink / raw)
  To: qemu-devel, virtio-dev, mst, quintela, dgilbert
  Cc: pbonzini, wei.w.wang, liliang.opensource, yang.zhang.wz,
	quan.xu0, nilal, riel

The new feature enables the virtio-balloon device to receive hints of
guest free pages from the free page vq and to clear the corresponding
bits in the dirty bitmap, so that those free pages are not transferred
by the migration thread.
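
The host/driver handshake uses a command id written into the device
config: the host advances the id (skipping the stop id, 0, on
wrap-around), and the driver echoes ids on the free page vq, where a
matching id starts a reporting round and anything else stops it. A
minimal sketch with illustrative names (not the patch's functions):

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define FREE_PAGE_REPORT_STOP_ID 0u

enum report_status { REPORT_REQUESTED, REPORT_START, REPORT_STOP };

/* Advance the command id; never hand out the stop id (0) */
static uint32_t next_cmd_id(uint32_t id)
{
    return (id == UINT32_MAX) ? 1 : id + 1;
}

/* Interpret an id echoed by the driver on the free page vq */
static enum report_status on_driver_id(uint32_t echoed, uint32_t current)
{
    return (echoed == current) ? REPORT_START : REPORT_STOP;
}
```

So after UINT32_MAX the id wraps to 1 rather than 0, and an echoed stop
id ends the polling loop.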

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Liang Li <liang.z.li@intel.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
---
 balloon.c                                       |  39 +++++--
 hw/virtio/virtio-balloon.c                      | 145 +++++++++++++++++++++---
 include/hw/virtio/virtio-balloon.h              |  11 +-
 include/migration/misc.h                        |   3 +
 include/standard-headers/linux/virtio_balloon.h |   7 ++
 include/sysemu/balloon.h                        |  12 +-
 migration/ram.c                                 |  10 ++
 7 files changed, 198 insertions(+), 29 deletions(-)

diff --git a/balloon.c b/balloon.c
index 1d720ff..0f0b30c 100644
--- a/balloon.c
+++ b/balloon.c
@@ -36,6 +36,8 @@
 
 static QEMUBalloonEvent *balloon_event_fn;
 static QEMUBalloonStatus *balloon_stat_fn;
+static QEMUBalloonFreePageSupport *balloon_free_page_support_fn;
+static QEMUBalloonFreePagePoll *balloon_free_page_poll_fn;
 static void *balloon_opaque;
 static bool balloon_inhibited;
 
@@ -64,19 +66,34 @@ static bool have_balloon(Error **errp)
     return true;
 }
 
-int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
-                             QEMUBalloonStatus *stat_func, void *opaque)
+bool balloon_free_page_support(void)
 {
-    if (balloon_event_fn || balloon_stat_fn || balloon_opaque) {
-        /* We're already registered one balloon handler.  How many can
-         * a guest really have?
-         */
-        return -1;
+    return balloon_free_page_support_fn &&
+           balloon_free_page_support_fn(balloon_opaque);
+}
+
+void balloon_free_page_poll(void)
+{
+    balloon_free_page_poll_fn(balloon_opaque);
+}
+
+void qemu_add_balloon_handler(QEMUBalloonEvent *event_fn,
+                              QEMUBalloonStatus *stat_fn,
+                              QEMUBalloonFreePageSupport *free_page_support_fn,
+                              QEMUBalloonFreePagePoll *free_page_poll_fn,
+                              void *opaque)
+{
+    if (balloon_event_fn || balloon_stat_fn || balloon_free_page_support_fn ||
+        balloon_free_page_poll_fn || balloon_opaque) {
+        /* We already registered one balloon handler. */
+        return;
     }
-    balloon_event_fn = event_func;
-    balloon_stat_fn = stat_func;
+
+    balloon_event_fn = event_fn;
+    balloon_stat_fn = stat_fn;
+    balloon_free_page_support_fn = free_page_support_fn;
+    balloon_free_page_poll_fn = free_page_poll_fn;
     balloon_opaque = opaque;
-    return 0;
 }
 
 void qemu_remove_balloon_handler(void *opaque)
@@ -86,6 +103,8 @@ void qemu_remove_balloon_handler(void *opaque)
     }
     balloon_event_fn = NULL;
     balloon_stat_fn = NULL;
+    balloon_free_page_support_fn = NULL;
+    balloon_free_page_poll_fn = NULL;
     balloon_opaque = NULL;
 }
 
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 14e08d2..b424d4e 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -23,6 +23,7 @@
 #include "hw/virtio/virtio-balloon.h"
 #include "sysemu/kvm.h"
 #include "exec/address-spaces.h"
+#include "exec/ram_addr.h"
 #include "qapi/visitor.h"
 #include "qapi-event.h"
 #include "trace.h"
@@ -30,6 +31,7 @@
 
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
+#include "migration/misc.h"
 
 #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
 
@@ -305,6 +307,87 @@ out:
     }
 }
 
+static void virtio_balloon_poll_free_page_hints(VirtIOBalloon *dev)
+{
+    VirtQueueElement *elem;
+    VirtQueue *vq = dev->free_page_vq;
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+    bool page_poisoning = virtio_vdev_has_feature(vdev,
+                                              VIRTIO_BALLOON_F_PAGE_POISON);
+    uint32_t id;
+
+    /* Poll the vq till a stop cmd id is received */
+    while (dev->free_page_report_status != FREE_PAGE_REPORT_S_STOP) {
+        elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
+        if (!elem) {
+            continue;
+        }
+
+        if (elem->out_num) {
+            iov_to_buf(elem->out_sg, elem->out_num, 0, &id, sizeof(uint32_t));
+            virtqueue_push(vq, elem, sizeof(id));
+            g_free(elem);
+            if (id == dev->free_page_report_cmd_id) {
+                dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
+            } else {
+                dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
+                break;
+            }
+        }
+
+        if (elem->in_num) {
+            RAMBlock *block;
+            ram_addr_t offset;
+            void *base;
+            size_t total_len, len;
+
+            if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START &&
+                !page_poisoning) {
+                base = elem->in_sg[0].iov_base;
+                total_len = elem->in_sg[0].iov_len;
+                len = total_len;
+
+                while (total_len > 0) {
+                    block = qemu_ram_block_from_host(base, false, &offset);
+                    if (unlikely(offset + total_len > block->used_length)) {
+                        len = block->used_length - offset;
+                        base += len;
+                    }
+                    skip_free_pages_from_dirty_bitmap(block, offset, len);
+                    total_len -= len;
+                }
+            }
+            virtqueue_push(vq, elem, total_len);
+            g_free(elem);
+        }
+    }
+}
+
+static bool virtio_balloon_free_page_support(void *opaque)
+{
+    VirtIOBalloon *s = opaque;
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+    return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT);
+}
+
+static void virtio_balloon_free_page_poll(void *opaque)
+{
+    VirtIOBalloon *s = opaque;
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+    if (unlikely(s->free_page_report_cmd_id == UINT_MAX)) {
+        s->free_page_report_cmd_id = 1;
+    } else {
+        s->free_page_report_cmd_id++;
+    }
+
+    virtio_notify_config(vdev);
+    s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED;
+
+    virtio_balloon_poll_free_page_hints(s);
+}
+
 static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
 {
     VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
@@ -312,6 +395,7 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
 
     config.num_pages = cpu_to_le32(dev->num_pages);
     config.actual = cpu_to_le32(dev->actual);
+    config.free_page_report_cmd_id = cpu_to_le32(dev->free_page_report_cmd_id);
 
     trace_virtio_balloon_get_config(config.num_pages, config.actual);
     memcpy(config_data, &config, sizeof(struct virtio_balloon_config));
@@ -365,6 +449,7 @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
                         ((ram_addr_t) dev->actual << VIRTIO_BALLOON_PFN_SHIFT),
                         &error_abort);
     }
+    dev->poison_val = le32_to_cpu(config.poison_val);
     trace_virtio_balloon_set_config(dev->actual, oldactual);
 }
 
@@ -374,6 +459,7 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
     VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
     f |= dev->host_features;
     virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ);
+    virtio_add_feature(&f, VIRTIO_BALLOON_F_PAGE_POISON);
     return f;
 }
 
@@ -410,6 +496,17 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
     return 0;
 }
 
+static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
+    .name = "virtio-balloon-device/free-page-report",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = virtio_balloon_free_page_support,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static const VMStateDescription vmstate_virtio_balloon_device = {
     .name = "virtio-balloon-device",
     .version_id = 1,
@@ -420,30 +517,29 @@ static const VMStateDescription vmstate_virtio_balloon_device = {
         VMSTATE_UINT32(actual, VirtIOBalloon),
         VMSTATE_END_OF_LIST()
     },
+    .subsections = (const VMStateDescription * []) {
+        &vmstate_virtio_balloon_free_page_report,
+        NULL
+    }
 };
 
 static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
     VirtIOBalloon *s = VIRTIO_BALLOON(dev);
-    int ret;
 
     virtio_init(vdev, "virtio-balloon", VIRTIO_ID_BALLOON,
                 sizeof(struct virtio_balloon_config));
 
-    ret = qemu_add_balloon_handler(virtio_balloon_to_target,
-                                   virtio_balloon_stat, s);
-
-    if (ret < 0) {
-        error_setg(errp, "Only one balloon device is supported");
-        virtio_cleanup(vdev);
-        return;
-    }
-
     s->ivq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
     s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
     s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats);
-
+    if (virtio_has_feature(s->host_features,
+                           VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
+        s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE, NULL);
+        s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
+        s->free_page_report_cmd_id = VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID;
+    }
     reset_stats(s);
 }
 
@@ -472,11 +568,26 @@ static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
 
-    if (!s->stats_vq_elem && vdev->vm_running &&
-        (status & VIRTIO_CONFIG_S_DRIVER_OK) && virtqueue_rewind(s->svq, 1)) {
-        /* poll stats queue for the element we have discarded when the VM
-         * was stopped */
-        virtio_balloon_receive_stats(vdev, s->svq);
+    if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
+        if (!s->stats_vq_elem && vdev->vm_running &&
+            virtqueue_rewind(s->svq, 1)) {
+            /*
+             * Poll stats queue for the element we have discarded when the VM
+             * was stopped.
+             */
+            virtio_balloon_receive_stats(vdev, s->svq);
+        }
+
+        if (virtio_balloon_free_page_support(s)) {
+            qemu_add_balloon_handler(virtio_balloon_to_target,
+                                     virtio_balloon_stat,
+                                     virtio_balloon_free_page_support,
+                                     virtio_balloon_free_page_poll,
+                                     s);
+        } else {
+            qemu_add_balloon_handler(virtio_balloon_to_target,
+                                     virtio_balloon_stat, NULL, NULL, s);
+        }
     }
 }
 
@@ -506,6 +617,8 @@ static const VMStateDescription vmstate_virtio_balloon = {
 static Property virtio_balloon_properties[] = {
     DEFINE_PROP_BIT("deflate-on-oom", VirtIOBalloon, host_features,
                     VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false),
+    DEFINE_PROP_BIT("free-page-hint", VirtIOBalloon, host_features,
+                    VIRTIO_BALLOON_F_FREE_PAGE_HINT, false),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
index 1ea13bd..11b4e01 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -31,11 +31,20 @@ typedef struct virtio_balloon_stat_modern {
        uint64_t val;
 } VirtIOBalloonStatModern;
 
+enum virtio_balloon_free_page_report_status {
+    FREE_PAGE_REPORT_S_REQUESTED,
+    FREE_PAGE_REPORT_S_START,
+    FREE_PAGE_REPORT_S_STOP,
+};
+
 typedef struct VirtIOBalloon {
     VirtIODevice parent_obj;
-    VirtQueue *ivq, *dvq, *svq;
+    VirtQueue *ivq, *dvq, *svq, *free_page_vq;
+    uint32_t free_page_report_status;
     uint32_t num_pages;
     uint32_t actual;
+    uint32_t free_page_report_cmd_id;
+    uint32_t poison_val;
     uint64_t stats[VIRTIO_BALLOON_S_NR];
     VirtQueueElement *stats_vq_elem;
     size_t stats_vq_offset;
diff --git a/include/migration/misc.h b/include/migration/misc.h
index 77fd4f5..6df419c 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -14,11 +14,14 @@
 #ifndef MIGRATION_MISC_H
 #define MIGRATION_MISC_H
 
+#include "exec/cpu-common.h"
 #include "qemu/notify.h"
 
 /* migration/ram.c */
 
 void ram_mig_init(void);
+void skip_free_pages_from_dirty_bitmap(RAMBlock *block, ram_addr_t offset,
+                                       size_t len);
 
 /* migration/block.c */
 
diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h
index 9d06ccd..19d0d8b 100644
--- a/include/standard-headers/linux/virtio_balloon.h
+++ b/include/standard-headers/linux/virtio_balloon.h
@@ -34,15 +34,22 @@
 #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before reclaiming pages */
 #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
 #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
+#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */
+#define VIRTIO_BALLOON_F_PAGE_POISON	4 /* Guest is using page poisoning */
 
 /* Size of a PFN in the balloon interface. */
 #define VIRTIO_BALLOON_PFN_SHIFT 12
 
+#define VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID 0
 struct virtio_balloon_config {
 	/* Number of pages host wants Guest to give up. */
 	uint32_t num_pages;
 	/* Number of pages we've actually got in balloon. */
 	uint32_t actual;
+	/* Free page report command id, readonly by guest */
+	uint32_t free_page_report_cmd_id;
+	/* Stores PAGE_POISON if page poisoning is in use */
+	uint32_t poison_val;
 };
 
 #define VIRTIO_BALLOON_S_SWAP_IN  0   /* Amount of memory swapped in */
diff --git a/include/sysemu/balloon.h b/include/sysemu/balloon.h
index af49e19..53db4dc 100644
--- a/include/sysemu/balloon.h
+++ b/include/sysemu/balloon.h
@@ -18,11 +18,19 @@
 
 typedef void (QEMUBalloonEvent)(void *opaque, ram_addr_t target);
 typedef void (QEMUBalloonStatus)(void *opaque, BalloonInfo *info);
+typedef bool (QEMUBalloonFreePageSupport)(void *opaque);
+typedef void (QEMUBalloonFreePagePoll)(void *opaque);
 
-int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
-			     QEMUBalloonStatus *stat_func, void *opaque);
 void qemu_remove_balloon_handler(void *opaque);
 bool qemu_balloon_is_inhibited(void);
 void qemu_balloon_inhibit(bool state);
+bool balloon_free_page_support(void);
+void balloon_free_page_poll(void);
+
+void qemu_add_balloon_handler(QEMUBalloonEvent *event_fn,
+                              QEMUBalloonStatus *stat_fn,
+                              QEMUBalloonFreePageSupport *free_page_support_fn,
+                              QEMUBalloonFreePagePoll *free_page_poll_fn,
+                              void *opaque);
 
 #endif
diff --git a/migration/ram.c b/migration/ram.c
index cb1950f..d6f462c 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2186,6 +2186,16 @@ static int ram_init_all(RAMState **rsp)
     return 0;
 }
 
+void skip_free_pages_from_dirty_bitmap(RAMBlock *block, ram_addr_t offset,
+                                       size_t len)
+{
+    long start = offset >> TARGET_PAGE_BITS,
+         nr = len >> TARGET_PAGE_BITS;
+
+    bitmap_clear(block->bmap, start, nr);
+    ram_state->migration_dirty_pages -= nr;
+}
+
 /*
  * Each of ram_save_setup, ram_save_iterate and ram_save_complete has
  * long-running RCU critical section.  When rcu-reclaims in the code
-- 
1.8.3.1


@@ -34,15 +34,22 @@
 #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before reclaiming pages */
 #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
 #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
+#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */
+#define VIRTIO_BALLOON_F_PAGE_POISON	4 /* Guest is using page poisoning */
 
 /* Size of a PFN in the balloon interface. */
 #define VIRTIO_BALLOON_PFN_SHIFT 12
 
+#define VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID 0
 struct virtio_balloon_config {
 	/* Number of pages host wants Guest to give up. */
 	uint32_t num_pages;
 	/* Number of pages we've actually got in balloon. */
 	uint32_t actual;
+	/* Free page report command id, read-only for the guest */
+	uint32_t free_page_report_cmd_id;
+	/* Stores PAGE_POISON if page poisoning is in use */
+	uint32_t poison_val;
 };
 
 #define VIRTIO_BALLOON_S_SWAP_IN  0   /* Amount of memory swapped in */
diff --git a/include/sysemu/balloon.h b/include/sysemu/balloon.h
index af49e19..53db4dc 100644
--- a/include/sysemu/balloon.h
+++ b/include/sysemu/balloon.h
@@ -18,11 +18,19 @@
 
 typedef void (QEMUBalloonEvent)(void *opaque, ram_addr_t target);
 typedef void (QEMUBalloonStatus)(void *opaque, BalloonInfo *info);
+typedef bool (QEMUBalloonFreePageSupport)(void *opaque);
+typedef void (QEMUBalloonFreePagePoll)(void *opaque);
 
-int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
-			     QEMUBalloonStatus *stat_func, void *opaque);
 void qemu_remove_balloon_handler(void *opaque);
 bool qemu_balloon_is_inhibited(void);
 void qemu_balloon_inhibit(bool state);
+bool balloon_free_page_support(void);
+void balloon_free_page_poll(void);
+
+void qemu_add_balloon_handler(QEMUBalloonEvent *event_fn,
+                              QEMUBalloonStatus *stat_fn,
+                              QEMUBalloonFreePageSupport *free_page_support_fn,
+                              QEMUBalloonFreePagePoll *free_page_poll_fn,
+                              void *opaque);
 
 #endif
diff --git a/migration/ram.c b/migration/ram.c
index cb1950f..d6f462c 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2186,6 +2186,16 @@ static int ram_init_all(RAMState **rsp)
     return 0;
 }
 
+void skip_free_pages_from_dirty_bitmap(RAMBlock *block, ram_addr_t offset,
+                                       size_t len)
+{
+    long start = offset >> TARGET_PAGE_BITS,
+         nr = len >> TARGET_PAGE_BITS;
+
+    bitmap_clear(block->bmap, start, nr);
+    ram_state->migration_dirty_pages -= nr;
+}
+
 /*
  * Each of ram_save_setup, ram_save_iterate and ram_save_complete has
  * long-running RCU critical section.  When rcu-reclaims in the code
-- 
1.8.3.1
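For reference, the effect of the skip_free_pages_from_dirty_bitmap() helper added above can be modeled in a standalone sketch. This is an assumption-laden toy: one byte per page stands in for QEMU's packed bitmap, and TARGET_PAGE_BITS is assumed to be 12 (4 KiB pages).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: real QEMU uses a packed unsigned long bitmap. */
#define TARGET_PAGE_BITS 12
#define MAX_PAGES 1024

static unsigned char bmap[MAX_PAGES];   /* one byte per page for simplicity */
static long migration_dirty_pages;

/* Bulk stage: every page starts out dirty. */
static void init_bitmap(void)
{
    memset(bmap, 1, sizeof(bmap));
    migration_dirty_pages = MAX_PAGES;
}

/* Mirrors skip_free_pages_from_dirty_bitmap(): offset/len are in bytes. */
static void skip_free_pages(size_t offset, size_t len)
{
    long start = offset >> TARGET_PAGE_BITS;
    long nr = len >> TARGET_PAGE_BITS;

    memset(bmap + start, 0, nr);        /* bitmap_clear(block->bmap, start, nr) */
    migration_dirty_pages -= nr;
}
```

The real helper additionally relies on the caller keeping offset/len within one RAMBlock; the v2 changelog notes the corner case of hints crossing a RAMBlock boundary.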


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org



* [Qemu-devel] [PATCH v2 2/3] migration: use the free page reporting feature from balloon
  2018-02-06 11:08 ` [virtio-dev] " Wei Wang
@ 2018-02-06 11:08   ` Wei Wang
  -1 siblings, 0 replies; 40+ messages in thread
From: Wei Wang @ 2018-02-06 11:08 UTC (permalink / raw)
  To: qemu-devel, virtio-dev, mst, quintela, dgilbert
  Cc: pbonzini, wei.w.wang, liliang.opensource, yang.zhang.wz,
	quan.xu0, nilal, riel

Use the free page reporting feature from the balloon device to clear the
bits corresponding to guest free pages from the dirty bitmap, so that the
free memory is not sent.

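The hunk below also relaxes the bulk-stage shortcut in migration_bitmap_find_dirty(): without hinting, every page is dirty in the bulk stage, so "start + 1" is always the next dirty page; with hinting, cleared bits exist and the bitmap must actually be scanned. A rough standalone model of that decision (one byte per page, not QEMU's packed bitmap):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N_PAGES 16

/* One byte per page stands in for QEMU's packed dirty bitmap. */
static unsigned char bitmap[N_PAGES];

/* All pages dirty, except pages 3 and 4 which the guest hinted as free. */
static void setup_bitmap(void)
{
    size_t i;

    for (i = 0; i < N_PAGES; i++) {
        bitmap[i] = 1;
    }
    bitmap[3] = 0;
    bitmap[4] = 0;
}

/* Simplified find_next_bit(): index of the first set "bit" at or after start. */
static size_t find_next_bit_model(size_t size, size_t start)
{
    size_t i;

    for (i = start; i < size; i++) {
        if (bitmap[i]) {
            return i;
        }
    }
    return size;
}

/*
 * Mirrors the patched migration_bitmap_find_dirty(): the "start + 1"
 * shortcut is only valid when free page hinting is off.
 */
static size_t next_dirty(bool bulk_stage, bool free_page_support, size_t start)
{
    if (bulk_stage && start > 0 && !free_page_support) {
        return start + 1;
    }
    return find_next_bit_model(N_PAGES, start);
}
```

With the old shortcut, the hinted-free page 4 would still be treated as the next dirty page and sent; the bitmap scan skips straight to page 5.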
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
---
 migration/ram.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index d6f462c..4fe16d2 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -49,6 +49,7 @@
 #include "qemu/rcu_queue.h"
 #include "migration/colo.h"
 #include "migration/block.h"
+#include "sysemu/balloon.h"
 
 /***********************************************************/
 /* ram save/restore */
@@ -206,6 +207,10 @@ struct RAMState {
     uint32_t last_version;
     /* We are in the first round */
     bool ram_bulk_stage;
+    /* Whether skipping the transfer of free pages is supported */
+    bool free_page_support;
+    /* Whether the free pages have already been skipped in the bulk stage */
+    bool free_page_done;
     /* How many times we have dirty too many pages */
     int dirty_rate_high_cnt;
     /* these variables are used for bitmap sync */
@@ -773,7 +778,7 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
     unsigned long *bitmap = rb->bmap;
     unsigned long next;
 
-    if (rs->ram_bulk_stage && start > 0) {
+    if (rs->ram_bulk_stage && start > 0 && !rs->free_page_support) {
         next = start + 1;
     } else {
         next = find_next_bit(bitmap, size, start);
@@ -1653,6 +1658,8 @@ static void ram_state_reset(RAMState *rs)
     rs->last_page = 0;
     rs->last_version = ram_list.version;
     rs->ram_bulk_stage = true;
+    rs->free_page_support = balloon_free_page_support();
+    rs->free_page_done = false;
 }
 
 #define MAX_WAIT 50 /* ms, half buffered_file limit */
@@ -2135,7 +2142,7 @@ static int ram_state_init(RAMState **rsp)
     return 0;
 }
 
-static void ram_list_init_bitmaps(void)
+static void ram_list_init_bitmaps(RAMState *rs)
 {
     RAMBlock *block;
     unsigned long pages;
@@ -2145,7 +2152,11 @@ static void ram_list_init_bitmaps(void)
         QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
             pages = block->max_length >> TARGET_PAGE_BITS;
             block->bmap = bitmap_new(pages);
-            bitmap_set(block->bmap, 0, pages);
+            if (rs->free_page_support) {
+                bitmap_set(block->bmap, 1, pages);
+            } else {
+                bitmap_set(block->bmap, 0, pages);
+            }
             if (migrate_postcopy_ram()) {
                 block->unsentmap = bitmap_new(pages);
                 bitmap_set(block->unsentmap, 0, pages);
@@ -2161,7 +2172,7 @@ static void ram_init_bitmaps(RAMState *rs)
     qemu_mutex_lock_ramlist();
     rcu_read_lock();
 
-    ram_list_init_bitmaps();
+    ram_list_init_bitmaps(rs);
     memory_global_dirty_log_start();
     migration_bitmap_sync(rs);
 
@@ -2275,6 +2286,11 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
 
     ram_control_before_iterate(f, RAM_CONTROL_ROUND);
 
+    if (rs->free_page_support && !rs->free_page_done) {
+        balloon_free_page_poll();
+        rs->free_page_done = true;
+    }
+
     t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     i = 0;
     while ((ret = qemu_file_rate_limit(f)) == 0) {
-- 
1.8.3.1



* [Qemu-devel] [PATCH v2 3/3] virtio-balloon: add a timer to limit the free page report waiting time
  2018-02-06 11:08 ` [virtio-dev] " Wei Wang
@ 2018-02-06 11:08   ` Wei Wang
  -1 siblings, 0 replies; 40+ messages in thread
From: Wei Wang @ 2018-02-06 11:08 UTC (permalink / raw)
  To: qemu-devel, virtio-dev, mst, quintela, dgilbert
  Cc: pbonzini, wei.w.wang, liliang.opensource, yang.zhang.wz,
	quan.xu0, nilal, riel

This patch adds a timer to limit the time that the host waits for the
free page hints reported by the guest. Users can specify the time in ms
via the "free-page-wait-time" command line option. If a user doesn't
specify a time, the host waits until the guest finishes reporting all
the free page hints. The policy (wait for all the free page hints to be
reported, or use a time limit) is determined by the orchestration layer.

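The resulting policy can be sketched independently of QEMU's timer API: a wait time of 0 means no timer is armed and the host waits indefinitely, otherwise the host force-stops reporting once the limit expires while the guest is still in the START state. A minimal model under those assumptions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the FREE_PAGE_REPORT_S_* states of the device. */
enum { S_REQUESTED, S_START, S_STOP };

/*
 * Decide whether the host should force-stop free page reporting at
 * time now_ms. wait_time_ms == 0 models the default: no timer, wait
 * until the guest reports a stop on its own.
 */
static bool should_stop_report(int64_t wait_time_ms, int64_t started_ms,
                               int64_t now_ms, int status)
{
    if (wait_time_ms == 0) {
        return false;               /* no limit configured */
    }
    if (status != S_START) {
        return false;               /* guest already stopped, or never started */
    }
    return now_ms >= started_ms + wait_time_ms;
}
```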
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Michael S. Tsirkin <mst@redhat.com>
---
 hw/virtio/virtio-balloon.c         | 84 +++++++++++++++++++++++++++++++++++++-
 hw/virtio/virtio-pci.c             |  3 ++
 include/hw/virtio/virtio-balloon.h |  4 ++
 3 files changed, 90 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index b424d4e..9ee0de4 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -207,6 +207,65 @@ static void balloon_stats_set_poll_interval(Object *obj, Visitor *v,
     balloon_stats_change_timer(s, 0);
 }
 
+static void balloon_free_page_change_timer(VirtIOBalloon *s, int64_t ms)
+{
+    timer_mod(s->free_page_timer,
+              qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + ms);
+}
+
+static void balloon_stop_free_page_report(void *opaque)
+{
+    VirtIOBalloon *dev = opaque;
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+
+    timer_del(dev->free_page_timer);
+    timer_free(dev->free_page_timer);
+    dev->free_page_timer = NULL;
+
+    if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) {
+        dev->host_stop_free_page = true;
+        virtio_notify_config(vdev);
+    }
+}
+
+static void balloon_free_page_get_wait_time(Object *obj, Visitor *v,
+                                            const char *name, void *opaque,
+                                            Error **errp)
+{
+    VirtIOBalloon *s = opaque;
+
+    visit_type_int(v, name, &s->free_page_wait_time, errp);
+}
+
+static void balloon_free_page_set_wait_time(Object *obj, Visitor *v,
+                                            const char *name, void *opaque,
+                                            Error **errp)
+{
+    VirtIOBalloon *s = opaque;
+    Error *local_err = NULL;
+    int64_t value;
+
+    visit_type_int(v, name, &value, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+    if (value < 0) {
+        error_setg(errp, "free page wait time must not be negative");
+        return;
+    }
+
+    if (value > UINT32_MAX) {
+        error_setg(errp, "free page wait time value is too big");
+        return;
+    }
+
+    s->free_page_wait_time = value;
+    g_assert(s->free_page_timer == NULL);
+    s->free_page_timer = timer_new_ms(QEMU_CLOCK_REALTIME,
+                                      balloon_stop_free_page_report, s);
+}
+
 static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
@@ -330,6 +389,7 @@ static void virtio_balloon_poll_free_page_hints(VirtIOBalloon *dev)
             if (id == dev->free_page_report_cmd_id) {
                 dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
             } else {
+                dev->host_stop_free_page = false;
                 dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
                 break;
             }
@@ -385,6 +445,10 @@ static void virtio_balloon_free_page_poll(void *opaque)
     virtio_notify_config(vdev);
     s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED;
 
+    if (s->free_page_wait_time) {
+        balloon_free_page_change_timer(s, s->free_page_wait_time);
+    }
+
     virtio_balloon_poll_free_page_hints(s);
 }
 
@@ -395,7 +459,19 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
 
     config.num_pages = cpu_to_le32(dev->num_pages);
     config.actual = cpu_to_le32(dev->actual);
-    config.free_page_report_cmd_id = cpu_to_le32(dev->free_page_report_cmd_id);
+    if (dev->host_stop_free_page) {
+        /*
+         * The host is actively requesting to stop free page reporting:
+         * send the stop signal to the guest. This happens when the
+         * migration thread has reached the phase of sending pages to the
+         * destination while the guest hasn't finished reporting.
+         */
+        config.free_page_report_cmd_id =
+                                    VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID;
+    } else {
+        config.free_page_report_cmd_id =
+                                    cpu_to_le32(dev->free_page_report_cmd_id);
+    }
 
     trace_virtio_balloon_get_config(config.num_pages, config.actual);
     memcpy(config_data, &config, sizeof(struct virtio_balloon_config));
@@ -539,6 +615,7 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
         s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE, NULL);
         s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
         s->free_page_report_cmd_id = VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID;
+        s->host_stop_free_page = false;
     }
     reset_stats(s);
 }
@@ -602,6 +679,11 @@ static void virtio_balloon_instance_init(Object *obj)
                         balloon_stats_get_poll_interval,
                         balloon_stats_set_poll_interval,
                         NULL, s, NULL);
+
+    object_property_add(obj, "free-page-wait-time", "int",
+                        balloon_free_page_get_wait_time,
+                        balloon_free_page_set_wait_time,
+                        NULL, s, NULL);
 }
 
 static const VMStateDescription vmstate_virtio_balloon = {
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 6c75cca..3345104 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -2250,6 +2250,9 @@ static void virtio_balloon_pci_instance_init(Object *obj)
     object_property_add_alias(obj, "guest-stats-polling-interval",
                               OBJECT(&dev->vdev),
                               "guest-stats-polling-interval", &error_abort);
+    object_property_add_alias(obj, "free-page-wait-time",
+                              OBJECT(&dev->vdev),
+                              "free-page-wait-time", &error_abort);
 }
 
 static const TypeInfo virtio_balloon_pci_info = {
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
index 11b4e01..c16855b 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -40,6 +40,8 @@ enum virtio_balloon_free_page_report_status {
 typedef struct VirtIOBalloon {
     VirtIODevice parent_obj;
     VirtQueue *ivq, *dvq, *svq, *free_page_vq;
+    /* Host is requesting the guest to stop free page reporting */
+    bool host_stop_free_page;
     uint32_t free_page_report_status;
     uint32_t num_pages;
     uint32_t actual;
@@ -49,8 +51,10 @@ typedef struct VirtIOBalloon {
     VirtQueueElement *stats_vq_elem;
     size_t stats_vq_offset;
     QEMUTimer *stats_timer;
+    QEMUTimer *free_page_timer;
     int64_t stats_last_update;
     int64_t stats_poll_interval;
+    int64_t free_page_wait_time;
     uint32_t host_features;
 } VirtIOBalloon;
 
-- 
1.8.3.1
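The range checks performed by balloon_free_page_set_wait_time() above can be isolated into a small sketch: negative values and values above UINT32_MAX are rejected, while everything else (including 0, meaning "no limit") is accepted. This is an illustrative model, not QEMU's visitor/Error API:

```c
#include <assert.h>
#include <stdint.h>

/* Returns 0 if value is an acceptable wait time in ms, -1 otherwise. */
static int validate_wait_time(int64_t value)
{
    if (value < 0) {
        return -1;                  /* rejected: negative */
    }
    if (value > UINT32_MAX) {
        return -1;                  /* rejected: too big */
    }
    return 0;                       /* accepted; 0 means "no limit" */
}
```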



* Re: [Qemu-devel] [PATCH v2 3/3] virtio-balloon: add a timer to limit the free page report waiting time
  2018-02-06 11:08   ` [virtio-dev] " Wei Wang
@ 2018-02-06 23:43     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 40+ messages in thread
From: Michael S. Tsirkin @ 2018-02-06 23:43 UTC (permalink / raw)
  To: Wei Wang
  Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini,
	liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel

On Tue, Feb 06, 2018 at 07:08:19PM +0800, Wei Wang wrote:
> This patch adds a timer to limit the time that host waits for the free
> page hints reported by the guest. Users can specify the time in ms via
> "free-page-wait-time" command line option. If a user doesn't specify a
> time, host waits till the guest finishes reporting all the free page
> hints. The policy (wait for all the free page hints to be reported or
> use a time limit) is determined by the orchestration layer.
> 
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> CC: Michael S. Tsirkin <mst@redhat.com>

Looks like an option the migration command should get,
as opposed to a device feature.

> ---
>  hw/virtio/virtio-balloon.c         | 84 +++++++++++++++++++++++++++++++++++++-
>  hw/virtio/virtio-pci.c             |  3 ++
>  include/hw/virtio/virtio-balloon.h |  4 ++
>  3 files changed, 90 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index b424d4e..9ee0de4 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -207,6 +207,65 @@ static void balloon_stats_set_poll_interval(Object *obj, Visitor *v,
>      balloon_stats_change_timer(s, 0);
>  }
>  
> +static void balloon_free_page_change_timer(VirtIOBalloon *s, int64_t ms)
> +{
> +    timer_mod(s->free_page_timer,
> +              qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + ms);
> +}
> +
> +static void balloon_stop_free_page_report(void *opaque)
> +{
> +    VirtIOBalloon *dev = opaque;
> +    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> +
> +    timer_del(dev->free_page_timer);
> +    timer_free(dev->free_page_timer);
> +    dev->free_page_timer = NULL;
> +
> +    if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) {
> +        dev->host_stop_free_page = true;
> +        virtio_notify_config(vdev);
> +    }
> +}
> +
> +static void balloon_free_page_get_wait_time(Object *obj, Visitor *v,
> +                                            const char *name, void *opaque,
> +                                            Error **errp)
> +{
> +    VirtIOBalloon *s = opaque;
> +
> +    visit_type_int(v, name, &s->free_page_wait_time, errp);
> +}
> +
> +static void balloon_free_page_set_wait_time(Object *obj, Visitor *v,
> +                                            const char *name, void *opaque,
> +                                            Error **errp)
> +{
> +    VirtIOBalloon *s = opaque;
> +    Error *local_err = NULL;
> +    int64_t value;
> +
> +    visit_type_int(v, name, &value, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +    if (value < 0) {
> +        error_setg(errp, "free page wait time must not be negative");
> +        return;
> +    }
> +
> +    if (value > UINT32_MAX) {
> +        error_setg(errp, "free page wait time value is too big");
> +        return;
> +    }
> +
> +    s->free_page_wait_time = value;
> +    g_assert(s->free_page_timer == NULL);
> +    s->free_page_timer = timer_new_ms(QEMU_CLOCK_REALTIME,
> +                                      balloon_stop_free_page_report, s);
> +}
> +
>  static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>  {
>      VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
> @@ -330,6 +389,7 @@ static void virtio_balloon_poll_free_page_hints(VirtIOBalloon *dev)
>              if (id == dev->free_page_report_cmd_id) {
>                  dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
>              } else {
> +                dev->host_stop_free_page = false;
>                  dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
>                  break;
>              }
> @@ -385,6 +445,10 @@ static void virtio_balloon_free_page_poll(void *opaque)
>      virtio_notify_config(vdev);
>      s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED;
>  
> +    if (s->free_page_wait_time) {
> +        balloon_free_page_change_timer(s, s->free_page_wait_time);
> +    }
> +
>      virtio_balloon_poll_free_page_hints(s);
>  }
>  
> @@ -395,7 +459,19 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
>  
>      config.num_pages = cpu_to_le32(dev->num_pages);
>      config.actual = cpu_to_le32(dev->actual);
> -    config.free_page_report_cmd_id = cpu_to_le32(dev->free_page_report_cmd_id);
> +    if (dev->host_stop_free_page) {
> +        /*
> +         * Host is actively requesting to stop the free page reporting, send
> +         * the stop sign to the guest. This happens when the migration thread
> +         * has reached the phase to send pages to the destination while the
> +         * guest hasn't done the reporting.
> +         */
> +        config.free_page_report_cmd_id =
> +                                    VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID;
> +    } else {
> +        config.free_page_report_cmd_id =
> +                                    cpu_to_le32(dev->free_page_report_cmd_id);
> +    }
>  
>      trace_virtio_balloon_get_config(config.num_pages, config.actual);
>      memcpy(config_data, &config, sizeof(struct virtio_balloon_config));
> @@ -539,6 +615,7 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
>          s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE, NULL);
>          s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
>          s->free_page_report_cmd_id = VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID;
> +        s->host_stop_free_page = false;
>      }
>      reset_stats(s);
>  }
> @@ -602,6 +679,11 @@ static void virtio_balloon_instance_init(Object *obj)
>                          balloon_stats_get_poll_interval,
>                          balloon_stats_set_poll_interval,
>                          NULL, s, NULL);
> +
> +    object_property_add(obj, "free-page-wait-time", "int",
> +                        balloon_free_page_get_wait_time,
> +                        balloon_free_page_set_wait_time,
> +                        NULL, s, NULL);
>  }
>  
>  static const VMStateDescription vmstate_virtio_balloon = {
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index 6c75cca..3345104 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -2250,6 +2250,9 @@ static void virtio_balloon_pci_instance_init(Object *obj)
>      object_property_add_alias(obj, "guest-stats-polling-interval",
>                                OBJECT(&dev->vdev),
>                                "guest-stats-polling-interval", &error_abort);
> +    object_property_add_alias(obj, "free-page-wait-time",
> +                              OBJECT(&dev->vdev),
> +                              "free-page-wait-time", &error_abort);
>  }
>  
>  static const TypeInfo virtio_balloon_pci_info = {
> diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
> index 11b4e01..c16855b 100644
> --- a/include/hw/virtio/virtio-balloon.h
> +++ b/include/hw/virtio/virtio-balloon.h
> @@ -40,6 +40,8 @@ enum virtio_balloon_free_page_report_status {
>  typedef struct VirtIOBalloon {
>      VirtIODevice parent_obj;
>      VirtQueue *ivq, *dvq, *svq, *free_page_vq;
> +    /* Host is requesting the guest to stop free page reporting */
> +    bool host_stop_free_page;
>      uint32_t free_page_report_status;
>      uint32_t num_pages;
>      uint32_t actual;
> @@ -49,8 +51,10 @@ typedef struct VirtIOBalloon {
>      VirtQueueElement *stats_vq_elem;
>      size_t stats_vq_offset;
>      QEMUTimer *stats_timer;
> +    QEMUTimer *free_page_timer;
>      int64_t stats_last_update;
>      int64_t stats_poll_interval;
> +    int64_t free_page_wait_time;
>      uint32_t host_features;
>  } VirtIOBalloon;
>  
> -- 
> 1.8.3.1

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [virtio-dev] Re: [PATCH v2 3/3] virtio-balloon: add a timer to limit the free page report waiting time
@ 2018-02-06 23:43     ` Michael S. Tsirkin
  0 siblings, 0 replies; 40+ messages in thread
From: Michael S. Tsirkin @ 2018-02-06 23:43 UTC (permalink / raw)
  To: Wei Wang
  Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini,
	liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel

On Tue, Feb 06, 2018 at 07:08:19PM +0800, Wei Wang wrote:
> This patch adds a timer to limit the time that host waits for the free
> page hints reported by the guest. Users can specify the time in ms via
> "free-page-wait-time" command line option. If a user doesn't specify a
> time, host waits till the guest finishes reporting all the free page
> hints. The policy (wait for all the free page hints to be reported or
> use a time limit) is determined by the orchestration layer.
> 
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> CC: Michael S. Tsirkin <mst@redhat.com>

Looks like an option the migration command should get,
as opposed to a device feature.

> ---
>  hw/virtio/virtio-balloon.c         | 84 +++++++++++++++++++++++++++++++++++++-
>  hw/virtio/virtio-pci.c             |  3 ++
>  include/hw/virtio/virtio-balloon.h |  4 ++
>  3 files changed, 90 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index b424d4e..9ee0de4 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -207,6 +207,65 @@ static void balloon_stats_set_poll_interval(Object *obj, Visitor *v,
>      balloon_stats_change_timer(s, 0);
>  }
>  
> +static void balloon_free_page_change_timer(VirtIOBalloon *s, int64_t ms)
> +{
> +    timer_mod(s->free_page_timer,
> +              qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + ms);
> +}
> +
> +static void balloon_stop_free_page_report(void *opaque)
> +{
> +    VirtIOBalloon *dev = opaque;
> +    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> +
> +    timer_del(dev->free_page_timer);
> +    timer_free(dev->free_page_timer);
> +    dev->free_page_timer = NULL;
> +
> +    if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) {
> +        dev->host_stop_free_page = true;
> +        virtio_notify_config(vdev);
> +    }
> +}
> +
> +static void balloon_free_page_get_wait_time(Object *obj, Visitor *v,
> +                                            const char *name, void *opaque,
> +                                            Error **errp)
> +{
> +    VirtIOBalloon *s = opaque;
> +
> +    visit_type_int(v, name, &s->free_page_wait_time, errp);
> +}
> +
> +static void balloon_free_page_set_wait_time(Object *obj, Visitor *v,
> +                                            const char *name, void *opaque,
> +                                            Error **errp)
> +{
> +    VirtIOBalloon *s = opaque;
> +    Error *local_err = NULL;
> +    int64_t value;
> +
> +    visit_type_int(v, name, &value, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +    if (value < 0) {
> +        error_setg(errp, "free page wait time must not be negative");
> +        return;
> +    }
> +
> +    if (value > UINT32_MAX) {
> +        error_setg(errp, "free page wait time value is too big");
> +        return;
> +    }
> +
> +    s->free_page_wait_time = value;
> +    g_assert(s->free_page_timer == NULL);
> +    s->free_page_timer = timer_new_ms(QEMU_CLOCK_REALTIME,
> +                                      balloon_stop_free_page_report, s);
> +}
> +
>  static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>  {
>      VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
> @@ -330,6 +389,7 @@ static void virtio_balloon_poll_free_page_hints(VirtIOBalloon *dev)
>              if (id == dev->free_page_report_cmd_id) {
>                  dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
>              } else {
> +                dev->host_stop_free_page = false;
>                  dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
>                  break;
>              }
> @@ -385,6 +445,10 @@ static void virtio_balloon_free_page_poll(void *opaque)
>      virtio_notify_config(vdev);
>      s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED;
>  
> +    if (s->free_page_wait_time) {
> +        balloon_free_page_change_timer(s, s->free_page_wait_time);
> +    }
> +
>      virtio_balloon_poll_free_page_hints(s);
>  }
>  
> @@ -395,7 +459,19 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
>  
>      config.num_pages = cpu_to_le32(dev->num_pages);
>      config.actual = cpu_to_le32(dev->actual);
> -    config.free_page_report_cmd_id = cpu_to_le32(dev->free_page_report_cmd_id);
> +    if (dev->host_stop_free_page) {
> +        /*
> +         * The host is actively requesting that free page reporting stop,
> +         * so send the stop signal to the guest. This happens when the
> +         * migration thread has reached the phase of sending pages to the
> +         * destination before the guest has finished reporting.
> +         */
> +        config.free_page_report_cmd_id =
> +                                    VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID;
> +    } else {
> +        config.free_page_report_cmd_id =
> +                                    cpu_to_le32(dev->free_page_report_cmd_id);
> +    }
>  
>      trace_virtio_balloon_get_config(config.num_pages, config.actual);
>      memcpy(config_data, &config, sizeof(struct virtio_balloon_config));
> @@ -539,6 +615,7 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
>          s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE, NULL);
>          s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
>          s->free_page_report_cmd_id = VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID;
> +        s->host_stop_free_page = false;
>      }
>      reset_stats(s);
>  }
> @@ -602,6 +679,11 @@ static void virtio_balloon_instance_init(Object *obj)
>                          balloon_stats_get_poll_interval,
>                          balloon_stats_set_poll_interval,
>                          NULL, s, NULL);
> +
> +    object_property_add(obj, "free-page-wait-time", "int",
> +                        balloon_free_page_get_wait_time,
> +                        balloon_free_page_set_wait_time,
> +                        NULL, s, NULL);
>  }
>  
>  static const VMStateDescription vmstate_virtio_balloon = {
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index 6c75cca..3345104 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -2250,6 +2250,9 @@ static void virtio_balloon_pci_instance_init(Object *obj)
>      object_property_add_alias(obj, "guest-stats-polling-interval",
>                                OBJECT(&dev->vdev),
>                                "guest-stats-polling-interval", &error_abort);
> +    object_property_add_alias(obj, "free-page-wait-time",
> +                              OBJECT(&dev->vdev),
> +                              "free-page-wait-time", &error_abort);
>  }
>  
>  static const TypeInfo virtio_balloon_pci_info = {
> diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
> index 11b4e01..c16855b 100644
> --- a/include/hw/virtio/virtio-balloon.h
> +++ b/include/hw/virtio/virtio-balloon.h
> @@ -40,6 +40,8 @@ enum virtio_balloon_free_page_report_status {
>  typedef struct VirtIOBalloon {
>      VirtIODevice parent_obj;
>      VirtQueue *ivq, *dvq, *svq, *free_page_vq;
> +    /* Host is requesting the guest to stop free page reporting */
> +    bool host_stop_free_page;
>      uint32_t free_page_report_status;
>      uint32_t num_pages;
>      uint32_t actual;
> @@ -49,8 +51,10 @@ typedef struct VirtIOBalloon {
>      VirtQueueElement *stats_vq_elem;
>      size_t stats_vq_offset;
>      QEMUTimer *stats_timer;
> +    QEMUTimer *free_page_timer;
>      int64_t stats_last_update;
>      int64_t stats_poll_interval;
> +    int64_t free_page_wait_time;
>      uint32_t host_features;
>  } VirtIOBalloon;
>  
> -- 
> 1.8.3.1

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org



* Re: [Qemu-devel] [PATCH v2 2/3] migration: use the free page reporting feature from balloon
  2018-02-06 11:08   ` [virtio-dev] " Wei Wang
@ 2018-02-06 23:57     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 40+ messages in thread
From: Michael S. Tsirkin @ 2018-02-06 23:57 UTC (permalink / raw)
  To: Wei Wang
  Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini,
	liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel

On Tue, Feb 06, 2018 at 07:08:18PM +0800, Wei Wang wrote:
> Use the free page reporting feature from the balloon device to clear the
> bits corresponding to guest free pages from the dirty bitmap, so that the
> free memory is not sent.
> 
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> CC: Michael S. Tsirkin <mst@redhat.com>
> CC: Juan Quintela <quintela@redhat.com>

What the patch seems to do is stop migration
completely - blocking until guest completes the reporting.

Which makes no sense to me, since it's just an optimization.
Why not proceed with the migration? What do we have to lose?

I imagine some people might want to defer migration until reporting
completes to reduce the load on the network. Fair enough,
but it does not look like you actually measured the reduction
in traffic. So I suggest you work on that as a separate feature.


> ---
>  migration/ram.c | 24 ++++++++++++++++++++----
>  1 file changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index d6f462c..4fe16d2 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -49,6 +49,7 @@
>  #include "qemu/rcu_queue.h"
>  #include "migration/colo.h"
>  #include "migration/block.h"
> +#include "sysemu/balloon.h"
>  
>  /***********************************************************/
>  /* ram save/restore */
> @@ -206,6 +207,10 @@ struct RAMState {
>      uint32_t last_version;
>      /* We are in the first round */
>      bool ram_bulk_stage;
> +    /* The feature, skipping the transfer of free pages, is supported */
> +    bool free_page_support;
> +    /* Skip the transfer of free pages in the bulk stage */
> +    bool free_page_done;
>      /* How many times we have dirty too many pages */
>      int dirty_rate_high_cnt;
>      /* these variables are used for bitmap sync */
> @@ -773,7 +778,7 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
>      unsigned long *bitmap = rb->bmap;
>      unsigned long next;
>  
> -    if (rs->ram_bulk_stage && start > 0) {
> +    if (rs->ram_bulk_stage && start > 0 && !rs->free_page_support) {
>          next = start + 1;
>      } else {
>          next = find_next_bit(bitmap, size, start);
> @@ -1653,6 +1658,8 @@ static void ram_state_reset(RAMState *rs)
>      rs->last_page = 0;
>      rs->last_version = ram_list.version;
>      rs->ram_bulk_stage = true;
> +    rs->free_page_support = balloon_free_page_support();
> +    rs->free_page_done = false;
>  }
>  
>  #define MAX_WAIT 50 /* ms, half buffered_file limit */
> @@ -2135,7 +2142,7 @@ static int ram_state_init(RAMState **rsp)
>      return 0;
>  }
>  
> -static void ram_list_init_bitmaps(void)
> +static void ram_list_init_bitmaps(RAMState *rs)
>  {
>      RAMBlock *block;
>      unsigned long pages;
> @@ -2145,7 +2152,11 @@ static void ram_list_init_bitmaps(void)
>          QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>              pages = block->max_length >> TARGET_PAGE_BITS;
>              block->bmap = bitmap_new(pages);
> -            bitmap_set(block->bmap, 0, pages);
> +            if (rs->free_page_support) {
> +                bitmap_set(block->bmap, 1, pages);
> +            } else {
> +                bitmap_set(block->bmap, 0, pages);
> +            }
>              if (migrate_postcopy_ram()) {
>                  block->unsentmap = bitmap_new(pages);
>                  bitmap_set(block->unsentmap, 0, pages);
> @@ -2161,7 +2172,7 @@ static void ram_init_bitmaps(RAMState *rs)
>      qemu_mutex_lock_ramlist();
>      rcu_read_lock();
>  
> -    ram_list_init_bitmaps();
> +    ram_list_init_bitmaps(rs);
>      memory_global_dirty_log_start();
>      migration_bitmap_sync(rs);
>  
> @@ -2275,6 +2286,11 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>  
>      ram_control_before_iterate(f, RAM_CONTROL_ROUND);
>  
> +    if (rs->free_page_support && !rs->free_page_done) {
> +        balloon_free_page_poll();
> +        rs->free_page_done = true;
> +    }
> +
>      t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>      i = 0;
>      while ((ret = qemu_file_rate_limit(f)) == 0) {
> -- 
> 1.8.3.1



* Re: [Qemu-devel] [PATCH v2 0/3] virtio-balloon: free page hint reporting support
  2018-02-06 11:08 ` [virtio-dev] " Wei Wang
@ 2018-02-07  0:02   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 40+ messages in thread
From: Michael S. Tsirkin @ 2018-02-07  0:02 UTC (permalink / raw)
  To: Wei Wang
  Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini,
	liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel

On Tue, Feb 06, 2018 at 07:08:16PM +0800, Wei Wang wrote:
> This is the device-side implementation to add a new feature,
> VIRTIO_BALLOON_F_FREE_PAGE_HINT to the virtio-balloon device. The device
> receives the guest free page hints from the driver and clears the
> corresponding bits in the dirty bitmap, so that those free pages are
> not transferred by the migration thread to the destination.
> 
> Please see the driver patch link for test results:
> https://lkml.org/lkml/2018/2/4/60
> 
> ChangeLog:
> v1->v2: 
>     1) virtio-balloon
>         - use subsections to save free_page_report_cmd_id;
>         - poll the free page vq after sending a cmd id to the driver;
>         - change the free page vq size to VIRTQUEUE_MAX_SIZE;
>         - virtio_balloon_poll_free_page_hints: handle the corner case
>           that the free page block reported from the driver may cross
>           the RAMBlock boundary.
>     2) migration/ram.c
>         - use balloon_free_page_poll to start the optimization
> 
> Wei Wang (3):
>   virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
>   migration: use the free page reporting feature from balloon
>   virtio-balloon: add a timer to limit the free page report waiting time

This feature needs in-tree documentation about possible ways to use it,
tradeoffs involved etc.


>  balloon.c                                       |  39 ++--
>  hw/virtio/virtio-balloon.c                      | 227 ++++++++++++++++++++++--
>  hw/virtio/virtio-pci.c                          |   3 +
>  include/hw/virtio/virtio-balloon.h              |  15 +-
>  include/migration/misc.h                        |   3 +
>  include/standard-headers/linux/virtio_balloon.h |   7 +
>  include/sysemu/balloon.h                        |  12 +-
>  migration/ram.c                                 |  34 +++-
>  8 files changed, 307 insertions(+), 33 deletions(-)
> 
> -- 
> 1.8.3.1



* Re: [Qemu-devel] [PATCH v2 1/3] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
  2018-02-06 11:08   ` [virtio-dev] " Wei Wang
@ 2018-02-07  1:04     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 40+ messages in thread
From: Michael S. Tsirkin @ 2018-02-07  1:04 UTC (permalink / raw)
  To: Wei Wang
  Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini,
	liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel

On Tue, Feb 06, 2018 at 07:08:17PM +0800, Wei Wang wrote:
> The new feature enables the virtio-balloon device to receive the hint of
> guest free pages from the free page vq, and clears the corresponding bits
> of the free page from the dirty bitmap, so that those free pages are not
> transferred by the migration thread.
> 
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> Signed-off-by: Liang Li <liang.z.li@intel.com>
> CC: Michael S. Tsirkin <mst@redhat.com>
> CC: Juan Quintela <quintela@redhat.com>
> ---
>  balloon.c                                       |  39 +++++--
>  hw/virtio/virtio-balloon.c                      | 145 +++++++++++++++++++++---
>  include/hw/virtio/virtio-balloon.h              |  11 +-
>  include/migration/misc.h                        |   3 +
>  include/standard-headers/linux/virtio_balloon.h |   7 ++
>  include/sysemu/balloon.h                        |  12 +-
>  migration/ram.c                                 |  10 ++
>  7 files changed, 198 insertions(+), 29 deletions(-)
> 
> diff --git a/balloon.c b/balloon.c
> index 1d720ff..0f0b30c 100644
> --- a/balloon.c
> +++ b/balloon.c
> @@ -36,6 +36,8 @@
>  
>  static QEMUBalloonEvent *balloon_event_fn;
>  static QEMUBalloonStatus *balloon_stat_fn;
> +static QEMUBalloonFreePageSupport *balloon_free_page_support_fn;
> +static QEMUBalloonFreePagePoll *balloon_free_page_poll_fn;
>  static void *balloon_opaque;
>  static bool balloon_inhibited;
>  
> @@ -64,19 +66,34 @@ static bool have_balloon(Error **errp)
>      return true;
>  }
>  
> -int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
> -                             QEMUBalloonStatus *stat_func, void *opaque)
> +bool balloon_free_page_support(void)
>  {
> -    if (balloon_event_fn || balloon_stat_fn || balloon_opaque) {
> -        /* We're already registered one balloon handler.  How many can
> -         * a guest really have?
> -         */
> -        return -1;
> +    return balloon_free_page_support_fn &&
> +           balloon_free_page_support_fn(balloon_opaque);
> +}
> +
> +void balloon_free_page_poll(void)
> +{
> +    balloon_free_page_poll_fn(balloon_opaque);
> +}
> +
> +void qemu_add_balloon_handler(QEMUBalloonEvent *event_fn,
> +                              QEMUBalloonStatus *stat_fn,
> +                              QEMUBalloonFreePageSupport *free_page_support_fn,
> +                              QEMUBalloonFreePagePoll *free_page_poll_fn,
> +                              void *opaque)
> +{
> +    if (balloon_event_fn || balloon_stat_fn || balloon_free_page_support_fn ||
> +        balloon_free_page_poll_fn || balloon_opaque) {
> +        /* We already registered one balloon handler. */
> +        return;
>      }
> -    balloon_event_fn = event_func;
> -    balloon_stat_fn = stat_func;
> +
> +    balloon_event_fn = event_fn;
> +    balloon_stat_fn = stat_fn;
> +    balloon_free_page_support_fn = free_page_support_fn;
> +    balloon_free_page_poll_fn = free_page_poll_fn;
>      balloon_opaque = opaque;
> -    return 0;
>  }
>  
>  void qemu_remove_balloon_handler(void *opaque)
> @@ -86,6 +103,8 @@ void qemu_remove_balloon_handler(void *opaque)
>      }
>      balloon_event_fn = NULL;
>      balloon_stat_fn = NULL;
> +    balloon_free_page_support_fn = NULL;
> +    balloon_free_page_poll_fn = NULL;
>      balloon_opaque = NULL;
>  }
>  
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 14e08d2..b424d4e 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -23,6 +23,7 @@
>  #include "hw/virtio/virtio-balloon.h"
>  #include "sysemu/kvm.h"
>  #include "exec/address-spaces.h"
> +#include "exec/ram_addr.h"
>  #include "qapi/visitor.h"
>  #include "qapi-event.h"
>  #include "trace.h"
> @@ -30,6 +31,7 @@
>  
>  #include "hw/virtio/virtio-bus.h"
>  #include "hw/virtio/virtio-access.h"
> +#include "migration/misc.h"
>  
>  #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
>  
> @@ -305,6 +307,87 @@ out:
>      }
>  }
>  
> +static void virtio_balloon_poll_free_page_hints(VirtIOBalloon *dev)
> +{
> +    VirtQueueElement *elem;
> +    VirtQueue *vq = dev->free_page_vq;
> +    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> +    bool page_poisoning = virtio_vdev_has_feature(vdev,
> +                                              VIRTIO_BALLOON_F_PAGE_POISON);
> +    uint32_t id;
> +
> +    /* Poll the vq till a stop cmd id is received */
> +    while (dev->free_page_report_status != FREE_PAGE_REPORT_S_STOP) {
> +        elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
> +        if (!elem) {
> +            continue;
> +        }
> +
> +        if (elem->out_num) {
> +            iov_to_buf(elem->out_sg, elem->out_num, 0, &id, sizeof(uint32_t));
> +            virtqueue_push(vq, elem, sizeof(id));
> +            g_free(elem);
> +            if (id == dev->free_page_report_cmd_id) {
> +                dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
> +            } else {
> +                dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP;


So if migration on the source times out and you then migrate to
another destination, this will cancel the in-progress reporting
due to the while loop above. Probably not what was intended.

> +                break;
> +            }
> +        }
> +
> +        if (elem->in_num) {
> +            RAMBlock *block;
> +            ram_addr_t offset;
> +            void *base;
> +            size_t total_len, len;
> +
> +            if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START &&
> +                !page_poisoning) {

So when poisoning is enabled, you can't skip the pages?

I suspect if poison is 0 you actually can.
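To spell out the reviewer's point: a sketch of the check he seems to be suggesting (this is a hypothetical helper, not code from the patch — the patch only tests the feature bit, not the negotiated poison value):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A free page whose poison pattern is all-zero is indistinguishable from a
 * zero page on the destination, so hints could still be honored when
 * poisoning is negotiated but the poison value is 0. */
static bool can_skip_free_pages(bool poison_negotiated, uint32_t poison_val)
{
    return !poison_negotiated || poison_val == 0;
}
```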



> +                base = elem->in_sg[0].iov_base;
> +                total_len = elem->in_sg[0].iov_len;
> +                len = total_len;

The code below looks like it'd be better hidden behind an API.
How about an API that just gives hints to QEMU, e.g.
qemu_guest_page_free_hint(start, len)?
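The proposed entry point might look something like the toy sketch below. Everything here is illustrative: `qemu_guest_page_free_hint` is the reviewer's suggested name, not an existing QEMU function, and the bitmap is a stand-in for the migration dirty bitmap:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* matches VIRTIO_BALLOON_PFN_SHIFT */
#define NPAGES     64

static bool dirty[NPAGES];   /* toy stand-in for the migration dirty bitmap */

/* The device just reports "this guest range is free"; the migration code
 * owns the decision of what to clear.  Partial pages are ignored. */
static void qemu_guest_page_free_hint(uint64_t start, size_t len)
{
    uint64_t first = start >> PAGE_SHIFT;
    uint64_t npages = len >> PAGE_SHIFT;

    for (uint64_t i = first; i < first + npages && i < NPAGES; i++) {
        dirty[i] = false;
    }
}
```

With such an API the virtqueue-parsing loop in the balloon device would not need to know about RAMBlocks or dirty bitmaps at all.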

> +                while (total_len > 0) {
> +                    block = qemu_ram_block_from_host(base, false, &offset);
> +                    if (unlikely(offset + total_len > block->used_length)) {
> +                        len = block->used_length - offset;
> +                        base += len;
> +                    }
> +                    skip_free_pages_from_dirty_bitmap(block, offset, len);

For post-copy migration, this does not look like it will DTRT.

Doesn't look like it will DTRT for tcg either.

And generally, this only works as long as you don't call log_sync.

Once you call log_sync, you can't rely on old hints, and you must increment
the command id, which isn't a good API, in my opinion.

> +                    total_len -= len;
> +                }
> +            }
> +            virtqueue_push(vq, elem, total_len);
> +            g_free(elem);
> +        }
> +    }
> +}
> +
> +static bool virtio_balloon_free_page_support(void *opaque)
> +{
> +    VirtIOBalloon *s = opaque;
> +    VirtIODevice *vdev = VIRTIO_DEVICE(s);
> +
> +    return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT);
> +}
> +
> +static void virtio_balloon_free_page_poll(void *opaque)
> +{
> +    VirtIOBalloon *s = opaque;
> +    VirtIODevice *vdev = VIRTIO_DEVICE(s);
> +
> +    if (unlikely(s->free_page_report_cmd_id == UINT_MAX)) {
> +        s->free_page_report_cmd_id = 1;
> +    } else {
> +        s->free_page_report_cmd_id++;
> +    }

I think it is prudent to reserve a range of command IDs that we don't use.
These can then be used later for more commands.

E.g. how about VIRTIO_BALLOON_FREE_PAGE_REPORT_MIN_ID 0x80000000
and let the ID go between 0x80000000 and UINT_MAX?

No guest changes needed, but patching the uapi header in Linux to add the
enum might not be a bad idea.
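The reserved-range wraparound suggested above could look like this sketch (the `VIRTIO_BALLOON_FREE_PAGE_REPORT_MIN_ID` macro is the reviewer's proposal, not an existing definition in the uapi header):

```c
#include <assert.h>
#include <stdint.h>

/* Proposed: IDs below MIN_ID stay reserved for future commands;
 * reporting IDs cycle within [MIN_ID, UINT32_MAX]. */
#define VIRTIO_BALLOON_FREE_PAGE_REPORT_MIN_ID 0x80000000u

static uint32_t next_free_page_cmd_id(uint32_t cur)
{
    if (cur < VIRTIO_BALLOON_FREE_PAGE_REPORT_MIN_ID || cur == UINT32_MAX) {
        return VIRTIO_BALLOON_FREE_PAGE_REPORT_MIN_ID;
    }
    return cur + 1;
}
```

This replaces the patch's `cmd_id == UINT_MAX ? 1 : cmd_id + 1` scheme while keeping the stop ID (0) and the low half of the space free for later use.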


> +
> +    virtio_notify_config(vdev);
> +    s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED;
> +
> +    virtio_balloon_poll_free_page_hints(s);
> +}
> +
>  static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
>  {
>      VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
> @@ -312,6 +395,7 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
>  
>      config.num_pages = cpu_to_le32(dev->num_pages);
>      config.actual = cpu_to_le32(dev->actual);
> +    config.free_page_report_cmd_id = cpu_to_le32(dev->free_page_report_cmd_id);
>  
>      trace_virtio_balloon_get_config(config.num_pages, config.actual);
>      memcpy(config_data, &config, sizeof(struct virtio_balloon_config));

You need to set config.poison_val as well.

> @@ -365,6 +449,7 @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
>                          ((ram_addr_t) dev->actual << VIRTIO_BALLOON_PFN_SHIFT),
>                          &error_abort);
>      }
> +    dev->poison_val = le32_to_cpu(config.poison_val);

This will have to be migrated.


>      trace_virtio_balloon_set_config(dev->actual, oldactual);
>  }
>  
> @@ -374,6 +459,7 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
>      VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
>      f |= dev->host_features;
>      virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ);
> +    virtio_add_feature(&f, VIRTIO_BALLOON_F_PAGE_POISON);

Needs a compat property to avoid breaking cross-version migration.

>      return f;
>  }
>  
> @@ -410,6 +496,17 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
>      return 0;
>  }
>  
> +static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
> +    .name = "virtio-balloon-device/free-page-report",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .needed = virtio_balloon_free_page_support,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
> +        VMSTATE_END_OF_LIST()
> +    }
> +};
> +
>  static const VMStateDescription vmstate_virtio_balloon_device = {
>      .name = "virtio-balloon-device",
>      .version_id = 1,
> @@ -420,30 +517,29 @@ static const VMStateDescription vmstate_virtio_balloon_device = {
>          VMSTATE_UINT32(actual, VirtIOBalloon),
>          VMSTATE_END_OF_LIST()
>      },
> +    .subsections = (const VMStateDescription * []) {
> +        &vmstate_virtio_balloon_free_page_report,
> +        NULL
> +    }
>  };
>  
>  static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
>  {
>      VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>      VirtIOBalloon *s = VIRTIO_BALLOON(dev);
> -    int ret;
>  
>      virtio_init(vdev, "virtio-balloon", VIRTIO_ID_BALLOON,
>                  sizeof(struct virtio_balloon_config));
>  
> -    ret = qemu_add_balloon_handler(virtio_balloon_to_target,
> -                                   virtio_balloon_stat, s);
> -
> -    if (ret < 0) {
> -        error_setg(errp, "Only one balloon device is supported");
> -        virtio_cleanup(vdev);
> -        return;
> -    }
> -
>      s->ivq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
>      s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
>      s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats);
> -
> +    if (virtio_has_feature(s->host_features,
> +                           VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
> +        s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE, NULL);
> +        s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
> +        s->free_page_report_cmd_id = VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID;
> +    }
>      reset_stats(s);
>  }
>  
> @@ -472,11 +568,26 @@ static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status)
>  {
>      VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
>  
> -    if (!s->stats_vq_elem && vdev->vm_running &&
> -        (status & VIRTIO_CONFIG_S_DRIVER_OK) && virtqueue_rewind(s->svq, 1)) {
> -        /* poll stats queue for the element we have discarded when the VM
> -         * was stopped */
> -        virtio_balloon_receive_stats(vdev, s->svq);
> +    if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
> +        if (!s->stats_vq_elem && vdev->vm_running &&
> +            virtqueue_rewind(s->svq, 1)) {
> +            /*
> +             * Poll stats queue for the element we have discarded when the VM
> +             * was stopped.
> +             */
> +            virtio_balloon_receive_stats(vdev, s->svq);
> +        }
> +
> +        if (virtio_balloon_free_page_support(s)) {
> +            qemu_add_balloon_handler(virtio_balloon_to_target,
> +                                     virtio_balloon_stat,
> +                                     virtio_balloon_free_page_support,
> +                                     virtio_balloon_free_page_poll,
> +                                     s);
> +        } else {
> +            qemu_add_balloon_handler(virtio_balloon_to_target,
> +                                     virtio_balloon_stat, NULL, NULL, s);
> +        }
>      }
>  }
>  
> @@ -506,6 +617,8 @@ static const VMStateDescription vmstate_virtio_balloon = {
>  static Property virtio_balloon_properties[] = {
>      DEFINE_PROP_BIT("deflate-on-oom", VirtIOBalloon, host_features,
>                      VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false),
> +    DEFINE_PROP_BIT("free-page-hint", VirtIOBalloon, host_features,
> +                    VIRTIO_BALLOON_F_FREE_PAGE_HINT, false),
>      DEFINE_PROP_END_OF_LIST(),
>  };
>  
> diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
> index 1ea13bd..11b4e01 100644
> --- a/include/hw/virtio/virtio-balloon.h
> +++ b/include/hw/virtio/virtio-balloon.h
> @@ -31,11 +31,20 @@ typedef struct virtio_balloon_stat_modern {
>         uint64_t val;
>  } VirtIOBalloonStatModern;
>  
> +enum virtio_balloon_free_page_report_status {
> +    FREE_PAGE_REPORT_S_REQUESTED,
> +    FREE_PAGE_REPORT_S_START,
> +    FREE_PAGE_REPORT_S_STOP,
> +};
> +
>  typedef struct VirtIOBalloon {
>      VirtIODevice parent_obj;
> -    VirtQueue *ivq, *dvq, *svq;
> +    VirtQueue *ivq, *dvq, *svq, *free_page_vq;
> +    uint32_t free_page_report_status;
>      uint32_t num_pages;
>      uint32_t actual;
> +    uint32_t free_page_report_cmd_id;
> +    uint32_t poison_val;
>      uint64_t stats[VIRTIO_BALLOON_S_NR];
>      VirtQueueElement *stats_vq_elem;
>      size_t stats_vq_offset;
> diff --git a/include/migration/misc.h b/include/migration/misc.h
> index 77fd4f5..6df419c 100644
> --- a/include/migration/misc.h
> +++ b/include/migration/misc.h
> @@ -14,11 +14,14 @@
>  #ifndef MIGRATION_MISC_H
>  #define MIGRATION_MISC_H
>  
> +#include "exec/cpu-common.h"
>  #include "qemu/notify.h"
>  
>  /* migration/ram.c */
>  
>  void ram_mig_init(void);
> +void skip_free_pages_from_dirty_bitmap(RAMBlock *block, ram_addr_t offset,
> +                                       size_t len);
>  
>  /* migration/block.c */
>  
> diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h
> index 9d06ccd..19d0d8b 100644
> --- a/include/standard-headers/linux/virtio_balloon.h
> +++ b/include/standard-headers/linux/virtio_balloon.h
> @@ -34,15 +34,22 @@
>  #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before reclaiming pages */
>  #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
>  #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
> +#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */
> +#define VIRTIO_BALLOON_F_PAGE_POISON	4 /* Guest is using page poisoning */
>  
>  /* Size of a PFN in the balloon interface. */
>  #define VIRTIO_BALLOON_PFN_SHIFT 12
>  
> +#define VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID 0
>  struct virtio_balloon_config {
>  	/* Number of pages host wants Guest to give up. */
>  	uint32_t num_pages;
>  	/* Number of pages we've actually got in balloon. */
>  	uint32_t actual;
> +	/* Free page report command id, readonly by guest */
> +	uint32_t free_page_report_cmd_id;
> +	/* Stores PAGE_POISON if page poisoning is in use */
> +	uint32_t poison_val;
>  };
>  
>  #define VIRTIO_BALLOON_S_SWAP_IN  0   /* Amount of memory swapped in */
> diff --git a/include/sysemu/balloon.h b/include/sysemu/balloon.h
> index af49e19..53db4dc 100644
> --- a/include/sysemu/balloon.h
> +++ b/include/sysemu/balloon.h
> @@ -18,11 +18,19 @@
>  
>  typedef void (QEMUBalloonEvent)(void *opaque, ram_addr_t target);
>  typedef void (QEMUBalloonStatus)(void *opaque, BalloonInfo *info);
> +typedef bool (QEMUBalloonFreePageSupport)(void *opaque);
> +typedef void (QEMUBalloonFreePagePoll)(void *opaque);
>  
> -int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
> -			     QEMUBalloonStatus *stat_func, void *opaque);
>  void qemu_remove_balloon_handler(void *opaque);
>  bool qemu_balloon_is_inhibited(void);
>  void qemu_balloon_inhibit(bool state);
> +bool balloon_free_page_support(void);
> +void balloon_free_page_poll(void);
> +
> +void qemu_add_balloon_handler(QEMUBalloonEvent *event_fn,
> +                              QEMUBalloonStatus *stat_fn,
> +                              QEMUBalloonFreePageSupport *free_page_support_fn,
> +                              QEMUBalloonFreePagePoll *free_page_poll_fn,
> +                              void *opaque);
>  
>  #endif
> diff --git a/migration/ram.c b/migration/ram.c
> index cb1950f..d6f462c 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2186,6 +2186,16 @@ static int ram_init_all(RAMState **rsp)
>      return 0;
>  }
>  
> +void skip_free_pages_from_dirty_bitmap(RAMBlock *block, ram_addr_t offset,
> +                                       size_t len)
> +{
> +    long start = offset >> TARGET_PAGE_BITS,
> +         nr = len >> TARGET_PAGE_BITS;

start is the length calculated from length?
What does this mean?

I suspect a typo, and this makes me doubt this was tested
under any kind of stress. When poking at the migration
stream, you can't just test with a completely idle guest.

After having reviewed 26 versions of the guest support code,
I admit to a bit of burnout. How about you work harder
on isolating the migration bits from the virtio bits, then get
the migration core devs to review the migration parts of
the patchset?
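For reference, the cross-RAMBlock splitting that virtio_balloon_poll_free_page_hints attempts can be sketched with plain arithmetic. This is a sketch with stubbed types (a single fake block, hypothetical helper names), not the patch's code — note the patch advances `base` before clamping and never resets `len`, which is part of what the review questions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint64_t used_length; } StubRAMBlock;

/* A hinted range may cross the end of a block; only the part inside
 * the block can be cleared from that block's dirty bitmap. */
static size_t clamp_to_block(const StubRAMBlock *block, uint64_t offset,
                             size_t total_len)
{
    if (offset + total_len > block->used_length) {
        return (size_t)(block->used_length - offset);
    }
    return total_len;
}

/* Walk a hinted range, clamping one chunk per block.  Returns the number
 * of chunks processed (the stub stops at the end of its single block). */
static size_t split_hint(const StubRAMBlock *block, uint64_t offset, size_t len)
{
    size_t chunks = 0;
    while (len > 0) {
        size_t cur = clamp_to_block(block, offset, len);
        /* real code: skip_free_pages_from_dirty_bitmap(block, offset, cur); */
        offset += cur;   /* advance only after the chunk is consumed */
        len -= cur;
        chunks++;
        if (offset >= block->used_length) {
            break;       /* next chunk would belong to the next block */
        }
    }
    return chunks;
}
```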


> +
> +    bitmap_clear(block->bmap, start, nr);
> +    ram_state->migration_dirty_pages -= nr;
> +}
> +
>  /*
>   * Each of ram_save_setup, ram_save_iterate and ram_save_complete has
>   * long-running RCU critical section.  When rcu-reclaims in the code
> -- 
> 1.8.3.1

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [virtio-dev] Re: [PATCH v2 1/3] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
@ 2018-02-07  1:04     ` Michael S. Tsirkin
  0 siblings, 0 replies; 40+ messages in thread
From: Michael S. Tsirkin @ 2018-02-07  1:04 UTC (permalink / raw)
  To: Wei Wang
  Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini,
	liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel

On Tue, Feb 06, 2018 at 07:08:17PM +0800, Wei Wang wrote:
> The new feature enables the virtio-balloon device to receive the hint of
> guest free pages from the free page vq, and clears the corresponding bits
> of the free page from the dirty bitmap, so that those free pages are not
> transferred by the migration thread.
> 
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> Signed-off-by: Liang Li <liang.z.li@intel.com>
> CC: Michael S. Tsirkin <mst@redhat.com>
> CC: Juan Quintela <quintela@redhat.com>
> ---
>  balloon.c                                       |  39 +++++--
>  hw/virtio/virtio-balloon.c                      | 145 +++++++++++++++++++++---
>  include/hw/virtio/virtio-balloon.h              |  11 +-
>  include/migration/misc.h                        |   3 +
>  include/standard-headers/linux/virtio_balloon.h |   7 ++
>  include/sysemu/balloon.h                        |  12 +-
>  migration/ram.c                                 |  10 ++
>  7 files changed, 198 insertions(+), 29 deletions(-)
> 
> diff --git a/balloon.c b/balloon.c
> index 1d720ff..0f0b30c 100644
> --- a/balloon.c
> +++ b/balloon.c
> @@ -36,6 +36,8 @@
>  
>  static QEMUBalloonEvent *balloon_event_fn;
>  static QEMUBalloonStatus *balloon_stat_fn;
> +static QEMUBalloonFreePageSupport *balloon_free_page_support_fn;
> +static QEMUBalloonFreePagePoll *balloon_free_page_poll_fn;
>  static void *balloon_opaque;
>  static bool balloon_inhibited;
>  
> @@ -64,19 +66,34 @@ static bool have_balloon(Error **errp)
>      return true;
>  }
>  
> -int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
> -                             QEMUBalloonStatus *stat_func, void *opaque)
> +bool balloon_free_page_support(void)
>  {
> -    if (balloon_event_fn || balloon_stat_fn || balloon_opaque) {
> -        /* We're already registered one balloon handler.  How many can
> -         * a guest really have?
> -         */
> -        return -1;
> +    return balloon_free_page_support_fn &&
> +           balloon_free_page_support_fn(balloon_opaque);
> +}
> +
> +void balloon_free_page_poll(void)
> +{
> +    balloon_free_page_poll_fn(balloon_opaque);
> +}
> +
> +void qemu_add_balloon_handler(QEMUBalloonEvent *event_fn,
> +                              QEMUBalloonStatus *stat_fn,
> +                              QEMUBalloonFreePageSupport *free_page_support_fn,
> +                              QEMUBalloonFreePagePoll *free_page_poll_fn,
> +                              void *opaque)
> +{
> +    if (balloon_event_fn || balloon_stat_fn || balloon_free_page_support_fn ||
> +        balloon_free_page_poll_fn || balloon_opaque) {
> +        /* We already registered one balloon handler. */
> +        return;
>      }
> -    balloon_event_fn = event_func;
> -    balloon_stat_fn = stat_func;
> +
> +    balloon_event_fn = event_fn;
> +    balloon_stat_fn = stat_fn;
> +    balloon_free_page_support_fn = free_page_support_fn;
> +    balloon_free_page_poll_fn = free_page_poll_fn;
>      balloon_opaque = opaque;
> -    return 0;
>  }
>  
>  void qemu_remove_balloon_handler(void *opaque)
> @@ -86,6 +103,8 @@ void qemu_remove_balloon_handler(void *opaque)
>      }
>      balloon_event_fn = NULL;
>      balloon_stat_fn = NULL;
> +    balloon_free_page_support_fn = NULL;
> +    balloon_free_page_poll_fn = NULL;
>      balloon_opaque = NULL;
>  }
>  
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 14e08d2..b424d4e 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -23,6 +23,7 @@
>  #include "hw/virtio/virtio-balloon.h"
>  #include "sysemu/kvm.h"
>  #include "exec/address-spaces.h"
> +#include "exec/ram_addr.h"
>  #include "qapi/visitor.h"
>  #include "qapi-event.h"
>  #include "trace.h"
> @@ -30,6 +31,7 @@
>  
>  #include "hw/virtio/virtio-bus.h"
>  #include "hw/virtio/virtio-access.h"
> +#include "migration/misc.h"
>  
>  #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
>  
> @@ -305,6 +307,87 @@ out:
>      }
>  }
>  
> +static void virtio_balloon_poll_free_page_hints(VirtIOBalloon *dev)
> +{
> +    VirtQueueElement *elem;
> +    VirtQueue *vq = dev->free_page_vq;
> +    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> +    bool page_poisoning = virtio_vdev_has_feature(vdev,
> +                                              VIRTIO_BALLOON_F_PAGE_POISON);
> +    uint32_t id;
> +
> +    /* Poll the vq till a stop cmd id is received */
> +    while (dev->free_page_report_status != FREE_PAGE_REPORT_S_STOP) {
> +        elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
> +        if (!elem) {
> +            continue;
> +        }
> +
> +        if (elem->out_num) {
> +            iov_to_buf(elem->out_sg, elem->out_num, 0, &id, sizeof(uint32_t));
> +            virtqueue_push(vq, elem, sizeof(id));
> +            g_free(elem);
> +            if (id == dev->free_page_report_cmd_id) {
> +                dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
> +            } else {
> +                dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP;


So if migration on source times out, then you migrate to
another destination, this will cancel the in-progress reporting
due to while loop above.  Probably not what was intended.

> +                break;
> +            }
> +        }
> +
> +        if (elem->in_num) {
> +            RAMBlock *block;
> +            ram_addr_t offset;
> +            void *base;
> +            size_t total_len, len;
> +
> +            if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START &&
> +                !page_poisoning) {

So when poisoning is enabled, you can't skip the pages?

I suspect if poison is 0 you actually can.



> +                base = elem->in_sg[0].iov_base;
> +                total_len = elem->in_sg[0].iov_len;
> +                len = total_len;

the below list looks like it'd be a better API.
How about an API that just gives hints to qemu?
qemu_guest_page_free_hint(start, len)?

> +                while (total_len > 0) {
> +                    block = qemu_ram_block_from_host(base, false, &offset);
> +                    if (unlikely(offset + total_len > block->used_length)) {
> +                        len = block->used_length - offset;
> +                        base += len;
> +                    }
> +                    skip_free_pages_from_dirty_bitmap(block, offset, len);

For post-copy migration, this does not look like it will DTRT.

Doesn't look like it will DTRT for tcg either.

And generally, this only works as long as you don't call log_sync.

Once you call log_sync, you can't rely on old hints, and you must increment
command id.

which isn't a good API, in my opinion.

> +                    total_len -= len;
> +                }
> +            }
> +            virtqueue_push(vq, elem, total_len);
> +            g_free(elem);
> +        }
> +    }
> +}
> +
> +static bool virtio_balloon_free_page_support(void *opaque)
> +{
> +    VirtIOBalloon *s = opaque;
> +    VirtIODevice *vdev = VIRTIO_DEVICE(s);
> +
> +    return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT);
> +}
> +
> +static void virtio_balloon_free_page_poll(void *opaque)
> +{
> +    VirtIOBalloon *s = opaque;
> +    VirtIODevice *vdev = VIRTIO_DEVICE(s);
> +
> +    if (unlikely(s->free_page_report_cmd_id == UINT_MAX)) {
> +        s->free_page_report_cmd_id = 1;
> +    } else {
> +        s->free_page_report_cmd_id++;
> +    }

I think it is prudent to reserve a range of command IDs that we don't use.
These can then be used later for more commands.

E.g. how about VIRTIO_BALLOON_FREE_PAGE_REPORT_MIN_ID 0x80000000
and let the ID go between 0x80000000 and UINT_MAX?

No guest changes needed, but patching uapi in linux to add the enum
might not be a bad idea.


> +
> +    virtio_notify_config(vdev);
> +    s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED;
> +
> +    virtio_balloon_poll_free_page_hints(s);
> +}
> +
>  static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
>  {
>      VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
> @@ -312,6 +395,7 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
>  
>      config.num_pages = cpu_to_le32(dev->num_pages);
>      config.actual = cpu_to_le32(dev->actual);
> +    config.free_page_report_cmd_id = cpu_to_le32(dev->free_page_report_cmd_id);
>  
>      trace_virtio_balloon_get_config(config.num_pages, config.actual);
>      memcpy(config_data, &config, sizeof(struct virtio_balloon_config));

You need to set config.poison_val as well.

> @@ -365,6 +449,7 @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
>                          ((ram_addr_t) dev->actual << VIRTIO_BALLOON_PFN_SHIFT),
>                          &error_abort);
>      }
> +    dev->poison_val = le32_to_cpu(config.poison_val);

This will have to be migrated.


>      trace_virtio_balloon_set_config(dev->actual, oldactual);
>  }
>  
> @@ -374,6 +459,7 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
>      VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
>      f |= dev->host_features;
>      virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ);
> +    virtio_add_feature(&f, VIRTIO_BALLOON_F_PAGE_POISON);

Needs a compat property to avoid breaking cross-version migration.

>      return f;
>  }
>  
> @@ -410,6 +496,17 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
>      return 0;
>  }
>  
> +static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
> +    .name = "virtio-balloon-device/free-page-report",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .needed = virtio_balloon_free_page_support,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
> +        VMSTATE_END_OF_LIST()
> +    }
> +};
> +
>  static const VMStateDescription vmstate_virtio_balloon_device = {
>      .name = "virtio-balloon-device",
>      .version_id = 1,
> @@ -420,30 +517,29 @@ static const VMStateDescription vmstate_virtio_balloon_device = {
>          VMSTATE_UINT32(actual, VirtIOBalloon),
>          VMSTATE_END_OF_LIST()
>      },
> +    .subsections = (const VMStateDescription * []) {
> +        &vmstate_virtio_balloon_free_page_report,
> +        NULL
> +    }
>  };
>  
>  static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
>  {
>      VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>      VirtIOBalloon *s = VIRTIO_BALLOON(dev);
> -    int ret;
>  
>      virtio_init(vdev, "virtio-balloon", VIRTIO_ID_BALLOON,
>                  sizeof(struct virtio_balloon_config));
>  
> -    ret = qemu_add_balloon_handler(virtio_balloon_to_target,
> -                                   virtio_balloon_stat, s);
> -
> -    if (ret < 0) {
> -        error_setg(errp, "Only one balloon device is supported");
> -        virtio_cleanup(vdev);
> -        return;
> -    }
> -
>      s->ivq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
>      s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
>      s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats);
> -
> +    if (virtio_has_feature(s->host_features,
> +                           VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
> +        s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE, NULL);
> +        s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
> +        s->free_page_report_cmd_id = VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID;
> +    }
>      reset_stats(s);
>  }
>  
> @@ -472,11 +568,26 @@ static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status)
>  {
>      VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
>  
> -    if (!s->stats_vq_elem && vdev->vm_running &&
> -        (status & VIRTIO_CONFIG_S_DRIVER_OK) && virtqueue_rewind(s->svq, 1)) {
> -        /* poll stats queue for the element we have discarded when the VM
> -         * was stopped */
> -        virtio_balloon_receive_stats(vdev, s->svq);
> +    if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
> +        if (!s->stats_vq_elem && vdev->vm_running &&
> +            virtqueue_rewind(s->svq, 1)) {
> +            /*
> +             * Poll stats queue for the element we have discarded when the VM
> +             * was stopped.
> +             */
> +            virtio_balloon_receive_stats(vdev, s->svq);
> +        }
> +
> +        if (virtio_balloon_free_page_support(s)) {
> +            qemu_add_balloon_handler(virtio_balloon_to_target,
> +                                     virtio_balloon_stat,
> +                                     virtio_balloon_free_page_support,
> +                                     virtio_balloon_free_page_poll,
> +                                     s);
> +        } else {
> +            qemu_add_balloon_handler(virtio_balloon_to_target,
> +                                     virtio_balloon_stat, NULL, NULL, s);
> +        }
>      }
>  }
>  
> @@ -506,6 +617,8 @@ static const VMStateDescription vmstate_virtio_balloon = {
>  static Property virtio_balloon_properties[] = {
>      DEFINE_PROP_BIT("deflate-on-oom", VirtIOBalloon, host_features,
>                      VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false),
> +    DEFINE_PROP_BIT("free-page-hint", VirtIOBalloon, host_features,
> +                    VIRTIO_BALLOON_F_FREE_PAGE_HINT, false),
>      DEFINE_PROP_END_OF_LIST(),
>  };
>  
> diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
> index 1ea13bd..11b4e01 100644
> --- a/include/hw/virtio/virtio-balloon.h
> +++ b/include/hw/virtio/virtio-balloon.h
> @@ -31,11 +31,20 @@ typedef struct virtio_balloon_stat_modern {
>         uint64_t val;
>  } VirtIOBalloonStatModern;
>  
> +enum virtio_balloon_free_page_report_status {
> +    FREE_PAGE_REPORT_S_REQUESTED,
> +    FREE_PAGE_REPORT_S_START,
> +    FREE_PAGE_REPORT_S_STOP,
> +};
> +
>  typedef struct VirtIOBalloon {
>      VirtIODevice parent_obj;
> -    VirtQueue *ivq, *dvq, *svq;
> +    VirtQueue *ivq, *dvq, *svq, *free_page_vq;
> +    uint32_t free_page_report_status;
>      uint32_t num_pages;
>      uint32_t actual;
> +    uint32_t free_page_report_cmd_id;
> +    uint32_t poison_val;
>      uint64_t stats[VIRTIO_BALLOON_S_NR];
>      VirtQueueElement *stats_vq_elem;
>      size_t stats_vq_offset;
> diff --git a/include/migration/misc.h b/include/migration/misc.h
> index 77fd4f5..6df419c 100644
> --- a/include/migration/misc.h
> +++ b/include/migration/misc.h
> @@ -14,11 +14,14 @@
>  #ifndef MIGRATION_MISC_H
>  #define MIGRATION_MISC_H
>  
> +#include "exec/cpu-common.h"
>  #include "qemu/notify.h"
>  
>  /* migration/ram.c */
>  
>  void ram_mig_init(void);
> +void skip_free_pages_from_dirty_bitmap(RAMBlock *block, ram_addr_t offset,
> +                                       size_t len);
>  
>  /* migration/block.c */
>  
> diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h
> index 9d06ccd..19d0d8b 100644
> --- a/include/standard-headers/linux/virtio_balloon.h
> +++ b/include/standard-headers/linux/virtio_balloon.h
> @@ -34,15 +34,22 @@
>  #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before reclaiming pages */
>  #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
>  #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
> +#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */
> +#define VIRTIO_BALLOON_F_PAGE_POISON	4 /* Guest is using page poisoning */
>  
>  /* Size of a PFN in the balloon interface. */
>  #define VIRTIO_BALLOON_PFN_SHIFT 12
>  
> +#define VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID 0
>  struct virtio_balloon_config {
>  	/* Number of pages host wants Guest to give up. */
>  	uint32_t num_pages;
>  	/* Number of pages we've actually got in balloon. */
>  	uint32_t actual;
> +	/* Free page report command id, readonly by guest */
> +	uint32_t free_page_report_cmd_id;
> +	/* Stores PAGE_POISON if page poisoning is in use */
> +	uint32_t poison_val;
>  };
>  
>  #define VIRTIO_BALLOON_S_SWAP_IN  0   /* Amount of memory swapped in */
> diff --git a/include/sysemu/balloon.h b/include/sysemu/balloon.h
> index af49e19..53db4dc 100644
> --- a/include/sysemu/balloon.h
> +++ b/include/sysemu/balloon.h
> @@ -18,11 +18,19 @@
>  
>  typedef void (QEMUBalloonEvent)(void *opaque, ram_addr_t target);
>  typedef void (QEMUBalloonStatus)(void *opaque, BalloonInfo *info);
> +typedef bool (QEMUBalloonFreePageSupport)(void *opaque);
> +typedef void (QEMUBalloonFreePagePoll)(void *opaque);
>  
> -int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
> -			     QEMUBalloonStatus *stat_func, void *opaque);
>  void qemu_remove_balloon_handler(void *opaque);
>  bool qemu_balloon_is_inhibited(void);
>  void qemu_balloon_inhibit(bool state);
> +bool balloon_free_page_support(void);
> +void balloon_free_page_poll(void);
> +
> +void qemu_add_balloon_handler(QEMUBalloonEvent *event_fn,
> +                              QEMUBalloonStatus *stat_fn,
> +                              QEMUBalloonFreePageSupport *free_page_support_fn,
> +                              QEMUBalloonFreePagePoll *free_page_poll_fn,
> +                              void *opaque);
>  
>  #endif
> diff --git a/migration/ram.c b/migration/ram.c
> index cb1950f..d6f462c 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2186,6 +2186,16 @@ static int ram_init_all(RAMState **rsp)
>      return 0;
>  }
>  
> +void skip_free_pages_from_dirty_bitmap(RAMBlock *block, ram_addr_t offset,
> +                                       size_t len)
> +{
> +    long start = offset >> TARGET_PAGE_BITS,
> +         nr = len >> TARGET_PAGE_BITS;

Is `start` really the length calculated from `len`?
What does this mean?

I suspect a typo, and this makes me doubt this was tested
under any kind of stress. When poking at the migration
stream, you can't just test with a completely idle guest.

After having reviewed 26 versions of the guest support code,
I admit to a bit of burnout. How about you work harder
on isolating the migration bits from the virtio bits, then get
the migration core devs to review the migration parts of
the patchset?


> +
> +    bitmap_clear(block->bmap, start, nr);
> +    ram_state->migration_dirty_pages -= nr;
> +}
> +
>  /*
>   * Each of ram_save_setup, ram_save_iterate and ram_save_complete has
>   * long-running RCU critical section.  When rcu-reclaims in the code
> -- 
> 1.8.3.1

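[Editor's note] For readers following the hunk being questioned above, here is a minimal self-contained model of what it appears to intend. The names (`bmap`, `mark_dirty`, `skip_free_pages`) are simplified stand-ins, not QEMU's actual `bitmap_clear()`/`RAMState` API; note that the posted hunk decrements `migration_dirty_pages` by `nr` unconditionally, while this sketch only decrements for bits that were actually set:

```c
#include <assert.h>
#include <stddef.h>

#define TARGET_PAGE_BITS 12          /* 4 KiB target pages, as on x86 */
#define MAX_PAGES 64

/* Toy stand-ins for QEMU's per-RAMBlock dirty bitmap and the
 * RAMState dirty-page counter. */
static unsigned char bmap[MAX_PAGES];
static long migration_dirty_pages;

static void mark_dirty(long start, long nr)
{
    for (long i = start; i < start + nr; i++) {
        bmap[i] = 1;
        migration_dirty_pages++;
    }
}

/* Model of the hunk: convert the byte offset/length of a hinted
 * free-page block into page indexes with the same shift, clear those
 * bits, and reduce the dirty-page count.  Unlike the posted hunk,
 * only bits that were actually set are counted, which keeps
 * migration_dirty_pages consistent even for already-clear ranges. */
static void skip_free_pages(size_t offset, size_t len)
{
    long start = offset >> TARGET_PAGE_BITS;
    long nr = len >> TARGET_PAGE_BITS;

    for (long i = start; i < start + nr; i++) {
        if (bmap[i]) {
            bmap[i] = 0;
            migration_dirty_pages--;
        }
    }
}
```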
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/3] migration: use the free page reporting feature from balloon
  2018-02-06 23:57     ` [virtio-dev] " Michael S. Tsirkin
@ 2018-02-08  3:54       ` Wei Wang
  -1 siblings, 0 replies; 40+ messages in thread
From: Wei Wang @ 2018-02-08  3:54 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini,
	liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel

On 02/07/2018 07:57 AM, Michael S. Tsirkin wrote:
> On Tue, Feb 06, 2018 at 07:08:18PM +0800, Wei Wang wrote:
>> Use the free page reporting feature from the balloon device to clear the
>> bits corresponding to guest free pages from the dirty bitmap, so that the
>> free memory is not sent.
>>
>> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
>> CC: Michael S. Tsirkin <mst@redhat.com>
>> CC: Juan Quintela <quintela@redhat.com>
> What the patch seems to do is stop migration
> completely - blocking until guest completes the reporting.
>
> Which makes no sense to me, since it's just an optimization.
> Why not proceed with the migration? What do we have to lose?

If we want the optimization to run in parallel with the migration 
thread, we will need to create another polling thread, like 
multithreading compression. In that way, we will waste some host CPU. 
For example, the migration thread may proceed to send pages to the 
destination while the optimization thread is in progress, but those 
pages may turn out to be free pages (this is likely in the bulk stage) 
which don't need to be sent. In that case, why not let the migration 
thread wait a little bit (i.e. put the optimization into the migration 
thread) and proceed to do some useful things, instead of pretending to 
proceed but doing useless things?

The current plan of this patch is to skip free pages for the bulk stage 
only. I'm not sure if it would be useful for the 2nd stage onward, which 
basically relies on the dirty logging to send pages that have been 
written by the guest. For example, if the guest is not so active while 
live migration happens, there will be very few dirty bits. This 
optimization would be mostly clearing "0" bits from the dirty bitmap.


> I imagine some people might want to defer migration until reporting
> completes to reduce the load on the network. Fair enough,
> but it does not look like you actually measured the reduction
> in traffic. So I suggest you work on that as a separate feature.
>

I have the traffic data actually. Tested with an 8G idle guest, legacy vs. 
optimized: ~390MB vs. ~337MB.
The legacy case already has the zero-page checking optimization, so the traffic 
reduction is not very obvious. But zero checking has much more overhead, 
which is demonstrated by the migration time (the optimized migration takes 
~14% of the legacy migration time).


Best,
Wei

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/3] virtio-balloon: free page hint reporting support
  2018-02-07  0:02   ` [virtio-dev] " Michael S. Tsirkin
@ 2018-02-08  5:38     ` Wei Wang
  -1 siblings, 0 replies; 40+ messages in thread
From: Wei Wang @ 2018-02-08  5:38 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini,
	liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel

On 02/07/2018 08:02 AM, Michael S. Tsirkin wrote:
> On Tue, Feb 06, 2018 at 07:08:16PM +0800, Wei Wang wrote:
> >> This is the device part implementation to add a new feature,
>> VIRTIO_BALLOON_F_FREE_PAGE_HINT to the virtio-balloon device. The device
>> receives the guest free page hints from the driver and clears the
>> corresponding bits in the dirty bitmap, so that those free pages are
>> not transferred by the migration thread to the destination.
>>
>> Please see the driver patch link for test results:
>> https://lkml.org/lkml/2018/2/4/60
>>
>> ChangeLog:
>> v1->v2:
>>      1) virtio-balloon
>>          - use subsections to save free_page_report_cmd_id;
>>          - poll the free page vq after sending a cmd id to the driver;
>>          - change the free page vq size to VIRTQUEUE_MAX_SIZE;
>>          - virtio_balloon_poll_free_page_hints: handle the corner case
>>            that the free page block reported from the driver may cross
>>            the RAMBlock boundary.
>>      2) migration/ram.c
>>          - use balloon_free_page_poll to start the optimization
>>
>> Wei Wang (3):
>>    virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
>>    migration: use the free page reporting feature from balloon
>>    virtio-balloon: add a timer to limit the free page report waiting time
> This feature needs in-tree documentation about possible ways to use it,
> tradeoffs involved etc.

OK. I plan to add the documentation in later versions after we mostly 
finalize the QEMU part design.

Best,
Wei

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/3] virtio-balloon: free page hint reporting support
  2018-02-06 11:08 ` [virtio-dev] " Wei Wang
                   ` (4 preceding siblings ...)
  (?)
@ 2018-02-08 20:15 ` Dr. David Alan Gilbert
  2018-02-09  3:10     ` [virtio-dev] " Wei Wang
  -1 siblings, 1 reply; 40+ messages in thread
From: Dr. David Alan Gilbert @ 2018-02-08 20:15 UTC (permalink / raw)
  To: Wei Wang
  Cc: qemu-devel, virtio-dev, mst, quintela, pbonzini,
	liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel

* Wei Wang (wei.w.wang@intel.com) wrote:
> This is the device part implementation to add a new feature,
> VIRTIO_BALLOON_F_FREE_PAGE_HINT to the virtio-balloon device. The device
> receives the guest free page hints from the driver and clears the
> corresponding bits in the dirty bitmap, so that those free pages are
> not transferred by the migration thread to the destination.
> 
> Please see the driver patch link for test results:
> https://lkml.org/lkml/2018/2/4/60

Hi Wei,
   I'll look at the code a bit more - but first some more basic
questions on that lkml post:

    a) The idle guest time thing is a nice result; can you just state
       what the host was, speed of connection, and what other options
       you were using?

    b) The workload test, the one with the kernel compile; you list
       the kernel compile time but don't mention any changes in the
       migration times of the ping-pong; can you give those times as
       well?

    c) What's your real workload that this is aimed at?
       Is it really for people migrating idle VMs - or do you have some
       NFV application in mind, if so why not include a figure for
       those?

Thanks,

Dave

> ChangeLog:
> v1->v2: 
>     1) virtio-balloon
>         - use subsections to save free_page_report_cmd_id;
>         - poll the free page vq after sending a cmd id to the driver;
>         - change the free page vq size to VIRTQUEUE_MAX_SIZE;
>         - virtio_balloon_poll_free_page_hints: handle the corner case
>           that the free page block reported from the driver may cross
>           the RAMBlock boundary.
>     2) migration/ram.c
>         - use balloon_free_page_poll to start the optimization
> 
> Wei Wang (3):
>   virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
>   migration: use the free page reporting feature from balloon
>   virtio-balloon: add a timer to limit the free page report waiting time
> 
>  balloon.c                                       |  39 ++--
>  hw/virtio/virtio-balloon.c                      | 227 ++++++++++++++++++++++--
>  hw/virtio/virtio-pci.c                          |   3 +
>  include/hw/virtio/virtio-balloon.h              |  15 +-
>  include/migration/misc.h                        |   3 +
>  include/standard-headers/linux/virtio_balloon.h |   7 +
>  include/sysemu/balloon.h                        |  12 +-
>  migration/ram.c                                 |  34 +++-
>  8 files changed, 307 insertions(+), 33 deletions(-)
> 
> -- 
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/3] virtio-balloon: free page hint reporting support
  2018-02-08 20:15 ` [Qemu-devel] " Dr. David Alan Gilbert
@ 2018-02-09  3:10     ` Wei Wang
  0 siblings, 0 replies; 40+ messages in thread
From: Wei Wang @ 2018-02-09  3:10 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, virtio-dev, mst, quintela, pbonzini,
	liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel

On 02/09/2018 04:15 AM, Dr. David Alan Gilbert wrote:
> * Wei Wang (wei.w.wang@intel.com) wrote:
>> This is the device part implementation to add a new feature,
>> VIRTIO_BALLOON_F_FREE_PAGE_HINT to the virtio-balloon device. The device
>> receives the guest free page hints from the driver and clears the
>> corresponding bits in the dirty bitmap, so that those free pages are
>> not transferred by the migration thread to the destination.
>>
>> Please see the driver patch link for test results:
>> https://lkml.org/lkml/2018/2/4/60
> Hi Wei,
>     I'll look at the code a bit more - but first some more basic
> questions on that lkml post:
>
>      a) The idle guest time thing is a nice result; can you just state
>         what the host was, speed of connection, and what other options
>         you were using?
>
>      b) The workload test, the one with the kernel compile; you list
>         the kernel compile time but don't mention any changes in the
>         migration times of the ping-pong; can you give those times as
>         well?
>
>      c) What's your real workload that this is aimed at?
>         Is it really for people migrating idle VMs - or do you have some
>         NFV application in mind, if so why not include a figure for
>         those?
>

Hi Dave,

Thanks for joining the review. Please see below info.

a) Environment info
     - Host:
         - Physical CPU: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
         - kernel: 3.10.0

     - Guest:
         - kernel: 4.15.0
         - QEMU setup: -cpu host -M pc -smp 4,threads=1,sockets=1 -m 8G 
--mem-prealloc -realtime mlock=on -balloon virtio,free-page-hint=true

     - Migration setup:
         - migrate_set_speed 0
         - migrate_set_downtime 0.01  (10ms)

b) Michael asked the same question on the kernel patches; I'll reply 
there with you cc-ed, so that the kernel maintainers can also see it. Btw, 
do you have any other workloads you would suggest we try?

c) This feature is requested by many customers (e.g. general cloud 
vendors). It's for general use cases: as long as the guest has free 
memory, it will benefit from this optimization when doing migration. 
It's not specific to NFV usage, but NFV will certainly also benefit 
from this feature if we think about service chaining, where multiple VMs 
need to co-work with each other. In that case, migrating one VM will 
just break the working model, which means we would need to migrate all 
the VMs. A shorter migration time will be very helpful.


Best,
Wei

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/3] virtio-balloon: free page hint reporting support
  2018-02-09  3:10     ` [virtio-dev] " Wei Wang
  (?)
@ 2018-02-09 10:53     ` Dr. David Alan Gilbert
  2018-02-26  4:42         ` [virtio-dev] " Wei Wang
  -1 siblings, 1 reply; 40+ messages in thread
From: Dr. David Alan Gilbert @ 2018-02-09 10:53 UTC (permalink / raw)
  To: Wei Wang
  Cc: qemu-devel, virtio-dev, mst, quintela, pbonzini,
	liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel

* Wei Wang (wei.w.wang@intel.com) wrote:
> On 02/09/2018 04:15 AM, Dr. David Alan Gilbert wrote:
> > * Wei Wang (wei.w.wang@intel.com) wrote:
> > > This is the device part implementation to add a new feature,
> > > VIRTIO_BALLOON_F_FREE_PAGE_HINT to the virtio-balloon device. The device
> > > receives the guest free page hints from the driver and clears the
> > > corresponding bits in the dirty bitmap, so that those free pages are
> > > not transferred by the migration thread to the destination.
> > > 
> > > Please see the driver patch link for test results:
> > > https://lkml.org/lkml/2018/2/4/60
> > Hi Wei,
> >     I'll look at the code a bit more - but first some more basic
> > questions on that lkml post:
> > 
> >      a) The idle guest time thing is a nice result; can you just state
> >         what the host was, speed of connection, and what other options
> >         you were using?
> > 
> >      b) The workload test, the one with the kernel compile; you list
> >         the kernel compile time but don't mention any changes in the
> >         migration times of the ping-pong; can you give those times as
> >         well?
> > 
> >      c) What's your real workload that this is aimed at?
> >         Is it really for people migrating idle VMs - or do you have some
> >         NFV application in mind, if so why not include a figure for
> >         those?
> > 
> 
> Hi Dave,
> 
> Thanks for joining the review. Please see below info.
> 
> a) Environment info
>     - Host:
>         - Physical CPU: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
>         - kernel: 3.10.0
> 
>     - Guest:
>         - kernel: 4.15.0
>         - QEMU setup: -cpu host -M pc -smp 4,threads=1,sockets=1 -m 8G
> --mem-prealloc -realtime mlock=on -balloon virtio,free-page-hint=true
> 
>     - Migration setup:
>         - migrate_set_speed 0
>         - migrate_set_downtime 0.01  (10ms)

That's an unusually low downtime (and I'm not sure what setting the
speed to 0 does!).

> b) Michael asked the same question on the kernel patches, I'll reply there
> with you cc-ed, so that kernel maintainers can also see it. Btw, do you have
> any other workloads you would suggest to have a try?

No, not really; I guess it's best for VMs that are either idle or have
lots of spare RAM.

> c) This feature is requested by many customers (e.g. general cloud vendors).
> It's for general use cases. As long as the guest has free memory, it will
> benefit from this optimization when doing migration. It's not specific for
> NFV usages, but for sure NFV will also benefit from this feature if we think
> about service chaining, where multiple VMs need to co-work with each other.
> In that case, migrating one VM will just break the working model, which
> means we will need to migrate all the VMs. A shorter migration time will be
> very helpful.

I thought of NFV because their VMs tend to have lots of extra RAM, but 
most of it seems unused most of the time.

Dave

> 
> Best,
> Wei
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/3] migration: use the free page reporting feature from balloon
  2018-02-06 11:08   ` [virtio-dev] " Wei Wang
  (?)
  (?)
@ 2018-02-09 11:50   ` Dr. David Alan Gilbert
  2018-02-26  5:07     ` Wei Wang
  -1 siblings, 1 reply; 40+ messages in thread
From: Dr. David Alan Gilbert @ 2018-02-09 11:50 UTC (permalink / raw)
  To: Wei Wang
  Cc: qemu-devel, mst, quintela, pbonzini, liliang.opensource,
	yang.zhang.wz, quan.xu0, nilal

* Wei Wang (wei.w.wang@intel.com) wrote:
> Use the free page reporting feature from the balloon device to clear the
> bits corresponding to guest free pages from the dirty bitmap, so that the
> free memory is not sent.
> 
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> CC: Michael S. Tsirkin <mst@redhat.com>
> CC: Juan Quintela <quintela@redhat.com>
> ---
>  migration/ram.c | 24 ++++++++++++++++++++----
>  1 file changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index d6f462c..4fe16d2 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -49,6 +49,7 @@
>  #include "qemu/rcu_queue.h"
>  #include "migration/colo.h"
>  #include "migration/block.h"
> +#include "sysemu/balloon.h"
>  
>  /***********************************************************/
>  /* ram save/restore */
> @@ -206,6 +207,10 @@ struct RAMState {
>      uint32_t last_version;
>      /* We are in the first round */
>      bool ram_bulk_stage;
> +    /* The feature, skipping the transfer of free pages, is supported */
> +    bool free_page_support;
> +    /* Skip the transfer of free pages in the bulk stage */
> +    bool free_page_done;
>      /* How many times we have dirty too many pages */
>      int dirty_rate_high_cnt;
>      /* these variables are used for bitmap sync */
> @@ -773,7 +778,7 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
>      unsigned long *bitmap = rb->bmap;
>      unsigned long next;
>  
> -    if (rs->ram_bulk_stage && start > 0) {
> +    if (rs->ram_bulk_stage && start > 0 && !rs->free_page_support) {
>          next = start + 1;
>      } else {
>          next = find_next_bit(bitmap, size, start);
> @@ -1653,6 +1658,8 @@ static void ram_state_reset(RAMState *rs)
>      rs->last_page = 0;
>      rs->last_version = ram_list.version;
>      rs->ram_bulk_stage = true;
> +    rs->free_page_support = balloon_free_page_support();
> +    rs->free_page_done = false;
>  }
>  
>  #define MAX_WAIT 50 /* ms, half buffered_file limit */
> @@ -2135,7 +2142,7 @@ static int ram_state_init(RAMState **rsp)
>      return 0;
>  }
>  
> -static void ram_list_init_bitmaps(void)
> +static void ram_list_init_bitmaps(RAMState *rs)
>  {
>      RAMBlock *block;
>      unsigned long pages;
> @@ -2145,7 +2152,11 @@ static void ram_list_init_bitmaps(void)
>          QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>              pages = block->max_length >> TARGET_PAGE_BITS;
>              block->bmap = bitmap_new(pages);
> -            bitmap_set(block->bmap, 0, pages);
> +            if (rs->free_page_support) {
> +                bitmap_set(block->bmap, 1, pages);

I don't understand how it makes sense to do that here;
ignoring anything ese it means that migration_dirty_pages is wrong
which could end up with migration finishing before all real pages are
sent.

Dave

> +            } else {
> +                bitmap_set(block->bmap, 0, pages);
> +            }
>              if (migrate_postcopy_ram()) {
>                  block->unsentmap = bitmap_new(pages);
>                  bitmap_set(block->unsentmap, 0, pages);
> @@ -2161,7 +2172,7 @@ static void ram_init_bitmaps(RAMState *rs)
>      qemu_mutex_lock_ramlist();
>      rcu_read_lock();
>  
> -    ram_list_init_bitmaps();
> +    ram_list_init_bitmaps(rs);
>      memory_global_dirty_log_start();
>      migration_bitmap_sync(rs);
>  
> @@ -2275,6 +2286,11 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>  
>      ram_control_before_iterate(f, RAM_CONTROL_ROUND);
>  
> +    if (rs->free_page_support && !rs->free_page_done) {
> +        balloon_free_page_poll();
> +        rs->free_page_done = true;
> +    }
> +
>      t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>      i = 0;
>      while ((ret = qemu_file_rate_limit(f)) == 0) {
> -- 
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

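[Editor's note] On the `bitmap_set(block->bmap, 1, pages)` hunk Dave questions above: `bitmap_set(map, start, nr)` takes a start bit index and a bit count, so passing 1 as the second argument leaves bit 0 clear and marks one bit past `pages`. A toy model (a simplified stand-in, not QEMU's real bitmap implementation) makes the off-by-one visible:

```c
#include <assert.h>

#define NBITS 8
static unsigned char bits[NBITS];   /* one byte per bit, for simplicity */

/* Toy version of QEMU's bitmap_set(map, start, nr): set 'nr' bits
 * beginning at bit index 'start'. */
static void bitmap_set(unsigned char *map, long start, long nr)
{
    for (long i = start; i < start + nr; i++) {
        map[i] = 1;
    }
}
```

With `pages = 4`, the call `bitmap_set(bits, 1, pages)` sets bits 1..4 rather than 0..3, which is consistent with Dave's concern that the dirty state (and hence `migration_dirty_pages`) ends up wrong.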
^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/3] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
  2018-02-06 11:08   ` [virtio-dev] " Wei Wang
  (?)
  (?)
@ 2018-02-09 12:06   ` Dr. David Alan Gilbert
  -1 siblings, 0 replies; 40+ messages in thread
From: Dr. David Alan Gilbert @ 2018-02-09 12:06 UTC (permalink / raw)
  To: Wei Wang
  Cc: qemu-devel, virtio-dev, mst, quintela, pbonzini,
	liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel

* Wei Wang (wei.w.wang@intel.com) wrote:

<snip>

> @@ -374,6 +459,7 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
>      VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
>      f |= dev->host_features;
>      virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ);
> +    virtio_add_feature(&f, VIRTIO_BALLOON_F_PAGE_POISON);

I agree with Michael that has to be tied to a compatibility flag
somewhere; otherwise we'd hit problems with migration to older QEMUs of
newer guests.

> diff --git a/migration/ram.c b/migration/ram.c
> index cb1950f..d6f462c 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2186,6 +2186,16 @@ static int ram_init_all(RAMState **rsp)
>      return 0;
>  }
>  
> +void skip_free_pages_from_dirty_bitmap(RAMBlock *block, ram_addr_t offset,
> +                                       size_t len)
> +{
> +    long start = offset >> TARGET_PAGE_BITS,
> +         nr = len >> TARGET_PAGE_BITS;
> +
> +    bitmap_clear(block->bmap, start, nr);
> +    ram_state->migration_dirty_pages -= nr;
> +}
> +

I don't think this will work for postcopy; just not sending a page will
mean that the guest will block in userfault waiting for the page
(then it will request it but it wont be sent because the source already
thinks it's clean).

I think you're going to need to make the 'unused' pages be sent as zero
pages; probably by modifying save_zero_page to check a bitmap.

(I'm not sure how that will interact with hugepages)

Dave

>  /*
>   * Each of ram_save_setup, ram_save_iterate and ram_save_complete has
>   * long-running RCU critical section.  When rcu-reclaims in the code
> -- 
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

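[Editor's note] Dave's suggestion above can be sketched as follows. The names (`free_page_bmap`, `save_page`) and the counters are hypothetical illustrations, not QEMU's actual `save_zero_page()` interface:

```c
#include <assert.h>
#include <stdbool.h>

static unsigned char free_page_bmap[16];   /* hypothetical per-page hint bits */
static int zero_pages_sent;
static int data_pages_sent;

/* Sketch: instead of dropping hinted pages from the dirty bitmap
 * (which would leave a postcopy destination blocked in userfault
 * forever), still "send" them -- as explicit zero pages.  A hinted
 * page skips the byte scan; everything else is scanned as usual. */
static void save_page(long page, const unsigned char *data, long page_size)
{
    bool all_zero = free_page_bmap[page] != 0;

    if (!all_zero) {
        all_zero = true;
        for (long i = 0; i < page_size; i++) {
            if (data[i]) {
                all_zero = false;
                break;
            }
        }
    }
    if (all_zero) {
        zero_pages_sent++;      /* cheap: essentially a header on the wire */
    } else {
        data_pages_sent++;      /* full page payload */
    }
}
```

The hint thus becomes a way to avoid the zero-scan cost rather than a way to skip the page entirely, so the destination always receives something for every page.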
^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/3] virtio-balloon: add a timer to limit the free page report waiting time
  2018-02-06 11:08   ` [virtio-dev] " Wei Wang
  (?)
  (?)
@ 2018-02-09 12:15   ` Dr. David Alan Gilbert
  2018-02-26  4:35     ` Wei Wang
  -1 siblings, 1 reply; 40+ messages in thread
From: Dr. David Alan Gilbert @ 2018-02-09 12:15 UTC (permalink / raw)
  To: Wei Wang
  Cc: qemu-devel, mst, quintela, pbonzini, liliang.opensource,
	yang.zhang.wz, quan.xu0, nilal

* Wei Wang (wei.w.wang@intel.com) wrote:
> This patch adds a timer to limit the time that host waits for the free
> page hints reported by the guest. Users can specify the time in ms via
> "free-page-wait-time" command line option. If a user doesn't specify a
> time, host waits till the guest finishes reporting all the free page
> hints. The policy (wait for all the free page hints to be reported or
> use a time limit) is determined by the orchestration layer.

That's kind of a get-out; but there are several problems:
   a) With a timeout of 0 (the default) we might hang forever waiting
      for the guest; broken guests are just too common, we can't do
      that.
   b) Even if we were going to do that, you'd have to make sure that
      migrate_cancel provided a way out.
   c) How does that work during a savevm snapshot or when the guest is
      stopped?
   d) OK, the timer gives us some safety (except c); but how does the
      orchestration layer ever come up with a 'safe' value for it?
      Unless we can suggest a safe value that the orchestration layer
      can use, or a way they can work it out, then they just wont use
      it.

Dave


> 
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> CC: Michael S. Tsirkin <mst@redhat.com>
> ---
>  hw/virtio/virtio-balloon.c         | 84 +++++++++++++++++++++++++++++++++++++-
>  hw/virtio/virtio-pci.c             |  3 ++
>  include/hw/virtio/virtio-balloon.h |  4 ++
>  3 files changed, 90 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index b424d4e..9ee0de4 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -207,6 +207,65 @@ static void balloon_stats_set_poll_interval(Object *obj, Visitor *v,
>      balloon_stats_change_timer(s, 0);
>  }
>  
> +static void balloon_free_page_change_timer(VirtIOBalloon *s, int64_t ms)
> +{
> +    timer_mod(s->free_page_timer,
> +              qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + ms);
> +}
> +
> +static void balloon_stop_free_page_report(void *opaque)
> +{
> +    VirtIOBalloon *dev = opaque;
> +    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> +
> +    timer_del(dev->free_page_timer);
> +    timer_free(dev->free_page_timer);
> +    dev->free_page_timer = NULL;
> +
> +    if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) {
> +        dev->host_stop_free_page = true;
> +        virtio_notify_config(vdev);
> +    }
> +}
> +
> +static void balloon_free_page_get_wait_time(Object *obj, Visitor *v,
> +                                            const char *name, void *opaque,
> +                                            Error **errp)
> +{
> +    VirtIOBalloon *s = opaque;
> +
> +    visit_type_int(v, name, &s->free_page_wait_time, errp);
> +}
> +
> +static void balloon_free_page_set_wait_time(Object *obj, Visitor *v,
> +                                            const char *name, void *opaque,
> +                                            Error **errp)
> +{
> +    VirtIOBalloon *s = opaque;
> +    Error *local_err = NULL;
> +    int64_t value;
> +
> +    visit_type_int(v, name, &value, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +    if (value < 0) {
> +        error_setg(errp, "free page wait time must not be less than zero");
> +        return;
> +    }
> +
> +    if (value > UINT32_MAX) {
> +        error_setg(errp, "free page wait time value is too big");
> +        return;
> +    }
> +
> +    s->free_page_wait_time = value;
> +    g_assert(s->free_page_timer == NULL);
> +    s->free_page_timer = timer_new_ms(QEMU_CLOCK_REALTIME,
> +                                      balloon_stop_free_page_report, s);
> +}
> +
>  static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>  {
>      VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
> @@ -330,6 +389,7 @@ static void virtio_balloon_poll_free_page_hints(VirtIOBalloon *dev)
>              if (id == dev->free_page_report_cmd_id) {
>                  dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
>              } else {
> +                dev->host_stop_free_page = false;
>                  dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
>                  break;
>              }
> @@ -385,6 +445,10 @@ static void virtio_balloon_free_page_poll(void *opaque)
>      virtio_notify_config(vdev);
>      s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED;
>  
> +    if (s->free_page_wait_time) {
> +        balloon_free_page_change_timer(s, s->free_page_wait_time);
> +    }
> +
>      virtio_balloon_poll_free_page_hints(s);
>  }
>  
> @@ -395,7 +459,19 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
>  
>      config.num_pages = cpu_to_le32(dev->num_pages);
>      config.actual = cpu_to_le32(dev->actual);
> -    config.free_page_report_cmd_id = cpu_to_le32(dev->free_page_report_cmd_id);
> +    if (dev->host_stop_free_page) {
> +        /*
> +         * Host is actively requesting to stop the free page reporting, send
> +         * the stop sign to the guest. This happens when the migration thread
> +         * has reached the phase to send pages to the destination while the
> +         * guest hasn't done the reporting.
> +         */
> +        config.free_page_report_cmd_id =
> +                                    VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID;
> +    } else {
> +        config.free_page_report_cmd_id =
> +                                    cpu_to_le32(dev->free_page_report_cmd_id);
> +    }
>  
>      trace_virtio_balloon_get_config(config.num_pages, config.actual);
>      memcpy(config_data, &config, sizeof(struct virtio_balloon_config));
> @@ -539,6 +615,7 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
>          s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE, NULL);
>          s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
>          s->free_page_report_cmd_id = VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID;
> +        s->host_stop_free_page = false;
>      }
>      reset_stats(s);
>  }
> @@ -602,6 +679,11 @@ static void virtio_balloon_instance_init(Object *obj)
>                          balloon_stats_get_poll_interval,
>                          balloon_stats_set_poll_interval,
>                          NULL, s, NULL);
> +
> +    object_property_add(obj, "free-page-wait-time", "int",
> +                        balloon_free_page_get_wait_time,
> +                        balloon_free_page_set_wait_time,
> +                        NULL, s, NULL);
>  }
>  
>  static const VMStateDescription vmstate_virtio_balloon = {
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index 6c75cca..3345104 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -2250,6 +2250,9 @@ static void virtio_balloon_pci_instance_init(Object *obj)
>      object_property_add_alias(obj, "guest-stats-polling-interval",
>                                OBJECT(&dev->vdev),
>                                "guest-stats-polling-interval", &error_abort);
> +    object_property_add_alias(obj, "free-page-wait-time",
> +                              OBJECT(&dev->vdev),
> +                              "free-page-wait-time", &error_abort);
>  }
>  
>  static const TypeInfo virtio_balloon_pci_info = {
> diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
> index 11b4e01..c16855b 100644
> --- a/include/hw/virtio/virtio-balloon.h
> +++ b/include/hw/virtio/virtio-balloon.h
> @@ -40,6 +40,8 @@ enum virtio_balloon_free_page_report_status {
>  typedef struct VirtIOBalloon {
>      VirtIODevice parent_obj;
>      VirtQueue *ivq, *dvq, *svq, *free_page_vq;
> +    /* Host is requesting the guest to stop free page reporting */
> +    bool host_stop_free_page;
>      uint32_t free_page_report_status;
>      uint32_t num_pages;
>      uint32_t actual;
> @@ -49,8 +51,10 @@ typedef struct VirtIOBalloon {
>      VirtQueueElement *stats_vq_elem;
>      size_t stats_vq_offset;
>      QEMUTimer *stats_timer;
> +    QEMUTimer *free_page_timer;
>      int64_t stats_last_update;
>      int64_t stats_poll_interval;
> +    int64_t free_page_wait_time;
>      uint32_t host_features;
>  } VirtIOBalloon;
>  
> -- 
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/3] virtio-balloon: add a timer to limit the free page report waiting time
  2018-02-09 12:15   ` [Qemu-devel] " Dr. David Alan Gilbert
@ 2018-02-26  4:35     ` Wei Wang
  2018-02-27  0:50       ` Michael S. Tsirkin
  2018-02-27 10:34       ` Dr. David Alan Gilbert
  0 siblings, 2 replies; 40+ messages in thread
From: Wei Wang @ 2018-02-26  4:35 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, mst, quintela, pbonzini, liliang.opensource,
	yang.zhang.wz, quan.xu0, nilal

On 02/09/2018 08:15 PM, Dr. David Alan Gilbert wrote:
> * Wei Wang (wei.w.wang@intel.com) wrote:
>> This patch adds a timer to limit the time that host waits for the free
>> page hints reported by the guest. Users can specify the time in ms via
>> "free-page-wait-time" command line option. If a user doesn't specify a
>> time, host waits till the guest finishes reporting all the free page
>> hints. The policy (wait for all the free page hints to be reported or
>> use a time limit) is determined by the orchestration layer.
> That's kind of a get-out; but there are at least a few problems:
>     a) With a timeout of 0 (the default) we might hang forever waiting
>        for the guest; broken guests are just too common, we can't do
>        that.
>     b) Even if we were going to do that, you'd have to make sure that
>        migrate_cancel provided a way out.
>     c) How does that work during a savevm snapshot or when the guest is
>        stopped?
>     d) OK, the timer gives us some safety (except c); but how does the
>        orchestration layer ever come up with a 'safe' value for it?
>        Unless we can suggest a safe value that the orchestration layer
>        can use, or a way they can work it out, then they just won't use
>        it.
>

Hi Dave,

Sorry for my late response. Please see below:

a) I think people would just kill the guest if it is broken. We can also 
change the default timeout value, for example 1 second, which is enough 
for the free page reporting.

b) How about changing it this way: if a timeout happens, the host sends a 
stop command to the guest and makes virtio_balloon_poll_free_page_hints() 
"return" immediately (without waiting for the guest's acknowledgement). The 
"return" basically goes back to the migration_thread function:
while (s->state == MIGRATION_STATUS_ACTIVE ||
            s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
...
}

migration_cancel sets the state to MIGRATION_CANCELLING, so it will stop 
the migration process.

c) This optimization needs the guest to report. If the guest is stopped, 
it wouldn't work. How about adding a check of "RUN_STATE" before going 
into the optimization?

d) Yes. Normally it is faster to wait for the guest to report all the 
free pages. Probably, we can just hardcode a value (e.g. 1s) for now 
(instead of making it configurable by users); this would handle the case 
where the guest is broken. What do you think?

Best,
Wei

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/3] virtio-balloon: free page hint reporting support
  2018-02-09 10:53     ` [Qemu-devel] " Dr. David Alan Gilbert
@ 2018-02-26  4:42         ` Wei Wang
  0 siblings, 0 replies; 40+ messages in thread
From: Wei Wang @ 2018-02-26  4:42 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, virtio-dev, mst, quintela, pbonzini,
	liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel

On 02/09/2018 06:53 PM, Dr. David Alan Gilbert wrote:
> * Wei Wang (wei.w.wang@intel.com) wrote:
>> On 02/09/2018 04:15 AM, Dr. David Alan Gilbert wrote:
>>> * Wei Wang (wei.w.wang@intel.com) wrote:
>>>> This is the device part implementation to add a new feature,
>>>> VIRTIO_BALLOON_F_FREE_PAGE_HINT to the virtio-balloon device. The device
>>>> receives the guest free page hints from the driver and clears the
>>>> corresponding bits in the dirty bitmap, so that those free pages are
>>>> not transferred by the migration thread to the destination.
>>>>
>>>> Please see the driver patch link for test results:
>>>> https://lkml.org/lkml/2018/2/4/60
>>> Hi Wei,
>>>      I'll look at the code a bit more - but first some more basic
>>> questions on that lkml post:
>>>
>>>       a) The idle guest time thing is a nice result; can you just state
>>>          what the host was, speed of connection, and what other options
>>>          you were using?
>>>
>>>       b) The workload test, the one with the kernel compile; you list
>>>          the kernel compile time but don't mention any changes in the
>>>          migration times of the ping-pong; can you give those times as
>>>          well?
>>>
>>>       c) What's your real workload that this is aimed at?
>>>          Is it really for people migrating idle VMs - or do you have some
>>>          NFV application in mind, if so why not include a figure for
>>>          those?
>>>
>> Hi Dave,
>>
>> Thanks for joining the review. Please see below info.
>>
>> a) Environment info
>>      - Host:
>>          - Physical CPU: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
>>          - kernel: 3.10.0
>>
>>      - Guest:
>>          - kernel: 4.15.0
>>          - QEMU setup: -cpu host -M pc -smp 4,threads=1,sockets=1 -m 8G
>> --mem-prealloc -realtime mlock=on -balloon virtio,free-page-hint=true
>>
>>      - Migration setup:
>>          - migrate_set_speed 0
>>          - migrate_set_downtime 0.01  (10ms)
> That's an unusually low downtime (and I'm not sure what setting the
> speed to 0 does!).

For idle guest tests, I used 0.01s downtime. If we run workloads, we can 
change it to 2s. Just make sure we set the same downtime for both the 
legacy and optimized cases so that we have an apples-to-apples comparison.

Setting the speed to 0 means using the largest bandwidth; here it is 
effectively the same as setting it to 100G.

>
>> b) Michael asked the same question on the kernel patches, I'll reply there
>> with you cc-ed, so that kernel maintainers can also see it. Btw, do you have
>> any other workloads you would suggest to have a try?
> No, not really; I guess it's best for VMs that are either idle or have
> lots of spare RAM.

Yes, less free memory results in less improvement (please see the 
additional results with the Linux compilation workload shared on LKML).


Best,
Wei

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/3] migration: use the free page reporting feature from balloon
  2018-02-09 11:50   ` [Qemu-devel] " Dr. David Alan Gilbert
@ 2018-02-26  5:07     ` Wei Wang
  2018-02-26  9:22       ` Wang, Wei W
  0 siblings, 1 reply; 40+ messages in thread
From: Wei Wang @ 2018-02-26  5:07 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, mst, quintela, pbonzini, liliang.opensource,
	yang.zhang.wz, quan.xu0, nilal

On 02/09/2018 07:50 PM, Dr. David Alan Gilbert wrote:
> * Wei Wang (wei.w.wang@intel.com) wrote:
>> Use the free page reporting feature from the balloon device to clear the
>> bits corresponding to guest free pages from the dirty bitmap, so that the
>> free memory are not sent.
>>
>> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
>> CC: Michael S. Tsirkin <mst@redhat.com>
>> CC: Juan Quintela <quintela@redhat.com>
>> ---
>>   migration/ram.c | 24 ++++++++++++++++++++----
>>   1 file changed, 20 insertions(+), 4 deletions(-)
>>
>> diff --git a/migration/ram.c b/migration/ram.c
>> index d6f462c..4fe16d2 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -49,6 +49,7 @@
>>   #include "qemu/rcu_queue.h"
>>   #include "migration/colo.h"
>>   #include "migration/block.h"
>> +#include "sysemu/balloon.h"
>>   
>>   /***********************************************************/
>>   /* ram save/restore */
>> @@ -206,6 +207,10 @@ struct RAMState {
>>       uint32_t last_version;
>>       /* We are in the first round */
>>       bool ram_bulk_stage;
>> +    /* The feature, skipping the transfer of free pages, is supported */
>> +    bool free_page_support;
>> +    /* Skip the transfer of free pages in the bulk stage */
>> +    bool free_page_done;
>>       /* How many times we have dirty too many pages */
>>       int dirty_rate_high_cnt;
>>       /* these variables are used for bitmap sync */
>> @@ -773,7 +778,7 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
>>       unsigned long *bitmap = rb->bmap;
>>       unsigned long next;
>>   
>> -    if (rs->ram_bulk_stage && start > 0) {
>> +    if (rs->ram_bulk_stage && start > 0 && !rs->free_page_support) {
>>           next = start + 1;
>>       } else {
>>           next = find_next_bit(bitmap, size, start);
>> @@ -1653,6 +1658,8 @@ static void ram_state_reset(RAMState *rs)
>>       rs->last_page = 0;
>>       rs->last_version = ram_list.version;
>>       rs->ram_bulk_stage = true;
>> +    rs->free_page_support = balloon_free_page_support();
>> +    rs->free_page_done = false;
>>   }
>>   
>>   #define MAX_WAIT 50 /* ms, half buffered_file limit */
>> @@ -2135,7 +2142,7 @@ static int ram_state_init(RAMState **rsp)
>>       return 0;
>>   }
>>   
>> -static void ram_list_init_bitmaps(void)
>> +static void ram_list_init_bitmaps(RAMState *rs)
>>   {
>>       RAMBlock *block;
>>       unsigned long pages;
>> @@ -2145,7 +2152,11 @@ static void ram_list_init_bitmaps(void)
>>           QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>>               pages = block->max_length >> TARGET_PAGE_BITS;
>>               block->bmap = bitmap_new(pages);
>> -            bitmap_set(block->bmap, 0, pages);
>> +            if (rs->free_page_support) {
>> +                bitmap_set(block->bmap, 1, pages);
> I don't understand how it makes sense to do that here;
> ignoring anything ese it means that migration_dirty_pages is wrong
> which could end up with migration finishing before all real pages are
> sent.
>

The bulk stage treats all pages as dirty, so we set all the bits to "1". 
This is needed by the optimization, because the free pages reported by 
the guest can then be cleared directly from the bitmap (we don't need 
any extra bitmap to record free pages).

Why is migration_dirty_pages incorrect?

Best,
Wei

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/3] migration: use the free page reporting feature from balloon
  2018-02-26  5:07     ` Wei Wang
@ 2018-02-26  9:22       ` Wang, Wei W
  0 siblings, 0 replies; 40+ messages in thread
From: Wang, Wei W @ 2018-02-26  9:22 UTC (permalink / raw)
  To: Wang, Wei W, Dr. David Alan Gilbert
  Cc: yang.zhang.wz, quan.xu0, quintela, liliang.opensource, mst,
	qemu-devel, pbonzini, nilal

On Monday, February 26, 2018 1:07 PM, Wei Wang wrote:
> On 02/09/2018 07:50 PM, Dr. David Alan Gilbert wrote:
> > * Wei Wang (wei.w.wang@intel.com) wrote:
> >> Use the free page reporting feature from the balloon device to clear
> >> the bits corresponding to guest free pages from the dirty bitmap, so
> >> that the free memory are not sent.
> >>
> >> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> >> CC: Michael S. Tsirkin <mst@redhat.com>
> >> CC: Juan Quintela <quintela@redhat.com>
> >> ---
> >>   migration/ram.c | 24 ++++++++++++++++++++----
> >>   1 file changed, 20 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/migration/ram.c b/migration/ram.c index d6f462c..4fe16d2
> >> 100644
> >> --- a/migration/ram.c
> >> +++ b/migration/ram.c
> >> @@ -49,6 +49,7 @@
> >>   #include "qemu/rcu_queue.h"
> >>   #include "migration/colo.h"
> >>   #include "migration/block.h"
> >> +#include "sysemu/balloon.h"
> >>
> >>   /***********************************************************/
> >>   /* ram save/restore */
> >> @@ -206,6 +207,10 @@ struct RAMState {
> >>       uint32_t last_version;
> >>       /* We are in the first round */
> >>       bool ram_bulk_stage;
> >> +    /* The feature, skipping the transfer of free pages, is supported */
> >> +    bool free_page_support;
> >> +    /* Skip the transfer of free pages in the bulk stage */
> >> +    bool free_page_done;
> >>       /* How many times we have dirty too many pages */
> >>       int dirty_rate_high_cnt;
> >>       /* these variables are used for bitmap sync */ @@ -773,7 +778,7
> >> @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock
> *rb,
> >>       unsigned long *bitmap = rb->bmap;
> >>       unsigned long next;
> >>
> >> -    if (rs->ram_bulk_stage && start > 0) {
> >> +    if (rs->ram_bulk_stage && start > 0 && !rs->free_page_support) {
> >>           next = start + 1;
> >>       } else {
> >>           next = find_next_bit(bitmap, size, start); @@ -1653,6
> >> +1658,8 @@ static void ram_state_reset(RAMState *rs)
> >>       rs->last_page = 0;
> >>       rs->last_version = ram_list.version;
> >>       rs->ram_bulk_stage = true;
> >> +    rs->free_page_support = balloon_free_page_support();
> >> +    rs->free_page_done = false;
> >>   }
> >>
> >>   #define MAX_WAIT 50 /* ms, half buffered_file limit */ @@ -2135,7
> >> +2142,7 @@ static int ram_state_init(RAMState **rsp)
> >>       return 0;
> >>   }
> >>
> >> -static void ram_list_init_bitmaps(void)
> >> +static void ram_list_init_bitmaps(RAMState *rs)
> >>   {
> >>       RAMBlock *block;
> >>       unsigned long pages;
> >> @@ -2145,7 +2152,11 @@ static void ram_list_init_bitmaps(void)
> >>           QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> >>               pages = block->max_length >> TARGET_PAGE_BITS;
> >>               block->bmap = bitmap_new(pages);
> >> -            bitmap_set(block->bmap, 0, pages);
> >> +            if (rs->free_page_support) {
> >> +                bitmap_set(block->bmap, 1, pages);
> > I don't understand how it makes sense to do that here; ignoring
> > anything ese it means that migration_dirty_pages is wrong which could
> > end up with migration finishing before all real pages are sent.
> >
> 
> The bulk stage treats all pages as dirty, so we set all the bits to "1".
> This is needed by the optimization, because the free pages reported by the
> guest can then be cleared directly from the bitmap (we don't need any extra
> bitmap to record free pages).
> 

Sorry, there was a misunderstanding of the bitmap_set API (I thought it 
was used to set all the bits to 1 or 0). So the above change isn't 
actually needed. Btw, this doesn't affect the results I reported.

Best,
Wei
 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/3] virtio-balloon: add a timer to limit the free page report waiting time
  2018-02-26  4:35     ` Wei Wang
@ 2018-02-27  0:50       ` Michael S. Tsirkin
  2018-02-27 10:10         ` Wei Wang
  2018-02-27 10:34       ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 40+ messages in thread
From: Michael S. Tsirkin @ 2018-02-27  0:50 UTC (permalink / raw)
  To: Wei Wang
  Cc: Dr. David Alan Gilbert, qemu-devel, quintela, pbonzini,
	liliang.opensource, yang.zhang.wz, quan.xu0, nilal

On Mon, Feb 26, 2018 at 12:35:31PM +0800, Wei Wang wrote:
> On 02/09/2018 08:15 PM, Dr. David Alan Gilbert wrote:
> > * Wei Wang (wei.w.wang@intel.com) wrote:
> > > This patch adds a timer to limit the time that host waits for the free
> > > page hints reported by the guest. Users can specify the time in ms via
> > > "free-page-wait-time" command line option. If a user doesn't specify a
> > > time, host waits till the guest finishes reporting all the free page
> > > hints. The policy (wait for all the free page hints to be reported or
> > > use a time limit) is determined by the orchestration layer.
> > That's kind of a get-out; but there are at least a few problems:
> >     a) With a timeout of 0 (the default) we might hang forever waiting
> >        for the guest; broken guests are just too common, we can't do
> >        that.
> >     b) Even if we were going to do that, you'd have to make sure that
> >        migrate_cancel provided a way out.
> >     c) How does that work during a savevm snapshot or when the guest is
> >        stopped?
> >     d) OK, the timer gives us some safety (except c); but how does the
> >        orchestration layer ever come up with a 'safe' value for it?
> >        Unless we can suggest a safe value that the orchestration layer
> >        can use, or a way they can work it out, then they just won't use
> >        it.
> > 
> 
> Hi Dave,
> 
> Sorry for my late response. Please see below:
> 
> a) I think people would just kill the guest if it is broken.

There's no way to know whether it's broken.

> We can also
> change the default timeout value, for example 1 second, which is enough for
> the free page reporting.

There's no way to know whether it's enough.

> b) How about changing it this way: if timeout happens, host sends a stop
> command to the guest, and makes virtio_balloon_poll_free_page_hints()
> "return" immediately (without getting the guest's acknowledge). The "return"
> basically goes back to the migration_thread function:
> while (s->state == MIGRATION_STATUS_ACTIVE ||
>            s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
> ...
> }
> 
> migration_cancel sets the state to MIGRATION_CANCELLING, so it will stop the
> migration process.
> 
> c) This optimization needs the guest to report. If the guest is stopped, it
> wouldn't work. How about adding a check of "RUN_STATE" before going into the
> optimization?
> 
> d) Yes. Normally it is faster to wait for the guest to report all the free
> pages. Probably, we can just hardcode a value (e.g. 1s) for now (instead of
> making it configurable by users), this is used to handle the case that the
> guest is broken. What would you think?
> 
> Best,
> Wei

I think all this is premature optimization. It is not at all clear that
anything is gained by delaying migration. Just ask for hints and start
sending pages immediately.  If guest tells us a page is free before it's
sent, we can skip sending it.  OTOH if migration is taking less time to
complete than it takes for guest to respond, then we are better off just
ignoring the hint.

-- 
MST

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/3] virtio-balloon: add a timer to limit the free page report waiting time
  2018-02-27  0:50       ` Michael S. Tsirkin
@ 2018-02-27 10:10         ` Wei Wang
  2018-02-27 13:08           ` Liang Li
  0 siblings, 1 reply; 40+ messages in thread
From: Wei Wang @ 2018-02-27 10:10 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Dr. David Alan Gilbert, qemu-devel, quintela, pbonzini,
	liliang.opensource, yang.zhang.wz, quan.xu0, nilal

On 02/27/2018 08:50 AM, Michael S. Tsirkin wrote:
> On Mon, Feb 26, 2018 at 12:35:31PM +0800, Wei Wang wrote:
>> On 02/09/2018 08:15 PM, Dr. David Alan Gilbert wrote:
>>> * Wei Wang (wei.w.wang@intel.com) wrote:
>
> I think all this is premature optimization. It is not at all clear that
> anything is gained by delaying migration. Just ask for hints and start
> sending pages immediately.  If guest tells us a page is free before it's
> sent, we can skip sending it.  OTOH if migration is taking less time to
> complete than it takes for guest to respond, then we are better off just
> ignoring the hint.

OK, I'll try to create a thread for the free page optimization. We 
create the thread to poll for free pages at the beginning of the bulk 
stage and stop it at the end of the bulk stage.
There are also comments about postcopy support for this feature; I plan 
to leave that as a second step (that support doesn't seem urgent for now).


Best,
Wei

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/3] virtio-balloon: add a timer to limit the free page report waiting time
  2018-02-26  4:35     ` Wei Wang
  2018-02-27  0:50       ` Michael S. Tsirkin
@ 2018-02-27 10:34       ` Dr. David Alan Gilbert
  2018-02-28 10:37         ` Wei Wang
  1 sibling, 1 reply; 40+ messages in thread
From: Dr. David Alan Gilbert @ 2018-02-27 10:34 UTC (permalink / raw)
  To: Wei Wang
  Cc: qemu-devel, mst, quintela, pbonzini, liliang.opensource,
	yang.zhang.wz, quan.xu0, nilal

* Wei Wang (wei.w.wang@intel.com) wrote:
> On 02/09/2018 08:15 PM, Dr. David Alan Gilbert wrote:
> > * Wei Wang (wei.w.wang@intel.com) wrote:
> > > This patch adds a timer to limit the time that host waits for the free
> > > page hints reported by the guest. Users can specify the time in ms via
> > > "free-page-wait-time" command line option. If a user doesn't specify a
> > > time, host waits till the guest finishes reporting all the free page
> > > hints. The policy (wait for all the free page hints to be reported or
> > > use a time limit) is determined by the orchestration layer.
> > That's kind of a get-out; but there are at least a few problems:
> >     a) With a timeout of 0 (the default) we might hang forever waiting
> >        for the guest; broken guests are just too common, we can't do
> >        that.
> >     b) Even if we were going to do that, you'd have to make sure that
> >        migrate_cancel provided a way out.
> >     c) How does that work during a savevm snapshot or when the guest is
> >        stopped?
> >     d) OK, the timer gives us some safety (except c); but how does the
> >        orchestration layer ever come up with a 'safe' value for it?
> >        Unless we can suggest a safe value that the orchestration layer
> >        can use, or a way they can work it out, then they just won't use
> >        it.
> > 
> 
> Hi Dave,
> 
> Sorry for my late response. Please see below:
> 
> a) I think people would just kill the guest if it is broken. We can also
> change the default timeout value, for example 1 second, which is enough for
> the free page reporting.

Remember that many VMs are automatically migrated without there being a
human involved; those VMs might be in the BIOS or Grub or shutting down at
the time of migration; there's no human to look at the VM.

> b) How about changing it this way: if timeout happens, host sends a stop
> command to the guest, and makes virtio_balloon_poll_free_page_hints()
> "return" immediately (without getting the guest's acknowledgement). The "return"
> basically goes back to the migration_thread function:
> while (s->state == MIGRATION_STATUS_ACTIVE ||
>            s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
> ...
> }
> 
> migration_cancel sets the state to MIGRATION_CANCELLING, so it will stop the
> migration process.

OK, but that does rely on there being a timeout; it means you can't have
the default no-timeout because then you can't cancel.

> c) This optimization needs the guest to report. If the guest is stopped, it
> wouldn't work. How about adding a check of "RUN_STATE" before going into the
> optimization?

Yes, that's OK.

> d) Yes. Normally it is faster to wait for the guest to report all the free
> pages. Probably, we can just hardcode a value (e.g. 1s) for now (instead of
> making it configurable by users); this handles the case where the guest is
> broken. What do you think?

The issue is not about configurability - the issue is that it's
hard/impossible to find a good value for the timeout.

Dave

> Best,
> Wei
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/3] virtio-balloon: add a timer to limit the free page report waiting time
  2018-02-27 10:10         ` Wei Wang
@ 2018-02-27 13:08           ` Liang Li
  2018-02-28 10:33             ` Wei Wang
  0 siblings, 1 reply; 40+ messages in thread
From: Liang Li @ 2018-02-27 13:08 UTC (permalink / raw)
  To: Wei Wang
  Cc: Dr. David Alan Gilbert, qemu-devel, quintela, pbonzini,
	liliang.opensource, yang.zhang.wz, quan.xu0, nilal, mst

On Tue, Feb 27, 2018 at 06:10:47PM +0800, Wei Wang wrote:
> On 02/27/2018 08:50 AM, Michael S. Tsirkin wrote:
> > On Mon, Feb 26, 2018 at 12:35:31PM +0800, Wei Wang wrote:
> > > On 02/09/2018 08:15 PM, Dr. David Alan Gilbert wrote:
> > > > * Wei Wang (wei.w.wang@intel.com) wrote:
> > 
> > I think all this is premature optimization. It is not at all clear that
> > anything is gained by delaying migration. Just ask for hints and start
> > sending pages immediately.  If guest tells us a page is free before it's
> > sent, we can skip sending it.  OTOH if migration is taking less time to
> > complete than it takes for guest to respond, then we are better off just
> > ignoring the hint.
> 
> OK, I'll try to create a thread for the free page optimization. We create
> the thread to poll for free pages at the beginning of the bulk stage, and
> stops at the end of bulk stage.
> There are also comments about postcopy support with this feature, I plan to
> leave that as the second step (that support seems not urgent for now).
> 
> 
> Best,
> Wei

You can make use of the current migration thread instead of creating a new one.

Liang

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/3] virtio-balloon: add a timer to limit the free page report waiting time
  2018-02-27 13:08           ` Liang Li
@ 2018-02-28 10:33             ` Wei Wang
  0 siblings, 0 replies; 40+ messages in thread
From: Wei Wang @ 2018-02-28 10:33 UTC (permalink / raw)
  To: Liang Li
  Cc: Dr. David Alan Gilbert, qemu-devel, quintela, pbonzini,
	yang.zhang.wz, quan.xu0, nilal, mst

On 02/27/2018 09:08 PM, Liang Li wrote:
> On Tue, Feb 27, 2018 at 06:10:47PM +0800, Wei Wang wrote:
>> On 02/27/2018 08:50 AM, Michael S. Tsirkin wrote:
>>> On Mon, Feb 26, 2018 at 12:35:31PM +0800, Wei Wang wrote:
>>>> On 02/09/2018 08:15 PM, Dr. David Alan Gilbert wrote:
>>>>> * Wei Wang (wei.w.wang@intel.com) wrote:
>>> I think all this is premature optimization. It is not at all clear that
>>> anything is gained by delaying migration. Just ask for hints and start
>>> sending pages immediately.  If guest tells us a page is free before it's
>>> sent, we can skip sending it.  OTOH if migration is taking less time to
>>> complete than it takes for guest to respond, then we are better off just
>>> ignoring the hint.
>> OK, I'll try to create a thread for the free page optimization. We create
>> the thread to poll for free pages at the beginning of the bulk stage, and
>> stops at the end of bulk stage.
>> There are also comments about postcopy support with this feature, I plan to
>> leave that as the second step (that support seems not urgent for now).
>>
>>
>> Best,
>> Wei
> you can make use the current migration thread instead of creating a new one.
>

This is what this version does - the optimization is implemented as 
part of the migration thread. To make the optimization run in parallel 
with the migration thread, we need a separate thread for the 
optimization.

Best,
Wei

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/3] virtio-balloon: add a timer to limit the free page report waiting time
  2018-02-27 10:34       ` Dr. David Alan Gilbert
@ 2018-02-28 10:37         ` Wei Wang
  0 siblings, 0 replies; 40+ messages in thread
From: Wei Wang @ 2018-02-28 10:37 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, mst, quintela, pbonzini, liliang.opensource,
	yang.zhang.wz, quan.xu0, nilal

On 02/27/2018 06:34 PM, Dr. David Alan Gilbert wrote:
> * Wei Wang (wei.w.wang@intel.com) wrote:
>> On 02/09/2018 08:15 PM, Dr. David Alan Gilbert wrote:
>>> * Wei Wang (wei.w.wang@intel.com) wrote:
>>>> This patch adds a timer to limit the time that host waits for the free
>>>> page hints reported by the guest. Users can specify the time in ms via
>>>> "free-page-wait-time" command line option. If a user doesn't specify a
>>>> time, host waits till the guest finishes reporting all the free page
>>>> hints. The policy (wait for all the free page hints to be reported or
>>>> use a time limit) is determined by the orchestration layer.
>>> That's kind of a get-out; but there's at least two problems:
>>>      a) With a timeout of 0 (the default) we might hang forever waiting
>>>         for the guest; broken guests are just too common, we can't do
>>>         that.
>>>      b) Even if we were going to do that, you'd have to make sure that
>>>         migrate_cancel provided a way out.
>>>      c) How does that work during a savevm snapshot or when the guest is
>>>         stopped?
>>>      d) OK, the timer gives us some safety (except c); but how does the
>>>         orchestration layer ever come up with a 'safe' value for it?
>>>         Unless we can suggest a safe value that the orchestration layer
>>>         can use, or a way they can work it out, then they just wont use
>>>         it.
>>>
>> Hi Dave,
>>
>> Sorry for my late response. Please see below:
>>
>> a) I think people would just kill the guest if it is broken. We can also
>> change the default timeout value, for example 1 second, which is enough for
>> the free page reporting.
> Remember that many VMs are automatically migrated without their being a
> human involved; those VMs might be in the BIOS or Grub or shutting down at
> the time of migration; there's no human to look at the VM.
>

OK, thanks for sharing. I plan to take Michael's suggestion to make 
the optimization run in parallel with the migration thread. The 
optimization will be in its own thread, and the migration thread runs as 
usual (not stalled by the optimization, e.g. when the optimization part 
doesn't return promptly).

Best,
Wei

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/3] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
  2018-02-07  1:04     ` [virtio-dev] " Michael S. Tsirkin
@ 2018-03-02  9:32       ` Wei Wang
  -1 siblings, 0 replies; 40+ messages in thread
From: Wei Wang @ 2018-03-02  9:32 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu-devel, virtio-dev, quintela, dgilbert, pbonzini,
	liliang.opensource, yang.zhang.wz, quan.xu0, nilal, riel

On 02/07/2018 09:04 AM, Michael S. Tsirkin wrote:
> On Tue, Feb 06, 2018 at 07:08:17PM +0800, Wei Wang wrote:
>> The new feature enables the virtio-balloon device to receive the hint of
>> guest free pages from the free page vq, and clears the corresponding bits
>> of the free page from the dirty bitmap, so that those free pages are not
>> transferred by the migration thread.
>>
>> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
>> Signed-off-by: Liang Li <liang.z.li@intel.com>
>> CC: Michael S. Tsirkin <mst@redhat.com>
>> CC: Juan Quintela <quintela@redhat.com>
>> ---
>>   balloon.c                                       |  39 +++++--
>>   hw/virtio/virtio-balloon.c                      | 145 +++++++++++++++++++++---
>>   include/hw/virtio/virtio-balloon.h              |  11 +-
>>   include/migration/misc.h                        |   3 +
>>   include/standard-headers/linux/virtio_balloon.h |   7 ++
>>   include/sysemu/balloon.h                        |  12 +-
>>   migration/ram.c                                 |  10 ++
>>   7 files changed, 198 insertions(+), 29 deletions(-)
>>
>> diff --git a/balloon.c b/balloon.c
>> index 1d720ff..0f0b30c 100644
>> --- a/balloon.c
>> +++ b/balloon.c
>> @@ -36,6 +36,8 @@
>>   
>>   static QEMUBalloonEvent *balloon_event_fn;
>>   static QEMUBalloonStatus *balloon_stat_fn;
>> +static QEMUBalloonFreePageSupport *balloon_free_page_support_fn;
>> +static QEMUBalloonFreePagePoll *balloon_free_page_poll_fn;
>>   static void *balloon_opaque;
>>   static bool balloon_inhibited;
>>   
>> @@ -64,19 +66,34 @@ static bool have_balloon(Error **errp)
>>       return true;
>>   }
>>   
>> -int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
>> -                             QEMUBalloonStatus *stat_func, void *opaque)
>> +bool balloon_free_page_support(void)
>>   {
>> -    if (balloon_event_fn || balloon_stat_fn || balloon_opaque) {
>> -        /* We're already registered one balloon handler.  How many can
>> -         * a guest really have?
>> -         */
>> -        return -1;
>> +    return balloon_free_page_support_fn &&
>> +           balloon_free_page_support_fn(balloon_opaque);
>> +}
>> +
>> +void balloon_free_page_poll(void)
>> +{
>> +    balloon_free_page_poll_fn(balloon_opaque);
>> +}
>> +
>> +void qemu_add_balloon_handler(QEMUBalloonEvent *event_fn,
>> +                              QEMUBalloonStatus *stat_fn,
>> +                              QEMUBalloonFreePageSupport *free_page_support_fn,
>> +                              QEMUBalloonFreePagePoll *free_page_poll_fn,
>> +                              void *opaque)
>> +{
>> +    if (balloon_event_fn || balloon_stat_fn || balloon_free_page_support_fn ||
>> +        balloon_free_page_poll_fn || balloon_opaque) {
>> +        /* We already registered one balloon handler. */
>> +        return;
>>       }
>> -    balloon_event_fn = event_func;
>> -    balloon_stat_fn = stat_func;
>> +
>> +    balloon_event_fn = event_fn;
>> +    balloon_stat_fn = stat_fn;
>> +    balloon_free_page_support_fn = free_page_support_fn;
>> +    balloon_free_page_poll_fn = free_page_poll_fn;
>>       balloon_opaque = opaque;
>> -    return 0;
>>   }
>>   
>>   void qemu_remove_balloon_handler(void *opaque)
>> @@ -86,6 +103,8 @@ void qemu_remove_balloon_handler(void *opaque)
>>       }
>>       balloon_event_fn = NULL;
>>       balloon_stat_fn = NULL;
>> +    balloon_free_page_support_fn = NULL;
>> +    balloon_free_page_poll_fn = NULL;
>>       balloon_opaque = NULL;
>>   }
>>   
>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
>> index 14e08d2..b424d4e 100644
>> --- a/hw/virtio/virtio-balloon.c
>> +++ b/hw/virtio/virtio-balloon.c
>> @@ -23,6 +23,7 @@
>>   #include "hw/virtio/virtio-balloon.h"
>>   #include "sysemu/kvm.h"
>>   #include "exec/address-spaces.h"
>> +#include "exec/ram_addr.h"
>>   #include "qapi/visitor.h"
>>   #include "qapi-event.h"
>>   #include "trace.h"
>> @@ -30,6 +31,7 @@
>>   
>>   #include "hw/virtio/virtio-bus.h"
>>   #include "hw/virtio/virtio-access.h"
>> +#include "migration/misc.h"
>>   
>>   #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
>>   
>> @@ -305,6 +307,87 @@ out:
>>       }
>>   }
>>   
>> +static void virtio_balloon_poll_free_page_hints(VirtIOBalloon *dev)
>> +{
>> +    VirtQueueElement *elem;
>> +    VirtQueue *vq = dev->free_page_vq;
>> +    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>> +    bool page_poisoning = virtio_vdev_has_feature(vdev,
>> +                                              VIRTIO_BALLOON_F_PAGE_POISON);
>> +    uint32_t id;
>> +
>> +    /* Poll the vq till a stop cmd id is received */
>> +    while (dev->free_page_report_status != FREE_PAGE_REPORT_S_STOP) {
>> +        elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
>> +        if (!elem) {
>> +            continue;
>> +        }
>> +
>> +        if (elem->out_num) {
>> +            iov_to_buf(elem->out_sg, elem->out_num, 0, &id, sizeof(uint32_t));
>> +            virtqueue_push(vq, elem, sizeof(id));
>> +            g_free(elem);
>> +            if (id == dev->free_page_report_cmd_id) {
>> +                dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
>> +            } else {
>> +                dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
>
> So if migration on source times out, then you migrate to
> another destination, this will cancel the in-progress reporting
> due to while loop above.  Probably not what was intended.
>
>> +                break;
>> +            }
>> +        }
>> +
>> +        if (elem->in_num) {
>> +            RAMBlock *block;
>> +            ram_addr_t offset;
>> +            void *base;
>> +            size_t total_len, len;
>> +
>> +            if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START &&
>> +                !page_poisoning) {
> So when poisoning is enabled, you can't skip the pages?
>
> I suspect if poison is 0 you actually can.
>
>
>
>> +                base = elem->in_sg[0].iov_base;
>> +                total_len = elem->in_sg[0].iov_len;
>> +                len = total_len;
> the below list looks like it'd be a better API.
> How about an API that just gives hints to qemu?
> qemu_guest_page_free_hint(start, len)?
>
>> +                while (total_len > 0) {
>> +                    block = qemu_ram_block_from_host(base, false, &offset);
>> +                    if (unlikely(offset + total_len > block->used_length)) {
>> +                        len = block->used_length - offset;
>> +                        base += len;
>> +                    }
>> +                    skip_free_pages_from_dirty_bitmap(block, offset, len);
> For post-copy migration, this does not look like it will DTRT.
>
> Doesn't look like it will DTRT for tcg either.
>
> And generally, this only works as long as you don't call log_sync.
>
> Once you call log_sync, you can't rely on old hints, and you must increment
> command id.
>
> which isn't a good API, in my opinion.
>
>> +                    total_len -= len;
>> +                }
>> +            }
>> +            virtqueue_push(vq, elem, total_len);
>> +            g_free(elem);
>> +        }
>> +    }
>> +}
>> +
>> +static bool virtio_balloon_free_page_support(void *opaque)
>> +{
>> +    VirtIOBalloon *s = opaque;
>> +    VirtIODevice *vdev = VIRTIO_DEVICE(s);
>> +
>> +    return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT);
>> +}
>> +
>> +static void virtio_balloon_free_page_poll(void *opaque)
>> +{
>> +    VirtIOBalloon *s = opaque;
>> +    VirtIODevice *vdev = VIRTIO_DEVICE(s);
>> +
>> +    if (unlikely(s->free_page_report_cmd_id == UINT_MAX)) {
>> +        s->free_page_report_cmd_id = 1;
>> +    } else {
>> +        s->free_page_report_cmd_id++;
>> +    }
> I think it is prudent to reserve a range of command IDs that we don't use.
> These can then be used later for more commands.
>
> E.g. how about VIRTIO_BALLOON_FREE_PAGE_REPORT_MIN_ID 0x80000000
> and let the ID go between 0x80000000 and UINT_MAX?
>
> No guest changes needed, but patching uapi in linux to add the enum
> might not be a bad idea.
>
>
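A minimal sketch of the increment with such a reserved range. Only the
0x80000000 lower bound comes from the suggestion above; the macro and
helper names here are made up for illustration and are not part of the
patch or the virtio uapi.

```c
#include <stdint.h>

/* Hypothetical: report command ids live in [0x80000000, UINT32_MAX];
 * values below that stay reserved for future commands (the stop id 0
 * among them). */
#define FREE_PAGE_REPORT_MIN_ID 0x80000000u

static uint32_t next_report_cmd_id(uint32_t id)
{
    /* Wrap back to the bottom of the reserved range rather than into
     * the command-id space below it. */
    if (id < FREE_PAGE_REPORT_MIN_ID || id == UINT32_MAX) {
        return FREE_PAGE_REPORT_MIN_ID;
    }
    return id + 1;
}
```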
>> +
>> +    virtio_notify_config(vdev);
>> +    s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED;
>> +
>> +    virtio_balloon_poll_free_page_hints(s);
>> +}
>> +
>>   static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
>>   {
>>       VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
>> @@ -312,6 +395,7 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
>>   
>>       config.num_pages = cpu_to_le32(dev->num_pages);
>>       config.actual = cpu_to_le32(dev->actual);
>> +    config.free_page_report_cmd_id = cpu_to_le32(dev->free_page_report_cmd_id);
>>   
>>       trace_virtio_balloon_get_config(config.num_pages, config.actual);
>>       memcpy(config_data, &config, sizeof(struct virtio_balloon_config));
> You need to set config.poison_val as well.
>
>> @@ -365,6 +449,7 @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
>>                           ((ram_addr_t) dev->actual << VIRTIO_BALLOON_PFN_SHIFT),
>>                           &error_abort);
>>       }
>> +    dev->poison_val = le32_to_cpu(config.poison_val);
> This will have to be migrated.
>
>
>>       trace_virtio_balloon_set_config(dev->actual, oldactual);
>>   }
>>   
>> @@ -374,6 +459,7 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
>>       VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
>>       f |= dev->host_features;
>>       virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ);
>> +    virtio_add_feature(&f, VIRTIO_BALLOON_F_PAGE_POISON);
> Needs a compat property to avoid breaking cross-version migration.
>
>>       return f;
>>   }
>>   
>> @@ -410,6 +496,17 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
>>       return 0;
>>   }
>>   
>> +static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
>> +    .name = "virtio-balloon-device/free-page-report",
>> +    .version_id = 1,
>> +    .minimum_version_id = 1,
>> +    .needed = virtio_balloon_free_page_support,
>> +    .fields = (VMStateField[]) {
>> +        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
>> +        VMSTATE_END_OF_LIST()
>> +    }
>> +};
>> +
>>   static const VMStateDescription vmstate_virtio_balloon_device = {
>>       .name = "virtio-balloon-device",
>>       .version_id = 1,
>> @@ -420,30 +517,29 @@ static const VMStateDescription vmstate_virtio_balloon_device = {
>>           VMSTATE_UINT32(actual, VirtIOBalloon),
>>           VMSTATE_END_OF_LIST()
>>       },
>> +    .subsections = (const VMStateDescription * []) {
>> +        &vmstate_virtio_balloon_free_page_report,
>> +        NULL
>> +    }
>>   };
>>   
>>   static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
>>   {
>>       VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>>       VirtIOBalloon *s = VIRTIO_BALLOON(dev);
>> -    int ret;
>>   
>>       virtio_init(vdev, "virtio-balloon", VIRTIO_ID_BALLOON,
>>                   sizeof(struct virtio_balloon_config));
>>   
>> -    ret = qemu_add_balloon_handler(virtio_balloon_to_target,
>> -                                   virtio_balloon_stat, s);
>> -
>> -    if (ret < 0) {
>> -        error_setg(errp, "Only one balloon device is supported");
>> -        virtio_cleanup(vdev);
>> -        return;
>> -    }
>> -
>>       s->ivq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
>>       s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
>>       s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats);
>> -
>> +    if (virtio_has_feature(s->host_features,
>> +                           VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
>> +        s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE, NULL);
>> +        s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
>> +        s->free_page_report_cmd_id = VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID;
>> +    }
>>       reset_stats(s);
>>   }
>>   
>> @@ -472,11 +568,26 @@ static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status)
>>   {
>>       VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
>>   
>> -    if (!s->stats_vq_elem && vdev->vm_running &&
>> -        (status & VIRTIO_CONFIG_S_DRIVER_OK) && virtqueue_rewind(s->svq, 1)) {
>> -        /* poll stats queue for the element we have discarded when the VM
>> -         * was stopped */
>> -        virtio_balloon_receive_stats(vdev, s->svq);
>> +    if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
>> +        if (!s->stats_vq_elem && vdev->vm_running &&
>> +            virtqueue_rewind(s->svq, 1)) {
>> +            /*
>> +             * Poll stats queue for the element we have discarded when the VM
>> +             * was stopped.
>> +             */
>> +            virtio_balloon_receive_stats(vdev, s->svq);
>> +        }
>> +
>> +        if (virtio_balloon_free_page_support(s)) {
>> +            qemu_add_balloon_handler(virtio_balloon_to_target,
>> +                                     virtio_balloon_stat,
>> +                                     virtio_balloon_free_page_support,
>> +                                     virtio_balloon_free_page_poll,
>> +                                     s);
>> +        } else {
>> +            qemu_add_balloon_handler(virtio_balloon_to_target,
>> +                                     virtio_balloon_stat, NULL, NULL, s);
>> +        }
>>       }
>>   }
>>   
>> @@ -506,6 +617,8 @@ static const VMStateDescription vmstate_virtio_balloon = {
>>   static Property virtio_balloon_properties[] = {
>>       DEFINE_PROP_BIT("deflate-on-oom", VirtIOBalloon, host_features,
>>                       VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false),
>> +    DEFINE_PROP_BIT("free-page-hint", VirtIOBalloon, host_features,
>> +                    VIRTIO_BALLOON_F_FREE_PAGE_HINT, false),
>>       DEFINE_PROP_END_OF_LIST(),
>>   };
>>   
>> diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
>> index 1ea13bd..11b4e01 100644
>> --- a/include/hw/virtio/virtio-balloon.h
>> +++ b/include/hw/virtio/virtio-balloon.h
>> @@ -31,11 +31,20 @@ typedef struct virtio_balloon_stat_modern {
>>          uint64_t val;
>>   } VirtIOBalloonStatModern;
>>   
>> +enum virtio_balloon_free_page_report_status {
>> +    FREE_PAGE_REPORT_S_REQUESTED,
>> +    FREE_PAGE_REPORT_S_START,
>> +    FREE_PAGE_REPORT_S_STOP,
>> +};
>> +
>>   typedef struct VirtIOBalloon {
>>       VirtIODevice parent_obj;
>> -    VirtQueue *ivq, *dvq, *svq;
>> +    VirtQueue *ivq, *dvq, *svq, *free_page_vq;
>> +    uint32_t free_page_report_status;
>>       uint32_t num_pages;
>>       uint32_t actual;
>> +    uint32_t free_page_report_cmd_id;
>> +    uint32_t poison_val;
>>       uint64_t stats[VIRTIO_BALLOON_S_NR];
>>       VirtQueueElement *stats_vq_elem;
>>       size_t stats_vq_offset;
>> diff --git a/include/migration/misc.h b/include/migration/misc.h
>> index 77fd4f5..6df419c 100644
>> --- a/include/migration/misc.h
>> +++ b/include/migration/misc.h
>> @@ -14,11 +14,14 @@
>>   #ifndef MIGRATION_MISC_H
>>   #define MIGRATION_MISC_H
>>   
>> +#include "exec/cpu-common.h"
>>   #include "qemu/notify.h"
>>   
>>   /* migration/ram.c */
>>   
>>   void ram_mig_init(void);
>> +void skip_free_pages_from_dirty_bitmap(RAMBlock *block, ram_addr_t offset,
>> +                                       size_t len);
>>   
>>   /* migration/block.c */
>>   
>> diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h
>> index 9d06ccd..19d0d8b 100644
>> --- a/include/standard-headers/linux/virtio_balloon.h
>> +++ b/include/standard-headers/linux/virtio_balloon.h
>> @@ -34,15 +34,22 @@
>>   #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before reclaiming pages */
>>   #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
>>   #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
>> +#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */
>> +#define VIRTIO_BALLOON_F_PAGE_POISON	4 /* Guest is using page poisoning */
>>   
>>   /* Size of a PFN in the balloon interface. */
>>   #define VIRTIO_BALLOON_PFN_SHIFT 12
>>   
>> +#define VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID 0
>>   struct virtio_balloon_config {
>>   	/* Number of pages host wants Guest to give up. */
>>   	uint32_t num_pages;
>>   	/* Number of pages we've actually got in balloon. */
>>   	uint32_t actual;
>> +	/* Free page report command id, readonly by guest */
>> +	uint32_t free_page_report_cmd_id;
>> +	/* Stores PAGE_POISON if page poisoning is in use */
>> +	uint32_t poison_val;
>>   };
>>   
>>   #define VIRTIO_BALLOON_S_SWAP_IN  0   /* Amount of memory swapped in */
>> diff --git a/include/sysemu/balloon.h b/include/sysemu/balloon.h
>> index af49e19..53db4dc 100644
>> --- a/include/sysemu/balloon.h
>> +++ b/include/sysemu/balloon.h
>> @@ -18,11 +18,19 @@
>>   
>>   typedef void (QEMUBalloonEvent)(void *opaque, ram_addr_t target);
>>   typedef void (QEMUBalloonStatus)(void *opaque, BalloonInfo *info);
>> +typedef bool (QEMUBalloonFreePageSupport)(void *opaque);
>> +typedef void (QEMUBalloonFreePagePoll)(void *opaque);
>>   
>> -int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
>> -			     QEMUBalloonStatus *stat_func, void *opaque);
>>   void qemu_remove_balloon_handler(void *opaque);
>>   bool qemu_balloon_is_inhibited(void);
>>   void qemu_balloon_inhibit(bool state);
>> +bool balloon_free_page_support(void);
>> +void balloon_free_page_poll(void);
>> +
>> +void qemu_add_balloon_handler(QEMUBalloonEvent *event_fn,
>> +                              QEMUBalloonStatus *stat_fn,
>> +                              QEMUBalloonFreePageSupport *free_page_support_fn,
>> +                              QEMUBalloonFreePagePoll *free_page_poll_fn,
>> +                              void *opaque);
>>   
>>   #endif
>> diff --git a/migration/ram.c b/migration/ram.c
>> index cb1950f..d6f462c 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -2186,6 +2186,16 @@ static int ram_init_all(RAMState **rsp)
>>       return 0;
>>   }
>>   
>> +void skip_free_pages_from_dirty_bitmap(RAMBlock *block, ram_addr_t offset,
>> +                                       size_t len)
>> +{
>> +    long start = offset >> TARGET_PAGE_BITS,
>> +         nr = len >> TARGET_PAGE_BITS;
> start is the length calculated from length?
> What does this mean?
>
> I suspect a typo, and this makes me doubt this was tested
> under any kind of stress. When poking at the migration
> stream, you can't just test with a completely idle guest.
>
> After having reviewed 26 versions of the guest support code,
> I admit to a bit of a burnout. How about you work harder
> on isolating migration bits from virtio bits, then get
> migration core devs review these migration parts of
> the patchset?
>
>
>> +
>> +    bitmap_clear(block->bmap, start, nr);
>> +    ram_state->migration_dirty_pages -= nr;
>> +}
>> +
>>   /*
>>    * Each of ram_save_setup, ram_save_iterate and ram_save_complete has
>>    * long-running RCU critical section.  When rcu-reclaims in the code
>> -- 
>> 1.8.3.1

Thanks Michael and Dave for the comments. I've made changes based on the 
comments above. Please have a check of the v3 patches that were just 
posted out. Test results based on the latest implementation have also 
been added at the cover letter there.

Best,
Wei

^ permalink raw reply	[flat|nested] 40+ messages in thread

> due to while loop above.  Probably not what was intended.
>
>> +                break;
>> +            }
>> +        }
>> +
>> +        if (elem->in_num) {
>> +            RAMBlock *block;
>> +            ram_addr_t offset;
>> +            void *base;
>> +            size_t total_len, len;
>> +
>> +            if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START &&
>> +                !page_poisoning) {
> So when poisoning is enabled, you can't skip the pages?
>
> I suspect if poison is 0 you actually can.
>
>
>
>> +                base = elem->in_sg[0].iov_base;
>> +                total_len = elem->in_sg[0].iov_len;
>> +                len = total_len;
> the below list looks like it'd be a better API.
> How about an API that just gives hints to qemu?
> qemu_guest_page_free_hint(start, len)?
>
>> +                while (total_len > 0) {
>> +                    block = qemu_ram_block_from_host(base, false, &offset);
>> +                    if (unlikely(offset + total_len > block->used_length)) {
>> +                        len = block->used_length - offset;
>> +                        base += len;
>> +                    }
>> +                    skip_free_pages_from_dirty_bitmap(block, offset, len);
> For post-copy migration, this does not look like it will DTRT.
>
> Doesn't look like it will DTRT for tcg either.
>
> And generally, this only works as long as you don't call log_sync.
>
> Once you call log_sync, you can't rely on old hints, and you must increment
> command id.
>
> which isn't a good API, in my opinion.
>
>> +                    total_len -= len;
>> +                }
>> +            }
>> +            virtqueue_push(vq, elem, total_len);
>> +            g_free(elem);
>> +        }
>> +    }
>> +}
>> +
>> +static bool virtio_balloon_free_page_support(void *opaque)
>> +{
>> +    VirtIOBalloon *s = opaque;
>> +    VirtIODevice *vdev = VIRTIO_DEVICE(s);
>> +
>> +    return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT);
>> +}
>> +
>> +static void virtio_balloon_free_page_poll(void *opaque)
>> +{
>> +    VirtIOBalloon *s = opaque;
>> +    VirtIODevice *vdev = VIRTIO_DEVICE(s);
>> +
>> +    if (unlikely(s->free_page_report_cmd_id == UINT_MAX)) {
>> +        s->free_page_report_cmd_id = 1;
>> +    } else {
>> +        s->free_page_report_cmd_id++;
>> +    }
> I think it is prudent to reserve a range of command IDs that we don't use.
> These can then be used later for more commands.
>
> E.g. how about VIRTIO_BALLOON_FREE_PAGE_REPORT_MIN_ID 0x80000000
> and let the ID go between 0x80000000 and UINT_MAX?
>
> No guest changes needed, but patching uapi in linux to add the enum
> might not be a bad idea.
>
>
>> +
>> +    virtio_notify_config(vdev);
>> +    s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED;
>> +
>> +    virtio_balloon_poll_free_page_hints(s);
>> +}
>> +
>>   static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
>>   {
>>       VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
>> @@ -312,6 +395,7 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
>>   
>>       config.num_pages = cpu_to_le32(dev->num_pages);
>>       config.actual = cpu_to_le32(dev->actual);
>> +    config.free_page_report_cmd_id = cpu_to_le32(dev->free_page_report_cmd_id);
>>   
>>       trace_virtio_balloon_get_config(config.num_pages, config.actual);
>>       memcpy(config_data, &config, sizeof(struct virtio_balloon_config));
> You need to set config.poison_val as well.
>
>> @@ -365,6 +449,7 @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
>>                           ((ram_addr_t) dev->actual << VIRTIO_BALLOON_PFN_SHIFT),
>>                           &error_abort);
>>       }
>> +    dev->poison_val = le32_to_cpu(config.poison_val);
> This will have to be migrated.
>
>
>>       trace_virtio_balloon_set_config(dev->actual, oldactual);
>>   }
>>   
>> @@ -374,6 +459,7 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
>>       VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
>>       f |= dev->host_features;
>>       virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ);
>> +    virtio_add_feature(&f, VIRTIO_BALLOON_F_PAGE_POISON);
> Needs a compat property to avoid breaking cross-version migration.
>
>>       return f;
>>   }
>>   
>> @@ -410,6 +496,17 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
>>       return 0;
>>   }
>>   
>> +static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
>> +    .name = "virtio-balloon-device/free-page-report",
>> +    .version_id = 1,
>> +    .minimum_version_id = 1,
>> +    .needed = virtio_balloon_free_page_support,
>> +    .fields = (VMStateField[]) {
>> +        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
>> +        VMSTATE_END_OF_LIST()
>> +    }
>> +};
>> +
>>   static const VMStateDescription vmstate_virtio_balloon_device = {
>>       .name = "virtio-balloon-device",
>>       .version_id = 1,
>> @@ -420,30 +517,29 @@ static const VMStateDescription vmstate_virtio_balloon_device = {
>>           VMSTATE_UINT32(actual, VirtIOBalloon),
>>           VMSTATE_END_OF_LIST()
>>       },
>> +    .subsections = (const VMStateDescription * []) {
>> +        &vmstate_virtio_balloon_free_page_report,
>> +        NULL
>> +    }
>>   };
>>   
>>   static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
>>   {
>>       VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>>       VirtIOBalloon *s = VIRTIO_BALLOON(dev);
>> -    int ret;
>>   
>>       virtio_init(vdev, "virtio-balloon", VIRTIO_ID_BALLOON,
>>                   sizeof(struct virtio_balloon_config));
>>   
>> -    ret = qemu_add_balloon_handler(virtio_balloon_to_target,
>> -                                   virtio_balloon_stat, s);
>> -
>> -    if (ret < 0) {
>> -        error_setg(errp, "Only one balloon device is supported");
>> -        virtio_cleanup(vdev);
>> -        return;
>> -    }
>> -
>>       s->ivq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
>>       s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
>>       s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats);
>> -
>> +    if (virtio_has_feature(s->host_features,
>> +                           VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
>> +        s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE, NULL);
>> +        s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
>> +        s->free_page_report_cmd_id = VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID;
>> +    }
>>       reset_stats(s);
>>   }
>>   
>> @@ -472,11 +568,26 @@ static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status)
>>   {
>>       VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
>>   
>> -    if (!s->stats_vq_elem && vdev->vm_running &&
>> -        (status & VIRTIO_CONFIG_S_DRIVER_OK) && virtqueue_rewind(s->svq, 1)) {
>> -        /* poll stats queue for the element we have discarded when the VM
>> -         * was stopped */
>> -        virtio_balloon_receive_stats(vdev, s->svq);
>> +    if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
>> +        if (!s->stats_vq_elem && vdev->vm_running &&
>> +            virtqueue_rewind(s->svq, 1)) {
>> +            /*
>> +             * Poll stats queue for the element we have discarded when the VM
>> +             * was stopped.
>> +             */
>> +            virtio_balloon_receive_stats(vdev, s->svq);
>> +        }
>> +
>> +        if (virtio_balloon_free_page_support(s)) {
>> +            qemu_add_balloon_handler(virtio_balloon_to_target,
>> +                                     virtio_balloon_stat,
>> +                                     virtio_balloon_free_page_support,
>> +                                     virtio_balloon_free_page_poll,
>> +                                     s);
>> +        } else {
>> +            qemu_add_balloon_handler(virtio_balloon_to_target,
>> +                                     virtio_balloon_stat, NULL, NULL, s);
>> +        }
>>       }
>>   }
>>   
>> @@ -506,6 +617,8 @@ static const VMStateDescription vmstate_virtio_balloon = {
>>   static Property virtio_balloon_properties[] = {
>>       DEFINE_PROP_BIT("deflate-on-oom", VirtIOBalloon, host_features,
>>                       VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false),
>> +    DEFINE_PROP_BIT("free-page-hint", VirtIOBalloon, host_features,
>> +                    VIRTIO_BALLOON_F_FREE_PAGE_HINT, false),
>>       DEFINE_PROP_END_OF_LIST(),
>>   };
>>   
>> diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
>> index 1ea13bd..11b4e01 100644
>> --- a/include/hw/virtio/virtio-balloon.h
>> +++ b/include/hw/virtio/virtio-balloon.h
>> @@ -31,11 +31,20 @@ typedef struct virtio_balloon_stat_modern {
>>          uint64_t val;
>>   } VirtIOBalloonStatModern;
>>   
>> +enum virtio_balloon_free_page_report_status {
>> +    FREE_PAGE_REPORT_S_REQUESTED,
>> +    FREE_PAGE_REPORT_S_START,
>> +    FREE_PAGE_REPORT_S_STOP,
>> +};
>> +
>>   typedef struct VirtIOBalloon {
>>       VirtIODevice parent_obj;
>> -    VirtQueue *ivq, *dvq, *svq;
>> +    VirtQueue *ivq, *dvq, *svq, *free_page_vq;
>> +    uint32_t free_page_report_status;
>>       uint32_t num_pages;
>>       uint32_t actual;
>> +    uint32_t free_page_report_cmd_id;
>> +    uint32_t poison_val;
>>       uint64_t stats[VIRTIO_BALLOON_S_NR];
>>       VirtQueueElement *stats_vq_elem;
>>       size_t stats_vq_offset;
>> diff --git a/include/migration/misc.h b/include/migration/misc.h
>> index 77fd4f5..6df419c 100644
>> --- a/include/migration/misc.h
>> +++ b/include/migration/misc.h
>> @@ -14,11 +14,14 @@
>>   #ifndef MIGRATION_MISC_H
>>   #define MIGRATION_MISC_H
>>   
>> +#include "exec/cpu-common.h"
>>   #include "qemu/notify.h"
>>   
>>   /* migration/ram.c */
>>   
>>   void ram_mig_init(void);
>> +void skip_free_pages_from_dirty_bitmap(RAMBlock *block, ram_addr_t offset,
>> +                                       size_t len);
>>   
>>   /* migration/block.c */
>>   
>> diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h
>> index 9d06ccd..19d0d8b 100644
>> --- a/include/standard-headers/linux/virtio_balloon.h
>> +++ b/include/standard-headers/linux/virtio_balloon.h
>> @@ -34,15 +34,22 @@
>>   #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before reclaiming pages */
>>   #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
>>   #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
>> +#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */
>> +#define VIRTIO_BALLOON_F_PAGE_POISON	4 /* Guest is using page poisoning */
>>   
>>   /* Size of a PFN in the balloon interface. */
>>   #define VIRTIO_BALLOON_PFN_SHIFT 12
>>   
>> +#define VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID 0
>>   struct virtio_balloon_config {
>>   	/* Number of pages host wants Guest to give up. */
>>   	uint32_t num_pages;
>>   	/* Number of pages we've actually got in balloon. */
>>   	uint32_t actual;
>> +	/* Free page report command id, readonly by guest */
>> +	uint32_t free_page_report_cmd_id;
>> +	/* Stores PAGE_POISON if page poisoning is in use */
>> +	uint32_t poison_val;
>>   };
>>   
>>   #define VIRTIO_BALLOON_S_SWAP_IN  0   /* Amount of memory swapped in */
>> diff --git a/include/sysemu/balloon.h b/include/sysemu/balloon.h
>> index af49e19..53db4dc 100644
>> --- a/include/sysemu/balloon.h
>> +++ b/include/sysemu/balloon.h
>> @@ -18,11 +18,19 @@
>>   
>>   typedef void (QEMUBalloonEvent)(void *opaque, ram_addr_t target);
>>   typedef void (QEMUBalloonStatus)(void *opaque, BalloonInfo *info);
>> +typedef bool (QEMUBalloonFreePageSupport)(void *opaque);
>> +typedef void (QEMUBalloonFreePagePoll)(void *opaque);
>>   
>> -int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
>> -			     QEMUBalloonStatus *stat_func, void *opaque);
>>   void qemu_remove_balloon_handler(void *opaque);
>>   bool qemu_balloon_is_inhibited(void);
>>   void qemu_balloon_inhibit(bool state);
>> +bool balloon_free_page_support(void);
>> +void balloon_free_page_poll(void);
>> +
>> +void qemu_add_balloon_handler(QEMUBalloonEvent *event_fn,
>> +                              QEMUBalloonStatus *stat_fn,
>> +                              QEMUBalloonFreePageSupport *free_page_support_fn,
>> +                              QEMUBalloonFreePagePoll *free_page_poll_fn,
>> +                              void *opaque);
>>   
>>   #endif
>> diff --git a/migration/ram.c b/migration/ram.c
>> index cb1950f..d6f462c 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -2186,6 +2186,16 @@ static int ram_init_all(RAMState **rsp)
>>       return 0;
>>   }
>>   
>> +void skip_free_pages_from_dirty_bitmap(RAMBlock *block, ram_addr_t offset,
>> +                                       size_t len)
>> +{
>> +    long start = offset >> TARGET_PAGE_BITS,
>> +         nr = len >> TARGET_PAGE_BITS;
> start is the length calculated from length?
> What does this mean?
>
> I suspect a typo, and this makes me doubt this was tested
> under any kind of stress. When poking at the migration
> stream, you can't just test with a completely idle guest.
>
> After having reviewed 26 versions of the guest support code,
> I admit to a bit of a burnout. How about you work harder
> on isolating migration bits from virtio bits, then get
> migration core devs review these migration parts of
> the patchset?
>
>
>> +
>> +    bitmap_clear(block->bmap, start, nr);
>> +    ram_state->migration_dirty_pages -= nr;
>> +}
>> +
>>   /*
>>    * Each of ram_save_setup, ram_save_iterate and ram_save_complete has
>>    * long-running RCU critical section.  When rcu-reclaims in the code
>> -- 
>> 1.8.3.1

Thanks Michael and Dave for the comments. I've made changes based on the
comments above; please take a look at the v3 patches that were just
posted. Test results from the latest implementation have also been added
to the cover letter there.

Best,
Wei


