* [QEMU 0/7] Fast balloon and fast live migration
@ 2016-06-13 10:16 ` Liang Li
  0 siblings, 0 replies; 60+ messages in thread
From: Liang Li @ 2016-06-13 10:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, mst, lcapitulino, pbonzini, quintela, amit.shah, dgilbert, Liang Li

This patch set is intended to speed up the inflating/deflating
process of virtio-balloon and to speed up live migration by skipping
the transfer of the guest's free pages.

The virtio-balloon device is extended to support some new features
that make these operations faster.
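As a rough sketch of the free-page-skipping idea (a hypothetical helper, not code from this series): given a bitmap of guest free pages, the migration RAM loop would only transfer pages whose bit is clear.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: skip pages marked free in a guest-provided
 * bitmap. Returns the number of pages that would actually be sent. */
static size_t pages_to_send(const uint8_t *free_bitmap, size_t nr_pages)
{
    size_t sent = 0;

    for (size_t pfn = 0; pfn < nr_pages; pfn++) {
        int is_free = (free_bitmap[pfn / 8] >> (pfn % 8)) & 1;
        if (!is_free) {
            sent++;   /* only non-free pages go on the wire */
        }
    }
    return sent;
}
```

The real series works on QEMU's dirty/migration bitmaps rather than a per-page loop like this; the sketch only illustrates why skipping free pages shrinks the amount of data migrated.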

Liang Li (7):
  balloon: speed up inflating & deflating process
  virtio-balloon: add drop cache support
  Add the hmp and qmp interface for dropping cache
  balloon: get free page info from guest
  bitmap: Add a new bitmap_move function
  kvm: Add two new arch specific functions
  migration: skip free pages during live migration

 balloon.c                                       |  51 +++-
 hmp-commands.hx                                 |  15 ++
 hmp.c                                           |  22 ++
 hmp.h                                           |   3 +
 hw/virtio/virtio-balloon.c                      | 315 ++++++++++++++++++++++--
 include/hw/virtio/virtio-balloon.h              |  23 +-
 include/qemu/bitmap.h                           |  13 +
 include/standard-headers/linux/virtio_balloon.h |   2 +
 include/sysemu/balloon.h                        |  13 +-
 include/sysemu/kvm.h                            |   2 +
 migration/ram.c                                 |  93 +++++++
 monitor.c                                       |  18 ++
 qapi-schema.json                                |  35 +++
 qmp-commands.hx                                 |  23 ++
 target-arm/kvm.c                                |  14 ++
 target-i386/kvm.c                               |  35 +++
 target-mips/kvm.c                               |  14 ++
 target-ppc/kvm.c                                |  14 ++
 target-s390x/kvm.c                              |  14 ++
 19 files changed, 693 insertions(+), 26 deletions(-)

-- 
1.9.1




* [QEMU 1/7] balloon: speed up inflating & deflating process
  2016-06-13 10:16 ` [Qemu-devel] " Liang Li
@ 2016-06-13 10:16   ` Liang Li
  -1 siblings, 0 replies; 60+ messages in thread
From: Liang Li @ 2016-06-13 10:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, mst, lcapitulino, pbonzini, quintela, amit.shah, dgilbert, Liang Li

The current implementation of virtio-balloon is not very efficient.
Below is a breakdown of the time spent inflating the balloon to 3GB
on a 4GB idle guest:

a. allocating pages (6.5%, 103ms)
b. sending PFNs to host (68.3%, 787ms)
c. address translation (6.1%, 96ms)
d. madvise (19%, 300ms)

The whole inflating process takes about 1577ms to complete. The test
shows that the bottlenecks are stages b and d.

By using a bitmap to send the page info instead of a list of PFNs, we
can greatly reduce the overhead of stage b. Furthermore, it's possible
to do the address translation and the madvise on a bulk of pages
instead of page by page, so the overhead of stages c and d can also be
reduced considerably.
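A minimal sketch of the bulk idea, using simplified stand-ins for the kernel-style find_next_bit()/find_next_zero_bit() helpers (all names here are illustrative, not taken from the patch): scan the bitmap for runs of set bits, so each run can be covered by a single madvise over the whole range instead of one call per page. Note also that a bitmap needs 1 bit per page versus 32 bits per PFN, a 32x reduction for a dense range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for find_next_bit()/find_next_zero_bit():
 * returns the index of the first bit >= start with value `want`,
 * or `size` if none is found. */
static size_t next_bit(const uint8_t *bm, size_t size, size_t start, int want)
{
    for (size_t i = start; i < size; i++) {
        if (((bm[i / 8] >> (i % 8)) & 1) == want) {
            return i;
        }
    }
    return size;
}

/* Count runs of consecutive set bits; each run could be handled by
 * one madvise() over (run_len << page_shift) bytes rather than one
 * call per page. */
static size_t count_runs(const uint8_t *bm, size_t size)
{
    size_t runs = 0, cur = 0;

    while (cur < size) {
        size_t one = next_bit(bm, size, cur, 1);
        if (one >= size) {
            break;
        }
        size_t zero = next_bit(bm, size, one + 1, 0);
        runs++;
        cur = zero;
    }
    return runs;
}
```

The patch's balloon_bulk_pages() follows this shape, then maps each run to host addresses with a single memory_region_find() before calling qemu_madvise() on the whole chunk.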

This patch is the QEMU-side implementation; it speeds up the
inflating & deflating process by adding a new feature to the
virtio-balloon device. With it, inflating the balloon to 3GB of a 4GB
idle guest takes only about 210ms, roughly 8 times faster than before.

TODO: optimize stage a by allocating/freeing a chunk of pages instead
of a single page at a time.

Signed-off-by: Liang Li <liang.z.li@intel.com>
---
 hw/virtio/virtio-balloon.c                      | 159 ++++++++++++++++++++----
 include/standard-headers/linux/virtio_balloon.h |   1 +
 2 files changed, 139 insertions(+), 21 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 8c15e09..8cf74c2 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -47,6 +47,76 @@ static void balloon_page(void *addr, int deflate)
 #endif
 }
 
+static void do_balloon_bulk_pages(ram_addr_t base_pfn, int page_shift,
+                                  unsigned long len, bool deflate)
+{
+    ram_addr_t size, processed, chunk, base;
+    void *addr;
+    MemoryRegionSection section = {.mr = NULL};
+
+    size = (len << page_shift);
+    base = (base_pfn << page_shift);
+
+    for (processed = 0; processed < size; processed += chunk) {
+        chunk = size - processed;
+        while (chunk >= TARGET_PAGE_SIZE) {
+            section = memory_region_find(get_system_memory(),
+                                         base + processed, chunk);
+            if (!section.mr) {
+                chunk = QEMU_ALIGN_DOWN(chunk / 2, TARGET_PAGE_SIZE);
+            } else {
+                break;
+            }
+        }
+
+        if (section.mr &&
+            (int128_nz(section.size) && memory_region_is_ram(section.mr))) {
+            addr = section.offset_within_region +
+                   memory_region_get_ram_ptr(section.mr);
+            qemu_madvise(addr, chunk,
+                         deflate ? QEMU_MADV_WILLNEED : QEMU_MADV_DONTNEED);
+        } else {
+            fprintf(stderr, "can't find the chunk, skip\n");
+            chunk = TARGET_PAGE_SIZE;
+        }
+    }
+}
+
+static void balloon_bulk_pages(ram_addr_t base_pfn, unsigned long *bitmap,
+                               unsigned long len, int page_shift, bool deflate)
+{
+#if defined(__linux__)
+    unsigned long end  = len * 8;
+    unsigned long current = 0;
+
+    if (!qemu_balloon_is_inhibited() && (!kvm_enabled() ||
+                                         kvm_has_sync_mmu())) {
+        while (current < end) {
+            unsigned long one = find_next_bit(bitmap, end, current);
+
+            if (one < end) {
+                unsigned long zero = find_next_zero_bit(bitmap, end, one + 1);
+                unsigned long page_length;
+
+                if (zero >= end) {
+                    page_length = end - one;
+                } else {
+                    page_length = zero - one;
+                }
+
+                if (page_length) {
+                    do_balloon_bulk_pages(base_pfn + one, page_shift,
+                                          page_length, deflate);
+                }
+                current = one + page_length;
+            } else {
+                current = one;
+            }
+        }
+    }
+#endif
+}
+
 static const char *balloon_stat_names[] = {
    [VIRTIO_BALLOON_S_SWAP_IN] = "stat-swap-in",
    [VIRTIO_BALLOON_S_SWAP_OUT] = "stat-swap-out",
@@ -78,6 +148,12 @@ static bool balloon_stats_supported(const VirtIOBalloon *s)
     return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_STATS_VQ);
 }
 
+static bool balloon_page_bitmap_supported(const VirtIOBalloon *s)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+    return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_PAGE_BITMAP);
+}
+
 static bool balloon_stats_enabled(const VirtIOBalloon *s)
 {
     return s->stats_poll_interval > 0;
@@ -224,27 +300,66 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
             return;
         }
 
-        while (iov_to_buf(elem->out_sg, elem->out_num, offset, &pfn, 4) == 4) {
-            ram_addr_t pa;
-            ram_addr_t addr;
-            int p = virtio_ldl_p(vdev, &pfn);
-
-            pa = (ram_addr_t) p << VIRTIO_BALLOON_PFN_SHIFT;
-            offset += 4;
-
-            /* FIXME: remove get_system_memory(), but how? */
-            section = memory_region_find(get_system_memory(), pa, 1);
-            if (!int128_nz(section.size) || !memory_region_is_ram(section.mr))
-                continue;
-
-            trace_virtio_balloon_handle_output(memory_region_name(section.mr),
-                                               pa);
-            /* Using memory_region_get_ram_ptr is bending the rules a bit, but
-               should be OK because we only want a single page.  */
-            addr = section.offset_within_region;
-            balloon_page(memory_region_get_ram_ptr(section.mr) + addr,
-                         !!(vq == s->dvq));
-            memory_region_unref(section.mr);
+        if (balloon_page_bitmap_supported(s)) {
+            uint64_t base_pfn, tmp64, bmap_len;
+            uint32_t tmp32, page_shift, id;
+            unsigned long *bitmap;
+
+            iov_to_buf(elem->out_sg, elem->out_num, offset,
+                       &tmp32, sizeof(uint32_t));
+            id = virtio_ldl_p(vdev, &tmp32);
+            offset += sizeof(uint32_t);
+            /* to suppress build warning */
+            id = id;
+
+            iov_to_buf(elem->out_sg, elem->out_num, offset,
+                       &tmp32, sizeof(uint32_t));
+            page_shift = virtio_ldl_p(vdev, &tmp32);
+            offset += sizeof(uint32_t);
+
+            iov_to_buf(elem->out_sg, elem->out_num, offset,
+                       &tmp64, sizeof(uint64_t));
+            base_pfn = virtio_ldq_p(vdev, &tmp64);
+            offset += sizeof(uint64_t);
+
+            iov_to_buf(elem->out_sg, elem->out_num, offset,
+                       &tmp64, sizeof(uint64_t));
+            bmap_len = virtio_ldq_p(vdev, &tmp64);
+            offset += sizeof(uint64_t);
+
+            bitmap = bitmap_new(bmap_len * BITS_PER_BYTE);
+            iov_to_buf(elem->out_sg, elem->out_num, offset,
+                       bitmap, bmap_len);
+            offset += bmap_len;
+
+            balloon_bulk_pages(base_pfn, bitmap, bmap_len,
+                               page_shift, !!(vq == s->dvq));
+            g_free(bitmap);
+        } else {
+            while (iov_to_buf(elem->out_sg, elem->out_num, offset,
+                              &pfn, 4) == 4) {
+                ram_addr_t pa;
+                ram_addr_t addr;
+                int p = virtio_ldl_p(vdev, &pfn);
+
+                pa = (ram_addr_t) p << VIRTIO_BALLOON_PFN_SHIFT;
+                offset += 4;
+
+                /* FIXME: remove get_system_memory(), but how? */
+                section = memory_region_find(get_system_memory(), pa, 1);
+                if (!int128_nz(section.size) ||
+                    !memory_region_is_ram(section.mr))
+                    continue;
+
+                trace_virtio_balloon_handle_output(memory_region_name(
+                                                            section.mr), pa);
+                /* Using memory_region_get_ram_ptr is bending the rules a bit,
+                 * but should be OK because we only want a single page.  */
+                addr = section.offset_within_region;
+                balloon_page(memory_region_get_ram_ptr(section.mr) + addr,
+                             !!(vq == s->dvq));
+                memory_region_unref(section.mr);
+            }
         }
 
         virtqueue_push(vq, elem, offset);
@@ -374,6 +489,7 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
     VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
     f |= dev->host_features;
     virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ);
+    virtio_add_feature(&f, VIRTIO_BALLOON_F_PAGE_BITMAP);
     return f;
 }
 
@@ -388,6 +504,7 @@ static void virtio_balloon_to_target(void *opaque, ram_addr_t target)
 {
     VirtIOBalloon *dev = VIRTIO_BALLOON(opaque);
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+
     ram_addr_t vm_ram_size = get_current_ram_size();
 
     if (target > vm_ram_size) {
diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h
index 9d06ccd..7c9686c 100644
--- a/include/standard-headers/linux/virtio_balloon.h
+++ b/include/standard-headers/linux/virtio_balloon.h
@@ -34,6 +34,7 @@
 #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before reclaiming pages */
 #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
 #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
+#define VIRTIO_BALLOON_F_PAGE_BITMAP  3 /* Use page bitmap to send page info */
 
 /* Size of a PFN in the balloon interface. */
 #define VIRTIO_BALLOON_PFN_SHIFT 12
-- 
1.9.1




* [QEMU 2/7] virtio-balloon: add drop cache support
  2016-06-13 10:16 ` [Qemu-devel] " Liang Li
@ 2016-06-13 10:16   ` Liang Li
  -1 siblings, 0 replies; 60+ messages in thread
From: Liang Li @ 2016-06-13 10:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, mst, lcapitulino, pbonzini, quintela, amit.shah, dgilbert, Liang Li

virtio-balloon can use the amount of free memory to determine how
much memory should be filled into the balloon, but the amount of free
memory is affected by the page cache, which can be reclaimed. Dropping
the cache before reading the amount of free memory helps reflect the
exact amount of memory that can be reclaimed.

This patch adds a new feature to the balloon device to support this
operation: the hypervisor can request the VM to drop its cache so as
to reclaim more memory.
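A small sketch of how such a request could be laid out on the wire. The struct mirrors the MiscReq {id, param} layout and the BALLOON_DROP_CACHE id (value 0, first enum member) that this patch introduces; the little-endian encoding and the helper name are assumptions for illustration, and `param` would carry the drop-cache type analogous to what a guest writes to /proc/sys/vm/drop_caches.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the request layout this patch defines (MiscReq in
 * include/hw/virtio/virtio-balloon.h): 32-bit id + 32-bit parameter. */
enum balloon_req_id {
    BALLOON_DROP_CACHE = 0,
};

typedef struct {
    uint32_t id;
    uint32_t param;   /* drop-cache type, e.g. what the guest would
                       * write to /proc/sys/vm/drop_caches */
} MiscReq;

/* Hypothetical helper: serialize the request as the host would place
 * it into the misc virtqueue buffer, little-endian as for a modern
 * virtio device. */
static void encode_drop_cache_req(uint8_t out[8], uint32_t type)
{
    MiscReq req = { .id = BALLOON_DROP_CACHE, .param = type };
    uint32_t v[2] = { req.id, req.param };

    for (int w = 0; w < 2; w++) {
        for (int b = 0; b < 4; b++) {
            out[w * 4 + b] = (uint8_t)(v[w] >> (8 * b));
        }
    }
}
```

In the patch itself, virtio_balloon_drop_cache() fills s->misc_req and pushes it through the new misc virtqueue with iov_from_buf(); the guest answers on the same queue and virtio_balloon_handle_resp() marks the request REQ_DONE.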

Signed-off-by: Liang Li <liang.z.li@intel.com>
---
 balloon.c                                       | 10 ++-
 hw/virtio/virtio-balloon.c                      | 85 ++++++++++++++++++++++++-
 include/hw/virtio/virtio-balloon.h              | 19 +++++-
 include/standard-headers/linux/virtio_balloon.h |  1 +
 include/sysemu/balloon.h                        |  5 +-
 5 files changed, 115 insertions(+), 5 deletions(-)

diff --git a/balloon.c b/balloon.c
index f2ef50c..0fb34bf 100644
--- a/balloon.c
+++ b/balloon.c
@@ -36,6 +36,7 @@
 
 static QEMUBalloonEvent *balloon_event_fn;
 static QEMUBalloonStatus *balloon_stat_fn;
+static QEMUBalloonDropCache *balloon_drop_cache_fn;
 static void *balloon_opaque;
 static bool balloon_inhibited;
 
@@ -65,9 +66,12 @@ static bool have_balloon(Error **errp)
 }
 
 int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
-                             QEMUBalloonStatus *stat_func, void *opaque)
+                             QEMUBalloonStatus *stat_func,
+                             QEMUBalloonDropCache *drop_cache_func,
+                             void *opaque)
 {
-    if (balloon_event_fn || balloon_stat_fn || balloon_opaque) {
+    if (balloon_event_fn || balloon_stat_fn || balloon_drop_cache_fn
+        || balloon_opaque) {
         /* We're already registered one balloon handler.  How many can
          * a guest really have?
          */
@@ -75,6 +79,7 @@ int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
     }
     balloon_event_fn = event_func;
     balloon_stat_fn = stat_func;
+    balloon_drop_cache_fn = drop_cache_func;
     balloon_opaque = opaque;
     return 0;
 }
@@ -86,6 +91,7 @@ void qemu_remove_balloon_handler(void *opaque)
     }
     balloon_event_fn = NULL;
     balloon_stat_fn = NULL;
+    balloon_drop_cache_fn = NULL;
     balloon_opaque = NULL;
 }
 
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 8cf74c2..4757ba5 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -36,6 +36,10 @@
 
 #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
 
+enum balloon_req_id {
+       BALLOON_DROP_CACHE,
+};
+
 static void balloon_page(void *addr, int deflate)
 {
 #if defined(__linux__)
@@ -154,6 +158,12 @@ static bool balloon_page_bitmap_supported(const VirtIOBalloon *s)
     return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_PAGE_BITMAP);
 }
 
+static bool balloon_misc_supported(const VirtIOBalloon *s)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+    return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_MISC);
+}
+
 static bool balloon_stats_enabled(const VirtIOBalloon *s)
 {
     return s->stats_poll_interval > 0;
@@ -420,6 +430,39 @@ out:
     }
 }
 
+static void virtio_balloon_handle_resp(VirtIODevice *vdev, VirtQueue *vq)
+{
+    VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
+    VirtQueueElement *elem;
+    size_t offset = 0;
+    uint32_t tmp32, id = 0;
+
+    elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
+    if (!elem) {
+        s->req_status = REQ_ERROR;
+        return;
+    }
+
+    s->misc_vq_elem = elem;
+
+    if (!elem->out_num) {
+        return;
+    }
+
+    iov_to_buf(elem->out_sg, elem->out_num, offset,
+               &tmp32, sizeof(uint32_t));
+    id = virtio_ldl_p(vdev, &tmp32);
+    offset += sizeof(uint32_t);
+    switch (id) {
+    case BALLOON_DROP_CACHE:
+        s->req_status = REQ_DONE;
+        break;
+    default:
+        break;
+    }
+
+}
+
 static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
 {
     VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
@@ -490,6 +533,7 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
     f |= dev->host_features;
     virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ);
     virtio_add_feature(&f, VIRTIO_BALLOON_F_PAGE_BITMAP);
+    virtio_add_feature(&f, VIRTIO_BALLOON_F_MISC);
     return f;
 }
 
@@ -500,6 +544,36 @@ static void virtio_balloon_stat(void *opaque, BalloonInfo *info)
                                              VIRTIO_BALLOON_PFN_SHIFT);
 }
 
+static int virtio_balloon_drop_cache(void *opaque, unsigned long type)
+{
+    VirtIOBalloon *s = opaque;
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+    VirtQueueElement *elem = s->misc_vq_elem;
+    int len;
+
+    if (!balloon_misc_supported(s)) {
+        return REQ_UNSUPPORT;
+    }
+
+    if (elem == NULL || !elem->in_num) {
+        elem = virtqueue_pop(s->mvq, sizeof(VirtQueueElement));
+        if (!elem) {
+            return REQ_ERROR;
+        }
+        s->misc_vq_elem = elem;
+    }
+    s->misc_req.id = BALLOON_DROP_CACHE;
+    s->misc_req.param = type;
+    len = iov_from_buf(elem->in_sg, elem->in_num, 0, &s->misc_req,
+                       sizeof(s->misc_req));
+    virtqueue_push(s->mvq, elem, len);
+    virtio_notify(vdev, s->mvq);
+    g_free(s->misc_vq_elem);
+    s->misc_vq_elem = NULL;
+
+    return REQ_DONE;
+}
+
 static void virtio_balloon_to_target(void *opaque, ram_addr_t target)
 {
     VirtIOBalloon *dev = VIRTIO_BALLOON(opaque);
@@ -562,7 +636,8 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
                 sizeof(struct virtio_balloon_config));
 
     ret = qemu_add_balloon_handler(virtio_balloon_to_target,
-                                   virtio_balloon_stat, s);
+                                   virtio_balloon_stat,
+                                   virtio_balloon_drop_cache, s);
 
     if (ret < 0) {
         error_setg(errp, "Only one balloon device is supported");
@@ -573,8 +648,10 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
     s->ivq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
     s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
     s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats);
+    s->mvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_resp);
 
     reset_stats(s);
+    s->req_status = REQ_INIT;
 
     register_savevm(dev, "virtio-balloon", -1, 1,
                     virtio_balloon_save, virtio_balloon_load, s);
@@ -599,6 +676,12 @@ static void virtio_balloon_device_reset(VirtIODevice *vdev)
         g_free(s->stats_vq_elem);
         s->stats_vq_elem = NULL;
     }
+
+    if (s->misc_vq_elem != NULL) {
+        g_free(s->misc_vq_elem);
+        s->misc_vq_elem = NULL;
+    }
+    s->req_status = REQ_INIT;
 }
 
 static void virtio_balloon_instance_init(Object *obj)
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
index 35f62ac..a21bb45 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -23,6 +23,20 @@
 #define VIRTIO_BALLOON(obj) \
         OBJECT_CHECK(VirtIOBalloon, (obj), TYPE_VIRTIO_BALLOON)
 
+typedef enum {
+    REQ_INIT,
+    REQ_ON_GOING,
+    REQ_DONE,
+    REQ_ERROR,
+    REQ_INVALID_PARAM,
+    REQ_UNSUPPORT,
+} BalloonReqStatus;
+
+typedef struct GetFreePageReq {
+    uint32_t id;
+    uint32_t param;
+} MiscReq;
+
 typedef struct virtio_balloon_stat VirtIOBalloonStat;
 
 typedef struct virtio_balloon_stat_modern {
@@ -33,16 +47,19 @@ typedef struct virtio_balloon_stat_modern {
 
 typedef struct VirtIOBalloon {
     VirtIODevice parent_obj;
-    VirtQueue *ivq, *dvq, *svq;
+    VirtQueue *ivq, *dvq, *svq, *mvq;
     uint32_t num_pages;
     uint32_t actual;
     uint64_t stats[VIRTIO_BALLOON_S_NR];
     VirtQueueElement *stats_vq_elem;
+    VirtQueueElement *misc_vq_elem;
     size_t stats_vq_offset;
     QEMUTimer *stats_timer;
     int64_t stats_last_update;
     int64_t stats_poll_interval;
     uint32_t host_features;
+    MiscReq misc_req;
+    BalloonReqStatus req_status;
 } VirtIOBalloon;
 
 #endif
diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h
index 7c9686c..c8b254f 100644
--- a/include/standard-headers/linux/virtio_balloon.h
+++ b/include/standard-headers/linux/virtio_balloon.h
@@ -35,6 +35,7 @@
 #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
 #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
 #define VIRTIO_BALLOON_F_PAGE_BITMAP  3 /* Use page bitmap to send page info */
+#define VIRTIO_BALLOON_F_MISC    4 /* Send request and get misc info */
 
 /* Size of a PFN in the balloon interface. */
 #define VIRTIO_BALLOON_PFN_SHIFT 12
diff --git a/include/sysemu/balloon.h b/include/sysemu/balloon.h
index 3f976b4..0e85f2b 100644
--- a/include/sysemu/balloon.h
+++ b/include/sysemu/balloon.h
@@ -18,9 +18,12 @@
 
 typedef void (QEMUBalloonEvent)(void *opaque, ram_addr_t target);
 typedef void (QEMUBalloonStatus)(void *opaque, BalloonInfo *info);
+typedef int (QEMUBalloonDropCache)(void *opaque, unsigned long ctrl);
 
 int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
-			     QEMUBalloonStatus *stat_func, void *opaque);
+                             QEMUBalloonStatus *stat_func,
+                             QEMUBalloonDropCache *drop_cache_func,
+                             void *opaque);
 void qemu_remove_balloon_handler(void *opaque);
 bool qemu_balloon_is_inhibited(void);
 void qemu_balloon_inhibit(bool state);
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 60+ messages in thread
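The two-field MiscReq added in this patch is copied into the guest-visible buffer with iov_from_buf(), each field in the transport's (little-endian) byte order. A minimal standalone sketch of that serialization follows; the helper names here are illustrative, not part of the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the MiscReq layout from the patch: two 32-bit fields. */
struct misc_req {
    uint32_t id;    /* e.g. BALLOON_DROP_CACHE */
    uint32_t param; /* request-specific parameter */
};

/* Store a 32-bit value in little-endian byte order, as the virtio
 * load/store helpers do for a little-endian transport. */
static void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = v & 0xff;
    p[1] = (v >> 8) & 0xff;
    p[2] = (v >> 16) & 0xff;
    p[3] = (v >> 24) & 0xff;
}

/* Serialize a request into the 8-byte buffer the guest will read. */
static size_t misc_req_serialize(const struct misc_req *req, uint8_t *buf)
{
    store_le32(buf, req->id);
    store_le32(buf + 4, req->param);
    return 2 * sizeof(uint32_t);
}
```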

* [QEMU 3/7] Add the hmp and qmp interface for dropping cache
  2016-06-13 10:16 ` [Qemu-devel] " Liang Li
@ 2016-06-13 10:16   ` Liang Li
  -1 siblings, 0 replies; 60+ messages in thread
From: Liang Li @ 2016-06-13 10:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, mst, lcapitulino, pbonzini, quintela, amit.shah, dgilbert, Liang Li

Add the HMP and QMP interfaces for dropping the VM's page cache;
users can control which type of cache they want the VM to drop.

Signed-off-by: Liang Li <liang.z.li@intel.com>
---
 balloon.c        | 19 +++++++++++++++++++
 hmp-commands.hx  | 15 +++++++++++++++
 hmp.c            | 22 ++++++++++++++++++++++
 hmp.h            |  3 +++
 monitor.c        | 18 ++++++++++++++++++
 qapi-schema.json | 35 +++++++++++++++++++++++++++++++++++
 qmp-commands.hx  | 23 +++++++++++++++++++++++
 7 files changed, 135 insertions(+)
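As a sketch of what the HMP handler does with the new command, the name-to-type resolution over the generated DropCacheType_lookup table can be modeled standalone. The table contents below are assumed from the QAPI enum this patch introduces; the function name is illustrative:

```c
#include <assert.h>
#include <string.h>

/* Assumed contents of the generated DropCacheType_lookup table,
 * per the QAPI enum { 'clean', 'slab', 'all' } in this patch. */
static const char *const drop_cache_type_lookup[] = { "clean", "slab", "all" };
#define DROP_CACHE_TYPE_MAX 3

/* Resolve a cache-type name to its enum value, the way
 * hmp_balloon_drop_cache scans the lookup table; returns -1 for an
 * unknown name. Illustrative sketch, not the QEMU implementation. */
static int drop_cache_type_from_name(const char *name)
{
    int i;

    for (i = 0; i < DROP_CACHE_TYPE_MAX; i++) {
        if (strcmp(name, drop_cache_type_lookup[i]) == 0) {
            return i;
        }
    }
    return -1;
}
```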

diff --git a/balloon.c b/balloon.c
index 0fb34bf..3d96111 100644
--- a/balloon.c
+++ b/balloon.c
@@ -122,3 +122,22 @@ void qmp_balloon(int64_t target, Error **errp)
     trace_balloon_event(balloon_opaque, target);
     balloon_event_fn(balloon_opaque, target);
 }
+
+void qmp_balloon_drop_cache(DropCacheType type, Error **errp)
+{
+    if (!have_balloon(errp)) {
+        return;
+    }
+
+    if (!balloon_drop_cache_fn) {
+        error_setg(errp, QERR_UNSUPPORTED);
+        return;
+    }
+    if (type < 0 || type >= DROP_CACHE_TYPE__MAX) {
+        error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "type",
+                   "a value in range [0, 2]");
+        return;
+    }
+
+    balloon_drop_cache_fn(balloon_opaque, type);
+}
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 98b4b1a..c73572c 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1378,6 +1378,21 @@ Request VM to change its memory allocation to @var{value} (in MB).
 ETEXI
 
     {
+        .name       = "balloon_drop_cache",
+        .args_type  = "type:s",
+        .params     = "type",
+        .help       = "request VM to drop its page caches",
+        .mhandler.cmd = hmp_balloon_drop_cache,
+        .command_completion = balloon_drop_cache_completion
+    },
+
+STEXI
+@item balloon_drop_cache @var{type}
+@findex balloon_drop_cache
+Request VM to drop its page caches.
+ETEXI
+
+    {
         .name       = "set_link",
         .args_type  = "name:s,up:b",
         .params     = "name on|off",
diff --git a/hmp.c b/hmp.c
index a4b1d3d..3aa1062 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1061,6 +1061,28 @@ void hmp_balloon(Monitor *mon, const QDict *qdict)
     }
 }
 
+void hmp_balloon_drop_cache(Monitor *mon, const QDict *qdict)
+{
+    const char *type = qdict_get_str(qdict, "type");
+    Error *err = NULL;
+    int i;
+
+    for (i = 0; i < DROP_CACHE_TYPE__MAX; i++) {
+        if (strcmp(type, DropCacheType_lookup[i]) == 0) {
+            qmp_balloon_drop_cache(i, &err);
+            break;
+        }
+    }
+
+    if (i == DROP_CACHE_TYPE__MAX) {
+        error_setg(&err, QERR_INVALID_PARAMETER, type);
+    }
+
+    if (err) {
+        error_report_err(err);
+    }
+}
+
 void hmp_block_resize(Monitor *mon, const QDict *qdict)
 {
     const char *device = qdict_get_str(qdict, "device");
diff --git a/hmp.h b/hmp.h
index 093d65f..6bb6499 100644
--- a/hmp.h
+++ b/hmp.h
@@ -55,6 +55,7 @@ void hmp_nmi(Monitor *mon, const QDict *qdict);
 void hmp_set_link(Monitor *mon, const QDict *qdict);
 void hmp_block_passwd(Monitor *mon, const QDict *qdict);
 void hmp_balloon(Monitor *mon, const QDict *qdict);
+void hmp_balloon_drop_cache(Monitor *mon, const QDict *qdict);
 void hmp_block_resize(Monitor *mon, const QDict *qdict);
 void hmp_snapshot_blkdev(Monitor *mon, const QDict *qdict);
 void hmp_snapshot_blkdev_internal(Monitor *mon, const QDict *qdict);
@@ -120,6 +121,8 @@ void watchdog_action_completion(ReadLineState *rs, int nb_args,
                                 const char *str);
 void migrate_set_capability_completion(ReadLineState *rs, int nb_args,
                                        const char *str);
+void balloon_drop_cache_completion(ReadLineState *rs, int nb_args,
+                                   const char *str);
 void migrate_set_parameter_completion(ReadLineState *rs, int nb_args,
                                       const char *str);
 void host_net_add_completion(ReadLineState *rs, int nb_args, const char *str);
diff --git a/monitor.c b/monitor.c
index a27e115..eefdf3d 100644
--- a/monitor.c
+++ b/monitor.c
@@ -3367,6 +3367,24 @@ void migrate_set_parameter_completion(ReadLineState *rs, int nb_args,
     }
 }
 
+void balloon_drop_cache_completion(ReadLineState *rs, int nb_args,
+                                   const char *str)
+{
+    size_t len;
+
+    len = strlen(str);
+    readline_set_completion_index(rs, len);
+    if (nb_args == 2) {
+        int i;
+        for (i = 0; i < DROP_CACHE_TYPE__MAX; i++) {
+            const char *name = DropCacheType_lookup[i];
+            if (!strncmp(str, name, len)) {
+                readline_add_completion(rs, name);
+            }
+        }
+    }
+}
+
 void host_net_add_completion(ReadLineState *rs, int nb_args, const char *str)
 {
     int i;
diff --git a/qapi-schema.json b/qapi-schema.json
index 8483bdf..117f70a 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1655,6 +1655,41 @@
 { 'command': 'balloon', 'data': {'value': 'int'} }
 
 ##
+# @DropCacheType
+#
+# Cache types enumeration
+#
+# @clean: Drop the clean page cache.
+#
+# @slab: Drop the slab cache.
+#
+# @all: Drop both the clean and the slab cache.
+#
+# Since: 2.7
+##
+{ 'enum': 'DropCacheType', 'data': ['clean', 'slab', 'all'] }
+
+##
+# @balloon_drop_cache:
+#
+# Request the VM to drop its cache.
+#
+# @value: the type of cache the VM should drop
+#
+# Returns: Nothing on success
+#          If the balloon driver is enabled but not functional because the KVM
+#            kernel module cannot support it, KvmMissingCap
+#          If no balloon device is present, DeviceNotActive
+#
+# Notes: This command just issues a request to the guest.  When it returns,
+#        the drop cache operation may not have completed.  A guest can drop its
+#        cache independent of this command.
+#
+# Since: 2.7
+##
+{ 'command': 'balloon_drop_cache', 'data': {'value': 'DropCacheType'} }
+
+##
 # @Abort
 #
 # This action can be used to test transaction failure.
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 28801a2..6650ba0 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1802,6 +1802,29 @@ Example:
 EQMP
 
     {
+        .name       = "balloon_drop_cache",
+        .args_type  = "value:i",
+        .mhandler.cmd_new = qmp_marshal_balloon_drop_cache,
+    },
+
+SQMP
+balloon_drop_cache
+------------------
+
+Request VM to drop its cache.
+
+Arguments:
+
+- "value": cache type to drop (json-int)
+
+Example:
+
+-> { "execute": "balloon_drop_cache", "arguments": { "value": 1 } }
+<- { "return": {} }
+
+EQMP
+
+    {
         .name       = "set_link",
         .args_type  = "name:s,up:b",
         .mhandler.cmd_new = qmp_marshal_set_link,
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [QEMU 4/7] balloon: get free page info from guest
  2016-06-13 10:16 ` [Qemu-devel] " Liang Li
@ 2016-06-13 10:16   ` Liang Li
  -1 siblings, 0 replies; 60+ messages in thread
From: Liang Li @ 2016-06-13 10:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, mst, lcapitulino, pbonzini, quintela, amit.shah, dgilbert, Liang Li

Add a new feature to get the free page information from the guest;
the free page information is saved in a bitmap. Please note that
'free page' only means a page was free at the time of the request;
some of these pages may no longer be free by the time the free page
bitmap has been sent to QEMU.

Signed-off-by: Liang Li <liang.z.li@intel.com>
---
 balloon.c                          | 24 +++++++++++-
 hw/virtio/virtio-balloon.c         | 75 +++++++++++++++++++++++++++++++++++++-
 include/hw/virtio/virtio-balloon.h |  4 ++
 include/sysemu/balloon.h           |  8 ++++
 4 files changed, 108 insertions(+), 3 deletions(-)
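The guest answers a BALLOON_GET_FREE_PAGES request with a (base_pfn, page_shift, bitmap) triple, as decoded in virtio_balloon_handle_resp in this patch. A minimal standalone sketch of how a consumer could interpret that triple; the helper names are illustrative, not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Guest-physical address of the page described by bit 'bit' of the
 * free-page bitmap, given the base PFN and page shift reported by
 * the guest. */
static uint64_t free_page_addr(uint64_t base_pfn, uint32_t page_shift,
                               uint64_t bit)
{
    return (base_pfn + bit) << page_shift;
}

/* Count set bits, i.e. pages the guest reported as free at the time
 * of the request. */
static unsigned long count_free(const uint64_t *bmap, unsigned long nbits)
{
    unsigned long i, n = 0;

    for (i = 0; i < nbits; i++) {
        if (bmap[i / 64] & (1ULL << (i % 64))) {
            n++;
        }
    }
    return n;
}
```

Because pages can be reused by the guest after the bitmap is built, a consumer such as the migration code can only treat set bits as a hint that a page may be skipped, never as a guarantee.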

diff --git a/balloon.c b/balloon.c
index 3d96111..c74c472 100644
--- a/balloon.c
+++ b/balloon.c
@@ -37,6 +37,7 @@
 static QEMUBalloonEvent *balloon_event_fn;
 static QEMUBalloonStatus *balloon_stat_fn;
 static QEMUBalloonDropCache *balloon_drop_cache_fn;
+static QEMUBalloonGetFreePage *balloon_get_free_page_fn;
 static void *balloon_opaque;
 static bool balloon_inhibited;
 
@@ -68,10 +69,11 @@ static bool have_balloon(Error **errp)
 int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
                              QEMUBalloonStatus *stat_func,
                              QEMUBalloonDropCache *drop_cache_func,
+                             QEMUBalloonGetFreePage *get_free_page_func,
                              void *opaque)
 {
     if (balloon_event_fn || balloon_stat_fn || balloon_drop_cache_fn
-        || balloon_opaque) {
+        || balloon_get_free_page_fn || balloon_opaque) {
         /* We're already registered one balloon handler.  How many can
          * a guest really have?
          */
@@ -80,6 +82,7 @@ int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
     balloon_event_fn = event_func;
     balloon_stat_fn = stat_func;
     balloon_drop_cache_fn = drop_cache_func;
+    balloon_get_free_page_fn = get_free_page_func;
     balloon_opaque = opaque;
     return 0;
 }
@@ -92,6 +95,7 @@ void qemu_remove_balloon_handler(void *opaque)
     balloon_event_fn = NULL;
     balloon_stat_fn = NULL;
     balloon_drop_cache_fn = NULL;
+    balloon_get_free_page_fn = NULL;
     balloon_opaque = NULL;
 }
 
@@ -141,3 +145,21 @@ void qmp_balloon_drop_cache(DropCacheType type, Error **errp)
 
     balloon_drop_cache_fn(balloon_opaque, type);
 }
+
+bool balloon_free_pages_support(void)
+{
+    return balloon_get_free_page_fn ? true : false;
+}
+
+BalloonReqStatus balloon_get_free_pages(unsigned long *bitmap, unsigned long len)
+{
+    if (!balloon_get_free_page_fn) {
+        return REQ_UNSUPPORT;
+    }
+
+    if (!bitmap) {
+        return REQ_INVALID_PARAM;
+    }
+
+    return balloon_get_free_page_fn(balloon_opaque, bitmap, len);
+}
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 4757ba5..30ba074 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -38,6 +38,7 @@
 
 enum balloon_req_id {
        BALLOON_DROP_CACHE,
+       BALLOON_GET_FREE_PAGES,
 };
 
 static void balloon_page(void *addr, int deflate)
@@ -435,7 +436,8 @@ static void virtio_balloon_handle_resp(VirtIODevice *vdev, VirtQueue *vq)
     VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
     VirtQueueElement *elem;
     size_t offset = 0;
-    uint32_t tmp32, id = 0;
+    uint32_t tmp32, id = 0, page_shift;
+    uint64_t base_pfn, tmp64, bmap_len;
 
     elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
     if (!elem) {
@@ -457,6 +459,32 @@ static void virtio_balloon_handle_resp(VirtIODevice *vdev, VirtQueue *vq)
     case BALLOON_DROP_CACHE:
         s->req_status = REQ_DONE;
         break;
+    case BALLOON_GET_FREE_PAGES:
+        iov_to_buf(elem->out_sg, elem->out_num, offset,
+                   &tmp32, sizeof(uint32_t));
+        page_shift = virtio_ldl_p(vdev, &tmp32);
+        offset += sizeof(uint32_t);
+        s->page_shift = page_shift;
+
+        iov_to_buf(elem->out_sg, elem->out_num, offset,
+                   &tmp64, sizeof(uint64_t));
+        base_pfn = virtio_ldq_p(vdev, &tmp64);
+        offset += sizeof(uint64_t);
+        s->base_pfn = base_pfn;
+
+        iov_to_buf(elem->out_sg, elem->out_num, offset,
+                   &tmp64, sizeof(uint64_t));
+        bmap_len = virtio_ldq_p(vdev, &tmp64);
+        offset += sizeof(uint64_t);
+        if (s->bmap_len < bmap_len) {
+             s->req_status = REQ_INVALID_PARAM;
+             return;
+        }
+
+        iov_to_buf(elem->out_sg, elem->out_num, offset,
+                   s->free_page_bmap, bmap_len);
+        s->req_status = REQ_DONE;
+        break;
     default:
         break;
     }
@@ -574,6 +602,48 @@ static int virtio_balloon_drop_cache(void *opaque, unsigned long type)
     return REQ_DONE;
 }
 
+static BalloonReqStatus virtio_balloon_free_pages(void *opaque,
+                                                  unsigned long *bitmap,
+                                                  unsigned long bmap_len)
+{
+    VirtIOBalloon *s = opaque;
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+    VirtQueueElement *elem = s->misc_vq_elem;
+    int len;
+
+    if (!balloon_misc_supported(s)) {
+        return REQ_UNSUPPORT;
+    }
+
+    if (s->req_status == REQ_INIT) {
+        s->free_page_bmap = bitmap;
+        if (elem == NULL || !elem->in_num) {
+            elem = virtqueue_pop(s->mvq, sizeof(VirtQueueElement));
+            if (!elem) {
+                return REQ_ERROR;
+            }
+            s->misc_vq_elem = elem;
+        }
+        s->misc_req.id = BALLOON_GET_FREE_PAGES;
+        s->misc_req.param = 0;
+        s->bmap_len = bmap_len;
+        len = iov_from_buf(elem->in_sg, elem->in_num, 0, &s->misc_req,
+                           sizeof(s->misc_req));
+        virtqueue_push(s->mvq, elem, len);
+        virtio_notify(vdev, s->mvq);
+        g_free(s->misc_vq_elem);
+        s->misc_vq_elem = NULL;
+        s->req_status = REQ_ON_GOING;
+        return REQ_DONE;
+    } else if (s->req_status == REQ_ON_GOING) {
+        return REQ_ON_GOING;
+    } else if (s->req_status == REQ_DONE) {
+        s->req_status = REQ_INIT;
+    }
+
+    return REQ_DONE;
+}
+
 static void virtio_balloon_to_target(void *opaque, ram_addr_t target)
 {
     VirtIOBalloon *dev = VIRTIO_BALLOON(opaque);
@@ -637,7 +707,8 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
 
     ret = qemu_add_balloon_handler(virtio_balloon_to_target,
                                    virtio_balloon_stat,
-                                   virtio_balloon_drop_cache, s);
+                                   virtio_balloon_drop_cache,
+                                   virtio_balloon_free_pages, s);
 
     if (ret < 0) {
         error_setg(errp, "Only one balloon device is supported");
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
index a21bb45..6382bcf 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -60,6 +60,10 @@ typedef struct VirtIOBalloon {
     uint32_t host_features;
     MiscReq misc_req;
     BalloonReqStatus req_status;
+    uint64_t *free_page_bmap;
+    uint64_t bmap_len;
+    uint64_t base_pfn;
+    uint32_t page_shift;
 } VirtIOBalloon;
 
 #endif
diff --git a/include/sysemu/balloon.h b/include/sysemu/balloon.h
index 0e85f2b..6c362e8 100644
--- a/include/sysemu/balloon.h
+++ b/include/sysemu/balloon.h
@@ -15,17 +15,25 @@
 #define _QEMU_BALLOON_H
 
 #include "qapi-types.h"
+#include "hw/virtio/virtio-balloon.h"
 
 typedef void (QEMUBalloonEvent)(void *opaque, ram_addr_t target);
 typedef void (QEMUBalloonStatus)(void *opaque, BalloonInfo *info);
 typedef int (QEMUBalloonDropCache)(void *opaque, unsigned long ctrl);
+typedef BalloonReqStatus (QEMUBalloonGetFreePage)(void *opaque,
+                                                  unsigned long *bitmap,
+                                                  unsigned long len);
 
 int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
                              QEMUBalloonStatus *stat_func,
                              QEMUBalloonDropCache *drop_cache_func,
+                             QEMUBalloonGetFreePage *get_free_page_func,
                              void *opaque);
 void qemu_remove_balloon_handler(void *opaque);
 bool qemu_balloon_is_inhibited(void);
 void qemu_balloon_inhibit(bool state);
+bool balloon_free_pages_support(void);
+BalloonReqStatus balloon_get_free_pages(unsigned long *bitmap,
+                                        unsigned long len);
 
 #endif
-- 
1.9.1


* [QEMU 5/7] bitmap: Add a new bitmap_move function
  2016-06-13 10:16 ` [Qemu-devel] " Liang Li
@ 2016-06-13 10:16   ` Liang Li
  -1 siblings, 0 replies; 60+ messages in thread
From: Liang Li @ 2016-06-13 10:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, mst, lcapitulino, pbonzini, quintela, amit.shah, dgilbert, Liang Li

Sometimes it is necessary to move a portion of a bitmap to another
place within a larger bitmap. If the source and destination regions
overlap, bitmap_copy() cannot do this correctly, so add a new
function based on memmove() to handle that case.

Signed-off-by: Liang Li <liang.z.li@intel.com>
---
 include/qemu/bitmap.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
index ec5146f..6ac89ca 100644
--- a/include/qemu/bitmap.h
+++ b/include/qemu/bitmap.h
@@ -37,6 +37,7 @@
  * bitmap_set(dst, pos, nbits)			Set specified bit area
  * bitmap_set_atomic(dst, pos, nbits)   Set specified bit area with atomic ops
  * bitmap_clear(dst, pos, nbits)		Clear specified bit area
+ * bitmap_move(dst, src, nbits)                 Move *src to *dst
  * bitmap_test_and_clear_atomic(dst, pos, nbits)    Test and clear area
  * bitmap_find_next_zero_area(buf, len, pos, n, mask)	Find bit free area
  */
@@ -136,6 +137,18 @@ static inline void bitmap_copy(unsigned long *dst, const unsigned long *src,
     }
 }
 
+static inline void bitmap_move(unsigned long *dst, const unsigned long *src,
+                               long nbits)
+{
+    if (small_nbits(nbits)) {
+        unsigned long tmp = *src;
+        *dst = tmp;
+    } else {
+        long len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
+        memmove(dst, src, len);
+    }
+}
+
 static inline int bitmap_and(unsigned long *dst, const unsigned long *src1,
                              const unsigned long *src2, long nbits)
 {
-- 
1.9.1



* [QEMU 6/7] kvm: Add two new arch specific functions
  2016-06-13 10:16 ` [Qemu-devel] " Liang Li
@ 2016-06-13 10:16   ` Liang Li
  -1 siblings, 0 replies; 60+ messages in thread
From: Liang Li @ 2016-06-13 10:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, mst, lcapitulino, pbonzini, quintela, amit.shah, dgilbert, Liang Li

Add a new function to get the VM's max PFN and a new function to
filter out the memory holes in order to get a tight free page bitmap.
They are implemented for x86; the other architectures should
implement them to benefit from this live migration optimization.

Signed-off-by: Liang Li <liang.z.li@intel.com>
---
 include/sysemu/kvm.h |  2 ++
 target-arm/kvm.c     | 14 ++++++++++++++
 target-i386/kvm.c    | 35 +++++++++++++++++++++++++++++++++++
 target-mips/kvm.c    | 14 ++++++++++++++
 target-ppc/kvm.c     | 14 ++++++++++++++
 target-s390x/kvm.c   | 14 ++++++++++++++
 6 files changed, 93 insertions(+)

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index ad6f837..50915f9 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -230,6 +230,8 @@ int kvm_remove_breakpoint(CPUState *cpu, target_ulong addr,
                           target_ulong len, int type);
 void kvm_remove_all_breakpoints(CPUState *cpu);
 int kvm_update_guest_debug(CPUState *cpu, unsigned long reinject_trap);
+unsigned long *tighten_guest_free_page_bmap(unsigned long *bmap);
+unsigned long get_guest_max_pfn(void);
 #ifndef _WIN32
 int kvm_set_signal_mask(CPUState *cpu, const sigset_t *sigset);
 #endif
diff --git a/target-arm/kvm.c b/target-arm/kvm.c
index 83da447..6464542 100644
--- a/target-arm/kvm.c
+++ b/target-arm/kvm.c
@@ -627,3 +627,17 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
 {
     return (data - 32) & 0xffff;
 }
+
+unsigned long get_guest_max_pfn(void)
+{
+    /* To be done */
+
+    return 0;
+}
+
+unsigned long *tighten_guest_free_page_bmap(unsigned long *bmap)
+{
+    /* To be done */
+
+    return bmap;
+}
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index abf50e6..0b394cb 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -3327,3 +3327,38 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
 {
     abort();
 }
+
+unsigned long get_guest_max_pfn(void)
+{
+    PCMachineState *pcms = PC_MACHINE(current_machine);
+    ram_addr_t above_4g_mem = pcms->above_4g_mem_size;
+    unsigned long max_pfn;
+
+    if (above_4g_mem) {
+        max_pfn = ((1ULL << 32) + above_4g_mem) >> TARGET_PAGE_BITS;
+    } else {
+        max_pfn = pcms->below_4g_mem_size >> TARGET_PAGE_BITS;
+    }
+
+    return max_pfn;
+}
+
+unsigned long *tighten_guest_free_page_bmap(unsigned long *bmap)
+{
+    PCMachineState *pcms = PC_MACHINE(current_machine);
+    ram_addr_t above_4g_mem = pcms->above_4g_mem_size;
+
+    if (above_4g_mem) {
+        unsigned long *src, *dst, len, pos;
+        ram_addr_t below_4g_mem = pcms->below_4g_mem_size;
+        src = bmap + ((1ULL << 32) >> TARGET_PAGE_BITS) / BITS_PER_LONG;
+        dst = bmap + (below_4g_mem >> TARGET_PAGE_BITS) / BITS_PER_LONG;
+        bitmap_move(dst, src, above_4g_mem >> TARGET_PAGE_BITS);
+
+        pos = (above_4g_mem + below_4g_mem) >> TARGET_PAGE_BITS;
+        len = ((1ULL << 32) - below_4g_mem) >> TARGET_PAGE_BITS;
+        bitmap_clear(bmap, pos, len);
+    }
+
+    return bmap;
+}
diff --git a/target-mips/kvm.c b/target-mips/kvm.c
index a854e4d..89a54e5 100644
--- a/target-mips/kvm.c
+++ b/target-mips/kvm.c
@@ -1048,3 +1048,17 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
 {
     abort();
 }
+
+unsigned long get_guest_max_pfn(void)
+{
+    /* To be done */
+
+    return 0;
+}
+
+unsigned long *tighten_guest_free_page_bmap(unsigned long *bmap)
+{
+    /* To be done */
+
+    return bmap;
+}
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 24d6032..e222b31 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -2579,3 +2579,17 @@ int kvmppc_enable_hwrng(void)
 
     return kvmppc_enable_hcall(kvm_state, H_RANDOM);
 }
+
+unsigned long get_guest_max_pfn(void)
+{
+    /* To be done */
+
+    return 0;
+}
+
+unsigned long *tighten_guest_free_page_bmap(unsigned long *bmap)
+{
+    /* To be done */
+
+    return bmap;
+}
diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
index 8f46fd0..893755b 100644
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -2271,3 +2271,17 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
 {
     abort();
 }
+
+unsigned long get_guest_max_pfn(void)
+{
+    /* To be done */
+
+    return 0;
+}
+
+unsigned long *tighten_guest_free_page_bmap(unsigned long *bmap)
+{
+    /* To be done */
+
+    return bmap;
+}
-- 
1.9.1



* [QEMU 7/7] migration: skip free pages during live migration
  2016-06-13 10:16 ` [Qemu-devel] " Liang Li
@ 2016-06-13 10:16   ` Liang Li
  -1 siblings, 0 replies; 60+ messages in thread
From: Liang Li @ 2016-06-13 10:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, mst, lcapitulino, pbonzini, quintela, amit.shah, dgilbert, Liang Li

After sending out the request for free pages, the live migration
process starts without waiting for the free page bitmap to become
ready. If the free page bitmap is not ready by the time of the
first migration_bitmap_sync() after ram_save_setup(), the free
page bitmap is ignored, which means the free pages are not
filtered out in that case.
The current implementation does not work with postcopy; if
postcopy is enabled, we simply ignore the free pages. Support for
postcopy will be added later.

Signed-off-by: Liang Li <liang.z.li@intel.com>
---
 migration/ram.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 93 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 844ea46..5f1c3ff 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -43,6 +43,8 @@
 #include "trace.h"
 #include "exec/ram_addr.h"
 #include "qemu/rcu_queue.h"
+#include "sysemu/balloon.h"
+#include "sysemu/kvm.h"
 
 #ifdef DEBUG_MIGRATION_RAM
 #define DPRINTF(fmt, ...) \
@@ -228,6 +230,7 @@ static QemuMutex migration_bitmap_mutex;
 static uint64_t migration_dirty_pages;
 static uint32_t last_version;
 static bool ram_bulk_stage;
+static bool ignore_freepage_rsp;
 
 /* used by the search for pages to send */
 struct PageSearchStatus {
@@ -244,6 +247,7 @@ static struct BitmapRcu {
     struct rcu_head rcu;
     /* Main migration bitmap */
     unsigned long *bmap;
+    unsigned long *free_page_bmap;
     /* bitmap of pages that haven't been sent even once
      * only maintained and used in postcopy at the moment
      * where it's used to send the dirtymap at the start
@@ -639,6 +643,7 @@ static void migration_bitmap_sync(void)
     rcu_read_unlock();
     qemu_mutex_unlock(&migration_bitmap_mutex);
 
+    ignore_freepage_rsp = true;
     trace_migration_bitmap_sync_end(migration_dirty_pages
                                     - num_dirty_pages_init);
     num_dirty_pages_period += migration_dirty_pages - num_dirty_pages_init;
@@ -1417,6 +1422,7 @@ static void migration_bitmap_free(struct BitmapRcu *bmap)
 {
     g_free(bmap->bmap);
     g_free(bmap->unsentmap);
+    g_free(bmap->free_page_bmap);
     g_free(bmap);
 }
 
@@ -1487,6 +1493,85 @@ void migration_bitmap_extend(ram_addr_t old, ram_addr_t new)
     }
 }
 
+static void filter_out_guest_free_page(unsigned long *free_page_bmap,
+                                       long nbits)
+{
+    long i, page_count = 0, len;
+    unsigned long *bitmap;
+
+    tighten_guest_free_page_bmap(free_page_bmap);
+    qemu_mutex_lock(&migration_bitmap_mutex);
+    bitmap = atomic_rcu_read(&migration_bitmap_rcu)->bmap;
+    slow_bitmap_complement(bitmap, free_page_bmap, nbits);
+
+    len = (last_ram_offset() >> TARGET_PAGE_BITS) / BITS_PER_LONG;
+    for (i = 0; i < len; i++) {
+        page_count += hweight_long(bitmap[i]);
+    }
+
+    migration_dirty_pages = page_count;
+    qemu_mutex_unlock(&migration_bitmap_mutex);
+}
+
+static void ram_request_free_page(unsigned long *bmap, unsigned long max_pfn)
+{
+    BalloonReqStatus status;
+
+    status = balloon_get_free_pages(bmap, max_pfn);
+    switch (status) {
+    case REQ_DONE:
+        ignore_freepage_rsp = false;
+        break;
+    case REQ_ERROR:
+        error_report("Error happened when requesting free pages");
+        break;
+    default:
+        error_report("unexpected response status: %d", status);
+        break;
+    }
+}
+
+static void ram_handle_free_page(void)
+{
+    unsigned long nbits;
+    RAMBlock *pc_ram_block;
+    BalloonReqStatus status;
+
+    status = balloon_get_free_pages(migration_bitmap_rcu->free_page_bmap,
+                                    get_guest_max_pfn());
+    switch (status) {
+    case REQ_DONE:
+        rcu_read_lock();
+        pc_ram_block = QLIST_FIRST_RCU(&ram_list.blocks);
+        nbits = pc_ram_block->used_length >> TARGET_PAGE_BITS;
+        filter_out_guest_free_page(migration_bitmap_rcu->free_page_bmap, nbits);
+        rcu_read_unlock();
+
+        qemu_mutex_lock_iothread();
+        migration_bitmap_sync();
+        qemu_mutex_unlock_iothread();
+        /*
+         * The bulk stage assumes (in migration_bitmap_find_and_reset_dirty)
+         * that every page is dirty; that's no longer true at this point.
+         */
+        ram_bulk_stage = false;
+        last_seen_block = NULL;
+        last_sent_block = NULL;
+        last_offset = 0;
+        break;
+    case REQ_ERROR:
+        ignore_freepage_rsp = true;
+        error_report("failed to get free page");
+        break;
+    case REQ_INVALID_PARAM:
+        ignore_freepage_rsp = true;
+        error_report("buffer overflow");
+        break;
+    default:
+        break;
+    }
+}
+
 /*
  * 'expected' is the value you expect the bitmap mostly to be full
  * of; it won't bother printing lines that are all this value.
@@ -1950,6 +2035,11 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
     qemu_mutex_unlock_ramlist();
     qemu_mutex_unlock_iothread();
 
+    if (balloon_free_pages_support() && !migrate_postcopy_ram()) {
+        unsigned long max_pfn = get_guest_max_pfn();
+        migration_bitmap_rcu->free_page_bmap = bitmap_new(max_pfn);
+        ram_request_free_page(migration_bitmap_rcu->free_page_bmap, max_pfn);
+    }
     qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
 
     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
@@ -1990,6 +2080,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
     while ((ret = qemu_file_rate_limit(f)) == 0) {
         int pages;
 
+        if (!ignore_freepage_rsp) {
+            ram_handle_free_page();
+        }
         pages = ram_find_and_save_block(f, false, &bytes_transferred);
         /* no more pages to sent */
         if (pages == 0) {
-- 
1.9.1



* [Qemu-devel] [QEMU 7/7] migration: skip free pages during live migration
@ 2016-06-13 10:16   ` Liang Li
  0 siblings, 0 replies; 60+ messages in thread
From: Liang Li @ 2016-06-13 10:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, mst, lcapitulino, pbonzini, quintela, amit.shah, dgilbert, Liang Li

After sending out the request for free pages, the live migration
process starts without waiting for the free page bitmap to become
ready. If the free page bitmap is not ready by the time of the
first migration_bitmap_sync() after ram_save_setup(), the free
page bitmap is ignored, which means the free pages are not
filtered out in that case.
The current implementation does not work with postcopy; if
postcopy is enabled, we simply ignore the free pages. Support for
postcopy will be added later.

Signed-off-by: Liang Li <liang.z.li@intel.com>
---
 migration/ram.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 93 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 844ea46..5f1c3ff 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -43,6 +43,8 @@
 #include "trace.h"
 #include "exec/ram_addr.h"
 #include "qemu/rcu_queue.h"
+#include "sysemu/balloon.h"
+#include "sysemu/kvm.h"
 
 #ifdef DEBUG_MIGRATION_RAM
 #define DPRINTF(fmt, ...) \
@@ -228,6 +230,7 @@ static QemuMutex migration_bitmap_mutex;
 static uint64_t migration_dirty_pages;
 static uint32_t last_version;
 static bool ram_bulk_stage;
+static bool ignore_freepage_rsp;
 
 /* used by the search for pages to send */
 struct PageSearchStatus {
@@ -244,6 +247,7 @@ static struct BitmapRcu {
     struct rcu_head rcu;
     /* Main migration bitmap */
     unsigned long *bmap;
+    unsigned long *free_page_bmap;
     /* bitmap of pages that haven't been sent even once
      * only maintained and used in postcopy at the moment
      * where it's used to send the dirtymap at the start
@@ -639,6 +643,7 @@ static void migration_bitmap_sync(void)
     rcu_read_unlock();
     qemu_mutex_unlock(&migration_bitmap_mutex);
 
+    ignore_freepage_rsp = true;
     trace_migration_bitmap_sync_end(migration_dirty_pages
                                     - num_dirty_pages_init);
     num_dirty_pages_period += migration_dirty_pages - num_dirty_pages_init;
@@ -1417,6 +1422,7 @@ static void migration_bitmap_free(struct BitmapRcu *bmap)
 {
     g_free(bmap->bmap);
     g_free(bmap->unsentmap);
+    g_free(bmap->free_page_bmap);
     g_free(bmap);
 }
 
@@ -1487,6 +1493,85 @@ void migration_bitmap_extend(ram_addr_t old, ram_addr_t new)
     }
 }
 
+static void filter_out_guest_free_page(unsigned long *free_page_bmap,
+                                       long nbits)
+{
+    long i, page_count = 0, len;
+    unsigned long *bitmap;
+
+    tighten_guest_free_page_bmap(free_page_bmap);
+    qemu_mutex_lock(&migration_bitmap_mutex);
+    bitmap = atomic_rcu_read(&migration_bitmap_rcu)->bmap;
+    slow_bitmap_complement(bitmap, free_page_bmap, nbits);
+
+    len = (last_ram_offset() >> TARGET_PAGE_BITS) / BITS_PER_LONG;
+    for (i = 0; i < len; i++) {
+        page_count += hweight_long(bitmap[i]);
+    }
+
+    migration_dirty_pages = page_count;
+    qemu_mutex_unlock(&migration_bitmap_mutex);
+}
+
+static void ram_request_free_page(unsigned long *bmap, unsigned long max_pfn)
+{
+    BalloonReqStatus status;
+
+    status = balloon_get_free_pages(bmap, max_pfn);
+    switch (status) {
+    case REQ_DONE:
+        ignore_freepage_rsp = false;
+        break;
+    case REQ_ERROR:
+        error_report("Errro happend when request free page");
+        break;
+    default:
+        error_report("unexpected response status: %d", status);
+        break;
+    }
+}
+
+static void ram_handle_free_page(void)
+{
+    unsigned long nbits;
+    RAMBlock *pc_ram_block;
+    BalloonReqStatus status;
+
+    status = balloon_get_free_pages(migration_bitmap_rcu->free_page_bmap,
+                                    get_guest_max_pfn());
+    switch (status) {
+    case REQ_DONE:
+        rcu_read_lock();
+        pc_ram_block = QLIST_FIRST_RCU(&ram_list.blocks);
+        nbits = pc_ram_block->used_length >> TARGET_PAGE_BITS;
+        filter_out_guest_free_page(migration_bitmap_rcu->free_page_bmap, nbits);
+        rcu_read_unlock();
+
+        qemu_mutex_lock_iothread();
+        migration_bitmap_sync();
+        qemu_mutex_unlock_iothread();
+        /*
+         * bulk stage assumes in (migration_bitmap_find_and_reset_dirty) that
+         * every page is dirty, that's no longer ture at this point.
+         */
+        ram_bulk_stage = false;
+        last_seen_block = NULL;
+        last_sent_block = NULL;
+        last_offset = 0;
+        break;
+    case REQ_ERROR:
+        ignore_freepage_rsp = true;
+        error_report("failed to get free page");
+        break;
+    case REQ_INVALID_PARAM:
+        ignore_freepage_rsp = true;
+        error_report("buffer overflow");
+        break;
+    default:
+        break;
+    }
+}
+
 /*
  * 'expected' is the value you expect the bitmap mostly to be full
  * of; it won't bother printing lines that are all this value.
@@ -1950,6 +2035,11 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
     qemu_mutex_unlock_ramlist();
     qemu_mutex_unlock_iothread();
 
+    if (balloon_free_pages_support() && !migrate_postcopy_ram()) {
+        unsigned long max_pfn = get_guest_max_pfn();
+        migration_bitmap_rcu->free_page_bmap = bitmap_new(max_pfn);
+        ram_request_free_page(migration_bitmap_rcu->free_page_bmap, max_pfn);
+    }
     qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
 
     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
@@ -1990,6 +2080,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
     while ((ret = qemu_file_rate_limit(f)) == 0) {
         int pages;
 
+        if (!ignore_freepage_rsp) {
+            ram_handle_free_page();
+        }
         pages = ram_find_and_save_block(f, false, &bytes_transferred);
         /* no more pages to sent */
         if (pages == 0) {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [Qemu-devel] [QEMU 3/7] Add the hmp and qmp interface for dropping cache
  2016-06-13 10:16   ` [Qemu-devel] " Liang Li
  (?)
@ 2016-06-13 10:50   ` Daniel P. Berrange
  2016-06-13 11:06     ` Daniel P. Berrange
                       ` (2 more replies)
  -1 siblings, 3 replies; 60+ messages in thread
From: Daniel P. Berrange @ 2016-06-13 10:50 UTC (permalink / raw)
  To: Liang Li
  Cc: qemu-devel, kvm, quintela, mst, dgilbert, lcapitulino, amit.shah,
	pbonzini

On Mon, Jun 13, 2016 at 06:16:45PM +0800, Liang Li wrote:
> Add the hmp and qmp interface to drop vm's page cache, users
> can control the type of cache they want vm to drop.
> 
> Signed-off-by: Liang Li <liang.z.li@intel.com>
> ---
>  balloon.c        | 19 +++++++++++++++++++
>  hmp-commands.hx  | 15 +++++++++++++++
>  hmp.c            | 22 ++++++++++++++++++++++
>  hmp.h            |  3 +++
>  monitor.c        | 18 ++++++++++++++++++
>  qapi-schema.json | 35 +++++++++++++++++++++++++++++++++++
>  qmp-commands.hx  | 23 +++++++++++++++++++++++
>  7 files changed, 135 insertions(+)

> diff --git a/qapi-schema.json b/qapi-schema.json
> index 8483bdf..117f70a 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -1655,6 +1655,41 @@
>  { 'command': 'balloon', 'data': {'value': 'int'} }
>  
>  ##
> +# @DropCacheType
> +#
> +# Cache types enumeration
> +#
> +# @clean: Drop the clean page cache.
> +#
> +# @slab: Drop the slab cache.
> +#
> +# @all: Drop both the clean and the slab cache.
> +#
> +# Since: 2.7
> +##
> +{ 'enum': 'DropCacheType', 'data': ['clean', 'slab', 'all'] }

Presumably these constants correspond to the 3 options
for the vm.drop_caches sysctl knob

[quote]
To free pagecache, use:

  echo 1 > /proc/sys/vm/drop_caches

To free dentries and inodes, use:

  echo 2 > /proc/sys/vm/drop_caches

To free pagecache, dentries and inodes, use:

  echo 3 > /proc/sys/vm/drop_caches

Because writing to this file is a nondestructive
operation and dirty objects are not freeable, the
user should run sync(1) first.
[/quote]

IOW, by 'slab' you mean dentries and inodes ?

> +
> +##
> +# @balloon_drop_cache:
> +#
> +# Request the vm to drop its cache.
> +#
> +# @value: the type of cache want vm to drop
> +#
> +# Returns: Nothing on success
> +#          If the balloon driver is enabled but not functional because the KVM
> +#            kernel module cannot support it, KvmMissingCap
> +#          If no balloon device is present, DeviceNotActive
> +#
> +# Notes: This command just issues a request to the guest.  When it returns,
> +#        the drop cache operation may not have completed.  A guest can drop its
> +#        cache independent of this command.
> +#
> +# Since: 2.7.0
> +##
> +{ 'command': 'balloon_drop_cache', 'data': {'value': 'DropCacheType'} }

Also, as noted in the man page quote above, it is recommended to call
sync() to minimise dirty pages. Should we have a way to request a sync
as part of this monitor command?

More generally, it feels like this is taking us down a path towards
actively managing the guest kernel VM from the host. Is this really
a path we want to be going down, given that it's going to take us into
increasingly non-portable concepts which are potentially different for
each guest OS kernel?  Is this drop caches feature at all applicable
to Windows, OS-X, *BSD guest OS impls of the balloon driver? If it
is applicable, are the 3 fixed constants you've defined at all useful
to those other OSes?

I'm wary of us taking a design path which is so Linux specific it
isn't useful elsewhere. IOW, just because we can do this, doesn't mean
we should do this...

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Qemu-devel] [QEMU 3/7] Add the hmp and qmp interface for dropping cache
  2016-06-13 10:50   ` Daniel P. Berrange
@ 2016-06-13 11:06     ` Daniel P. Berrange
  2016-06-13 14:12         ` Li, Liang Z
  2016-06-13 11:41     ` Paolo Bonzini
  2016-06-13 13:50       ` Li, Liang Z
  2 siblings, 1 reply; 60+ messages in thread
From: Daniel P. Berrange @ 2016-06-13 11:06 UTC (permalink / raw)
  To: Liang Li
  Cc: kvm, mst, qemu-devel, quintela, dgilbert, lcapitulino, amit.shah,
	pbonzini

On Mon, Jun 13, 2016 at 11:50:08AM +0100, Daniel P. Berrange wrote:
> On Mon, Jun 13, 2016 at 06:16:45PM +0800, Liang Li wrote:
> > Add the hmp and qmp interface to drop vm's page cache, users
> > can control the type of cache they want vm to drop.
> > 
> > Signed-off-by: Liang Li <liang.z.li@intel.com>
> > ---
> >  balloon.c        | 19 +++++++++++++++++++
> >  hmp-commands.hx  | 15 +++++++++++++++
> >  hmp.c            | 22 ++++++++++++++++++++++
> >  hmp.h            |  3 +++
> >  monitor.c        | 18 ++++++++++++++++++
> >  qapi-schema.json | 35 +++++++++++++++++++++++++++++++++++
> >  qmp-commands.hx  | 23 +++++++++++++++++++++++
> >  7 files changed, 135 insertions(+)
> 
> > diff --git a/qapi-schema.json b/qapi-schema.json
> > index 8483bdf..117f70a 100644
> > --- a/qapi-schema.json
> > +++ b/qapi-schema.json
> > @@ -1655,6 +1655,41 @@
> >  { 'command': 'balloon', 'data': {'value': 'int'} }
> >  
> >  ##
> > +# @DropCacheType
> > +#
> > +# Cache types enumeration
> > +#
> > +# @clean: Drop the clean page cache.
> > +#
> > +# @slab: Drop the slab cache.
> > +#
> > +# @all: Drop both the clean and the slab cache.
> > +#
> > +# Since: 2.7
> > +##
> > +{ 'enum': 'DropCacheType', 'data': ['clean', 'slab', 'all'] }
> 
> Presumably these constants correspond to the 3 options
> for the vm.drop_caches sysctl knob
> 
> [quote]
> To free pagecache, use:
> 
>   echo 1 > /proc/sys/vm/drop_caches
> 
> To free dentries and inodes, use:
> 
>   echo 2 > /proc/sys/vm/drop_caches
> 
> To free pagecache, dentries and inodes, use:
> 
>   echo 3 > /proc/sys/vm/drop_caches
> 
> Because writing to this file is a nondestructive
> operation and dirty objects are not freeable, the
> user should run sync(1) first.
> [/quote]
> 
> IOW, by 'slab' you mean dentries and inodes ?
> 
> > +
> > +##
> > +# @balloon_drop_cache:
> > +#
> > +# Request the vm to drop its cache.
> > +#
> > +# @value: the type of cache want vm to drop
> > +#
> > +# Returns: Nothing on success
> > +#          If the balloon driver is enabled but not functional because the KVM
> > +#            kernel module cannot support it, KvmMissingCap
> > +#          If no balloon device is present, DeviceNotActive
> > +#
> > +# Notes: This command just issues a request to the guest.  When it returns,
> > +#        the drop cache operation may not have completed.  A guest can drop its
> > +#        cache independent of this command.
> > +#
> > +# Since: 2.7.0
> > +##
> > +{ 'command': 'balloon_drop_cache', 'data': {'value': 'DropCacheType'} }
> 
> Also, as noted in the man page quote above, it is recommended to call
> sync() to minimise dirty pages. Should we have a way to request a sync
> as part of this monitor command?
> 
> More generally, it feels like this is taking us down a path towards
> actively managing the guest kernel VM from the host. Is this really
> a path we want to be going down, given that it's going to take us into
> increasingly non-portable concepts which are potentially different for
> each guest OS kernel?  Is this drop caches feature at all applicable
> to Windows, OS-X, *BSD guest OS impls of the balloon driver? If it
> is applicable, are the 3 fixed constants you've defined at all useful
> to those other OSes?
> 
> I'm wary of us taking a design path which is so Linux specific it
> isn't useful elsewhere. IOW, just because we can do this, doesn't mean
> we should do this...

Also, I'm wondering about the overall performance benefit of dropping
guest cache(s). Increasing the amount of free memory pages may have
a benefit in terms of reducing the data that needs to be migrated, but
it comes with a penalty: if the guest OS needs that data, it will
have to repopulate the caches.

If the guest is merely reading those cached pages, it isn't going to
cause any problem with the chances of convergence of migration, as clean
pages will be copied only once during migration. IOW, dropping clean
pages will reduce the total memory that needs to be copied, but won't
have a notable effect on convergence of live migration. Cache pages that
are dirty will potentially affect live migration convergence, if the
guest OS re-dirties the pages before they're flushed to storage. Dropping
caches won't help in this respect though, since you can't drop dirty
pages. At the same time it will have a potentially significant negative
penalty on guest OS performance by forcing the guest to re-populate the
cache from slow underlying storage.  I don't think there's enough info
exposed by KVM about the guest OS to be able to figure out what kind of
situation we're in wrt the guest OS cache usage.

Based on this I think it is hard to see how a host mgmt app can make a
well-informed decision about whether telling the guest OS to drop caches
is a positive thing overall. In fact I think it most likely that a mgmt
app would take a pessimistic view and not use this functionality, because
there's no clearly positive impact on migration convergence and a high
likelihood of negatively impacting guest performance.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Qemu-devel] [QEMU 3/7] Add the hmp and qmp interface for dropping cache
  2016-06-13 10:50   ` Daniel P. Berrange
  2016-06-13 11:06     ` Daniel P. Berrange
@ 2016-06-13 11:41     ` Paolo Bonzini
  2016-06-13 14:14         ` Li, Liang Z
  2016-06-13 13:50       ` Li, Liang Z
  2 siblings, 1 reply; 60+ messages in thread
From: Paolo Bonzini @ 2016-06-13 11:41 UTC (permalink / raw)
  To: Daniel P. Berrange, Liang Li
  Cc: qemu-devel, kvm, quintela, mst, dgilbert, lcapitulino, amit.shah



On 13/06/2016 12:50, Daniel P. Berrange wrote:
> More generally, it feels like this is taking us down a path towards
> actively managing the guest kernel VM from the host. Is this really
> a path we want to be going down, given that it's going to take us into
> increasingly non-portable concepts which are potentially different for
> each guest OS kernel?  Is this drop caches feature at all applicable
> to Windows, OS-X, *BSD guest OS impls of the balloon driver? If it
> is applicable, are the 3 fixed constants you've defined at all useful
> to those other OSes?
> 
> I'm wary of us taking a design path which is so Linux specific it
> isn't useful elsewhere. IOW, just because we can do this, doesn't mean
> we should do this...

I agree.  And if anything, this should be handled through the guest agent.

Paolo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [Qemu-devel] [QEMU 3/7] Add the hmp and qmp interface for dropping cache
  2016-06-13 10:50   ` Daniel P. Berrange
@ 2016-06-13 13:50       ` Li, Liang Z
  2016-06-13 11:41     ` Paolo Bonzini
  2016-06-13 13:50       ` Li, Liang Z
  2 siblings, 0 replies; 60+ messages in thread
From: Li, Liang Z @ 2016-06-13 13:50 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: qemu-devel, kvm, quintela, mst, dgilbert, lcapitulino, amit.shah,
	pbonzini

> Because writing to this file is a nondestructive operation and dirty objects are
> not freeable, the user should run sync(1) first.
> [/quote]
> 
> IOW, by 'slab' you mean dentries and inodes ?
> 
Yes.

> > +##
> > +{ 'command': 'balloon_drop_cache', 'data': {'value': 'DropCacheType'}
> > +}
> 
> Also, as noted in the man page quote above, it is recommended to call
> sync() to minimise dirty pages. Should we have a way to request a sync as
> part of this monitor command?
> 
> More generally, it feels like this is taking us down a path towards actively
> managing the guest kernel VM from the host. Is this really a path we want to
> be going down, given that it's going to take us into increasingly non-portable
> concepts which are potentially different for each guest OS kernel?  Is this
> drop caches feature at all applicable to Windows, OS-X, *BSD guest OS impls
> of the balloon driver? If it is applicable, are the 3 fixed constants you've

No. 

> defined at all useful to those other OS ?
> 

Maybe they are not.
I agree that they are too Linux specific, and I did more than needed.
Actually, I just want to drop the clean page cache; doing more than that is
too heavy and not good for performance.

> I'm wary of us taking a design path which is so Linux specific it isn't useful
> elsewhere. IOW, just because we can do this, doesn't mean we should do
> this...
> 

Agree.

Thanks!

Liang
> Regards,
> Daniel
> --
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 60+ messages in thread


* RE: [Qemu-devel] [QEMU 3/7] Add the hmp and qmp interface for dropping cache
  2016-06-13 11:06     ` Daniel P. Berrange
@ 2016-06-13 14:12         ` Li, Liang Z
  0 siblings, 0 replies; 60+ messages in thread
From: Li, Liang Z @ 2016-06-13 14:12 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: kvm, mst, qemu-devel, quintela, dgilbert, lcapitulino, amit.shah,
	pbonzini

> On Mon, Jun 13, 2016 at 11:50:08AM +0100, Daniel P. Berrange wrote:
> > On Mon, Jun 13, 2016 at 06:16:45PM +0800, Liang Li wrote:
> > > Add the hmp and qmp interface to drop vm's page cache, users can
> > > control the type of cache they want vm to drop.
> > >
> > > Signed-off-by: Liang Li <liang.z.li@intel.com>
> > > ---
> > >  balloon.c        | 19 +++++++++++++++++++
> > >  hmp-commands.hx  | 15 +++++++++++++++
> > >  hmp.c            | 22 ++++++++++++++++++++++
> > >  hmp.h            |  3 +++
> > >  monitor.c        | 18 ++++++++++++++++++
> > >  qapi-schema.json | 35 +++++++++++++++++++++++++++++++++++
> > >  qmp-commands.hx  | 23 +++++++++++++++++++++++
> > >  7 files changed, 135 insertions(+)
> >
> > > diff --git a/qapi-schema.json b/qapi-schema.json index
> > > 8483bdf..117f70a 100644
> > > --- a/qapi-schema.json
> > > +++ b/qapi-schema.json
> > > @@ -1655,6 +1655,41 @@
> > >  { 'command': 'balloon', 'data': {'value': 'int'} }
> > >
> > >  ##
> > > +# @DropCacheType
> > > +#
> > > +# Cache types enumeration
> > > +#
> > > +# @clean: Drop the clean page cache.
> > > +#
> > > +# @slab: Drop the slab cache.
> > > +#
> > > +# @all: Drop both the clean and the slab cache.
> > > +#
> > > +# Since: 2.7
> > > +##
> > > +{ 'enum': 'DropCacheType', 'data': ['clean', 'slab', 'all'] }
> >
> > Presumably these constants are corresponding to the 3 options for
> > vm.drop_caches sysctl knob
> >
> > [quote]
> > To free pagecache, use:
> >
> >   echo 1 > /proc/sys/vm/drop_caches
> >
> > To free dentries and inodes, use:
> >
> >   echo 2 > /proc/sys/vm/drop_caches
> >
> > To free pagecache, dentries and inodes, use:
> >
> >   echo 3 > /proc/sys/vm/drop_caches
> >
> > Because writing to this file is a nondestructive operation and dirty
> > objects are not freeable, the user should run sync(1) first.
> > [/quote]
> >
> > IOW, by 'slab' you mean dentries and inodes ?
> >
> > > +
> > > +##
> > > +# @balloon_drop_cache:
> > > +#
> > > +# Request the vm to drop its cache.
> > > +#
> > > +# @value: the type of cache want vm to drop # # Returns: Nothing on
> > > +success
> > > +#          If the balloon driver is enabled but not functional because the
> KVM
> > > +#            kernel module cannot support it, KvmMissingCap
> > > +#          If no balloon device is present, DeviceNotActive
> > > +#
> > > +# Notes: This command just issues a request to the guest.  When it
> returns,
> > > +#        the drop cache operation may not have completed.  A guest can
> drop its
> > > +#        cache independent of this command.
> > > +#
> > > +# Since: 2.7.0
> > > +##
> > > +{ 'command': 'balloon_drop_cache', 'data': {'value':
> > > +'DropCacheType'} }
> >
> > Also, as noted in the man page quote above, it is recommended to call
> > sync() to minimise dirty pages. Should we have a way to request a sync
> > as part of this monitor command.
> >
> > More generally, it feels like this is taking us down a path towards
> > actively managing the guest kernel VM from the host. Is this really a
> > path we want to be going down, given that it's going to take us into
> > increasingly non-portable concepts which are potentially different for
> > each guest OS kernel?  Is this drop caches feature at all applicable
> > to Windows, OS-X, *BSD guest OS impls of the balloon driver? If it is
> > applicable, are the 3 fixed constants you've defined at all useful to
> > those other OSes?
> >
> > I'm wary of us taking a design path which is so Linux specific it
> > isn't useful elsewhere. IOW, just because we can do this, doesn't mean
> > we should do this...
> 
> Also, I'm wondering about the overall performance benefit of dropping guest
> cache(s). Increasing the amount of free memory pages may have a benefit in
> terms of reducing data that needs to be migrated, but it comes with a
> penalty that if the guest OS needs that data, it will have to repopulate the
> caches.
> 
> If the guest is merely reading those cached pages, it isn't going to cause any
> problem with the chances of convergence of migration, as clean pages will be
> copied only once during migration. IOW, dropping clean pages will reduce the
> total memory that needs to be copied, but won't have a notable effect on
> convergence of live migration. Cache pages that are dirty will potentially
> affect live migration convergence, if the guest OS re-dirties the pages before
> they're flushed to storage. Dropping caches won't help in this respect though,
> since you can't drop dirty pages. At the same time it will have a potentially
> significant negative penalty on guest OS performance by forcing the guest to
> re-populate the cache from slow underlying storage.  I don't think there's
> enough info exposed by KVM about the guest OS to be able to figure out
> what kind of situation we're in wrt the guest OS cache usage.
> 
> Based on this I think it is hard to see how a host mgmt app can make a
> well-informed decision about whether telling the guest OS to drop caches is a
> positive thing overall. In fact I think it most likely that a mgmt app would take
> a pessimistic view and not use this functionality, because there's no clearly
> positive impact on migration convergence and a high likelihood of negatively
> impacting guest performance.
> 
> Regards,
> Daniel

Thanks for your detailed analysis.
I did some tests and found that dropping the clean cache can speed up live
migration, while dropping the dirty page cache can make it slower.
The reason I added more options than just the clean cache was for completeness,
and they are too Linux specific.

How about just dropping the clean page cache? Is it still too Linux specific?

Liang

> --
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Qemu-devel] [QEMU 3/7] Add the hmp and qmp interface for dropping cache
@ 2016-06-13 14:12         ` Li, Liang Z
  0 siblings, 0 replies; 60+ messages in thread
From: Li, Liang Z @ 2016-06-13 14:12 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: kvm, mst, qemu-devel, quintela, dgilbert, lcapitulino, amit.shah,
	pbonzini

> On Mon, Jun 13, 2016 at 11:50:08AM +0100, Daniel P. Berrange wrote:
> > On Mon, Jun 13, 2016 at 06:16:45PM +0800, Liang Li wrote:
> > > Add the hmp and qmp interface to drop vm's page cache, users can
> > > control the type of cache they want vm to drop.
> > >
> > > Signed-off-by: Liang Li <liang.z.li@intel.com>
> > > ---
> > >  balloon.c        | 19 +++++++++++++++++++
> > >  hmp-commands.hx  | 15 +++++++++++++++
> > >  hmp.c            | 22 ++++++++++++++++++++++
> > >  hmp.h            |  3 +++
> > >  monitor.c        | 18 ++++++++++++++++++
> > >  qapi-schema.json | 35 +++++++++++++++++++++++++++++++++++
> > >  qmp-commands.hx  | 23 +++++++++++++++++++++++
> > >  7 files changed, 135 insertions(+)
> >
> > > diff --git a/qapi-schema.json b/qapi-schema.json index
> > > 8483bdf..117f70a 100644
> > > --- a/qapi-schema.json
> > > +++ b/qapi-schema.json
> > > @@ -1655,6 +1655,41 @@
> > >  { 'command': 'balloon', 'data': {'value': 'int'} }
> > >
> > >  ##
> > > +# @DropCacheType
> > > +#
> > > +# Cache types enumeration
> > > +#
> > > +# @clean: Drop the clean page cache.
> > > +#
> > > +# @slab: Drop the slab cache.
> > > +#
> > > +# @all: Drop both the clean and the slab cache.
> > > +#
> > > +# Since: 2.7
> > > +##
> > > +{ 'enum': 'DropCacheType', 'data': ['clean', 'slab', 'all'] }
> >
> > Presumably these constants are corresponding to the 3 options for
> > vm.drop_caches sysctl knob
> >
> > [quote]
> > To free pagecache, use:
> >
> >   echo 1 > /proc/sys/vm/drop_caches
> >
> > To free dentries and inodes, use:
> >
> >   echo 2 > /proc/sys/vm/drop_caches
> >
> > To free pagecache, dentries and inodes, use:
> >
> >   echo 3 > /proc/sys/vm/drop_caches
> >
> > Because writing to this file is a nondestructive operation and dirty
> > objects are not freeable, the user should run sync(1) first.
> > [/quote]
> >
> > IOW, by 'slab' you mean dentries and inodes ?
> >
> > > +
> > > +##
> > > +# @balloon_drop_cache:
> > > +#
> > > +# Request the vm to drop its cache.
> > > +#
> > > +# @value: the type of cache want vm to drop # # Returns: Nothing on
> > > +success
> > > +#          If the balloon driver is enabled but not functional because the
> KVM
> > > +#            kernel module cannot support it, KvmMissingCap
> > > +#          If no balloon device is present, DeviceNotActive
> > > +#
> > > +# Notes: This command just issues a request to the guest.  When it
> returns,
> > > +#        the drop cache operation may not have completed.  A guest can
> drop its
> > > +#        cache independent of this command.
> > > +#
> > > +# Since: 2.7.0
> > > +##
> > > +{ 'command': 'balloon_drop_cache', 'data': {'value':
> > > +'DropCacheType'} }
> >
> > Also, as noted in the man page quote above, it is recommended to call
> > sync() to minimise dirty pages. Should we have a way to request a sync
> > as part of this monitor command.
> >
> > More generally, it feels like this is taking as down a path towards
> > actively managing the guest kernel VM from the host. Is this really a
> > path we want to be going down, given that its going to take us into
> > increasing non-portable concepts which are potentially different for
> > each guest OS kernel.  Is this drop caches feature at all applicable
> > to Windows, OS-X, *BSD guest OS impls of the balloon driver ? If it is
> > applicable, are the 3 fixed constants you've defined at all useful to
> > those other OS ?
> >
> > I'm warying of us taking a design path which is so Linux specific it
> > isn't useful elsewhere. IOW, just because we can do this, doesn't mean
> > we should do this...
> 
> Also, I'm wondering about the overall performance benefit of dropping guest
> cache(s). Increasing the amount of free memory pages may have a benefit in
> terms of reducing data that needs to be migrated, but it comes with a
> penalty that if the guest OS needs that data, it will have to repopulate the
> caches.
> 
> If the guest is merely reading those cached pages, it isn't going to cause any
> problem with chances of convergance of migration, as clean pages will be
> copied only once during migration. IOW, dropping clean pages will reduce the
> total memory that needs to be copied, but won't have notable affect on
> convergance of live migration. Cache pages that are dirty will potentially
> affect live migration convergance, if the guest OS re-dirties the pages before
> they're flushed to storage. Dropping caches won't help in this respect though,
> since you can't drop dirty pages. At the same time it will have a potentially
> significant negative penalty on guest OS performance by forcing the guest to
> re-populate the cache from slow underlying storage.  I don't think there's
> enough info exposed by KVM about the guest OS to be able to figure out
> what kind of situation we're in wrt the guest OS cache usage.
> 
> Based on this I think it is hard to see how a host mgmt app can make a well
> informed decision about whether telling the guest OS to drop caches is a
> positive thing overall. In fact I think most likely is that a mgmt app would take
> a pessimistic view and not use this functionality, because there's no clearly
> positive impact on migration convergance and high liklihood of negatively
> impacting guest performance.
> 
> Regards,
> Daniel

Thanks for your detailed analysis.
I did some tests and found that dropping the clean page cache can speed up live
migration, while dropping the dirty page cache makes it slower.
The reason I added more options beyond the clean cache was for completeness,
but I agree that makes it too Linux specific.

How about just dropping the clean page cache? Is that still too Linux specific?
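
For reference, dropping only the clean page cache inside a Linux guest amounts
to the following (a sketch assuming a Linux guest with root access; value 2
would instead drop the slab caches, i.e. dentries and inodes, and 3 both):

```shell
# Flush dirty pages first; drop_caches only frees clean objects.
sync
# 1 = clean page cache only; 2 = slab (dentries/inodes); 3 = both.
echo 1 > /proc/sys/vm/drop_caches
```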

Liang

> --
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [Qemu-devel] [QEMU 3/7] Add the hmp and qmp interface for dropping cache
  2016-06-13 11:41     ` Paolo Bonzini
@ 2016-06-13 14:14         ` Li, Liang Z
  0 siblings, 0 replies; 60+ messages in thread
From: Li, Liang Z @ 2016-06-13 14:14 UTC (permalink / raw)
  To: Paolo Bonzini, Daniel P. Berrange
  Cc: qemu-devel, kvm, quintela, mst, dgilbert, lcapitulino, amit.shah

> 
> On 13/06/2016 12:50, Daniel P. Berrange wrote:
> > More generally, it feels like this is taking as down a path towards
> > actively managing the guest kernel VM from the host. Is this really a
> > path we want to be going down, given that its going to take us into
> > increasing non-portable concepts which are potentially different for
> > each guest OS kernel.  Is this drop caches feature at all applicable
> > to Windows, OS-X, *BSD guest OS impls of the balloon driver ? If it is
> > applicable, are the 3 fixed constants you've defined at all useful to
> > those other OS ?
> >
> > I'm warying of us taking a design path which is so Linux specific it
> > isn't useful elsewhere. IOW, just because we can do this, doesn't mean
> > we should do this...
> 
> I agree.  And if anything, this should be handled through the guest agent.
> 
> Paolo

Guest agent is a good choice. Thanks!

Liang
	


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Qemu-devel] [QEMU 3/7] Add the hmp and qmp interface for dropping cache
  2016-06-13 13:50       ` Li, Liang Z
  (?)
@ 2016-06-13 15:09       ` Dr. David Alan Gilbert
  2016-06-14  1:15           ` Li, Liang Z
  2016-06-17  1:35           ` Li, Liang Z
  -1 siblings, 2 replies; 60+ messages in thread
From: Dr. David Alan Gilbert @ 2016-06-13 15:09 UTC (permalink / raw)
  To: Li, Liang Z
  Cc: Daniel P. Berrange, qemu-devel, kvm, quintela, mst, lcapitulino,
	amit.shah, pbonzini

* Li, Liang Z (liang.z.li@intel.com) wrote:
> > Because writing to this file is a nondestructive operation and dirty objects are
> > not freeable, the user should run sync(1) first.
> > [/quote]
> > 
> > IOW, by 'slab' you mean dentries and inodes ?
> > 
> Yes.
> 
> > > +##
> > > +{ 'command': 'balloon_drop_cache', 'data': {'value': 'DropCacheType'}
> > > +}
> > 
> > Also, as noted in the man page quote above, it is recommended to call
> > sync() to minimise dirty pages. Should we have a way to request a sync as
> > part of this monitor command.
> > 
> > More generally, it feels like this is taking as down a path towards actively
> > managing the guest kernel VM from the host. Is this really a path we want to
> > be going down, given that its going to take us into increasing non-portable
> > concepts which are potentially different for each guest OS kernel.  Is this
> > drop caches feature at all applicable to Windows, OS-X, *BSD guest OS impls
> > of the balloon driver ? If it is applicable, are the 3 fixed constants you've
> 
> No. 
> 
> > defined at all useful to those other OS ?
> > 
> 
> Maybe they are not.  
> I agree that there are too Linux specific.  And I did more than needed.
> Actually, I just want to drop the clean cache, do more than that is too heavy
> and no good for performance.
> 
> > I'm warying of us taking a design path which is so Linux specific it isn't useful
> > elsewhere. IOW, just because we can do this, doesn't mean we should do
> > this...
> > 
> 
> Agree.

I can see an argument for giving the guest a hint about what's going on and letting
the guest decide what it's going to do - so telling the guest that a migration
is happening and you'd like it to make the hosts life easy seems reasonable
and it doesn't make any guest OS assumptions.

Dave

> 
> Thanks!
> 
> Liang
> > Regards,
> > Daniel
> > --
> > |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> > |: http://libvirt.org              -o-             http://virt-manager.org :|
> > |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> > |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [Qemu-devel] [QEMU 3/7] Add the hmp and qmp interface for dropping cache
  2016-06-13 15:09       ` Dr. David Alan Gilbert
@ 2016-06-14  1:15           ` Li, Liang Z
  2016-06-17  1:35           ` Li, Liang Z
  1 sibling, 0 replies; 60+ messages in thread
From: Li, Liang Z @ 2016-06-14  1:15 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Daniel P. Berrange, qemu-devel, kvm, quintela, mst, lcapitulino,
	amit.shah, pbonzini

> * Li, Liang Z (liang.z.li@intel.com) wrote:
> > > Because writing to this file is a nondestructive operation and dirty
> > > objects are not freeable, the user should run sync(1) first.
> > > [/quote]
> > >
> > > IOW, by 'slab' you mean dentries and inodes ?
> > >
> > Yes.
> >
> > > > +##
> > > > +{ 'command': 'balloon_drop_cache', 'data': {'value':
> > > > +'DropCacheType'} }
> > >
> > > Also, as noted in the man page quote above, it is recommended to
> > > call
> > > sync() to minimise dirty pages. Should we have a way to request a
> > > sync as part of this monitor command.
> > >
> > > More generally, it feels like this is taking as down a path towards
> > > actively managing the guest kernel VM from the host. Is this really
> > > a path we want to be going down, given that its going to take us
> > > into increasing non-portable concepts which are potentially
> > > different for each guest OS kernel.  Is this drop caches feature at
> > > all applicable to Windows, OS-X, *BSD guest OS impls of the balloon
> > > driver ? If it is applicable, are the 3 fixed constants you've
> >
> > No.
> >
> > > defined at all useful to those other OS ?
> > >
> >
> > Maybe they are not.
> > I agree that there are too Linux specific.  And I did more than needed.
> > Actually, I just want to drop the clean cache, do more than that is
> > too heavy and no good for performance.
> >
> > > I'm warying of us taking a design path which is so Linux specific it
> > > isn't useful elsewhere. IOW, just because we can do this, doesn't
> > > mean we should do this...
> > >
> >
> > Agree.
> 
> I can see an argument for giving the guest a hint about what's going on and
> letting the guest decide what it's going to do - so telling the guest that a
> migration is happening and you'd like it to make the hosts life easy seems
> reasonable and it doesn't make any guest OS assumptions.
> 
> Dave
> 

It seems the approach I used in the previous patches is more acceptable.

Thanks!
Liang

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [QEMU 1/7] balloon: speed up inflating & deflating process
  2016-06-13 10:16   ` [Qemu-devel] " Liang Li
@ 2016-06-14 11:37     ` Thomas Huth
  -1 siblings, 0 replies; 60+ messages in thread
From: Thomas Huth @ 2016-06-14 11:37 UTC (permalink / raw)
  To: Liang Li, qemu-devel
  Cc: kvm, mst, lcapitulino, pbonzini, quintela, amit.shah, dgilbert

On 13.06.2016 12:16, Liang Li wrote:
> The implementation of the current virtio-balloon is not very efficient,
> Bellow is test result of time spends on inflating the balloon to 3GB of
> a 4GB idle guest:
> 
> a. allocating pages (6.5%, 103ms)
> b. sending PFNs to host (68.3%, 787ms)
> c. address translation (6.1%, 96ms)
> d. madvise (19%, 300ms)
> 
> It takes about 1577ms for the whole inflating process to complete. The
> test shows that the bottle neck is the stage b and stage d.
> 
> If using a bitmap to send the page info instead of the PFNs, we can
> reduce the overhead spends on stage b quite a lot. Furthermore, it's
> possible to do the address translation and do the madvise with a bulk
> of pages, instead of the current page per page way, so the overhead of
> stage c and stage d can also be reduced a lot.
> 
> This patch is the QEMU side implementation which is intended to speed
> up the inflating & deflating process by adding a new feature to the
> virtio-balloon device. And now, inflating the balloon to 3GB of a 4GB
> idle guest only takes 210ms, it's about 8 times as fast as before.
> 
> TODO: optimize stage a by allocating/freeing a chunk of pages instead
> of a single page at a time.
> 
> Signed-off-by: Liang Li <liang.z.li@intel.com>
> ---
>  hw/virtio/virtio-balloon.c                      | 159 ++++++++++++++++++++----
>  include/standard-headers/linux/virtio_balloon.h |   1 +
>  2 files changed, 139 insertions(+), 21 deletions(-)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 8c15e09..8cf74c2 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -47,6 +47,76 @@ static void balloon_page(void *addr, int deflate)
>  #endif
>  }
>  
> +static void do_balloon_bulk_pages(ram_addr_t base_pfn, int page_shift,
> +                                  unsigned long len, bool deflate)
> +{
> +    ram_addr_t size, processed, chunk, base;
> +    void *addr;
> +    MemoryRegionSection section = {.mr = NULL};
> +
> +    size = (len << page_shift);
> +    base = (base_pfn << page_shift);
> +
> +    for (processed = 0; processed < size; processed += chunk) {
> +        chunk = size - processed;
> +        while (chunk >= TARGET_PAGE_SIZE) {
> +            section = memory_region_find(get_system_memory(),
> +                                         base + processed, chunk);
> +            if (!section.mr) {
> +                chunk = QEMU_ALIGN_DOWN(chunk / 2, TARGET_PAGE_SIZE);
> +            } else {
> +                break;
> +            }
> +        }
> +
> +        if (section.mr &&
> +            (int128_nz(section.size) && memory_region_is_ram(section.mr))) {
> +            addr = section.offset_within_region +
> +                   memory_region_get_ram_ptr(section.mr);
> +            qemu_madvise(addr, chunk,
> +                         deflate ? QEMU_MADV_WILLNEED : QEMU_MADV_DONTNEED);
> +        } else {
> +            fprintf(stderr, "can't find the chunk, skip\n");

Please try to avoid new fprintf(stderr, ...) in the QEMU sources.
Use error_report(...) or in this case maybe rather
qemu_log_mask(LOG_GUEST_ERROR, ...) instead, and try to use a more
reasonable error message (e.g. that it is clear that the error happened
in the balloon code).

> +            chunk = TARGET_PAGE_SIZE;
> +        }
> +    }
> +}
> +
> +static void balloon_bulk_pages(ram_addr_t base_pfn, unsigned long *bitmap,
> +                               unsigned long len, int page_shift, bool deflate)
> +{
> +#if defined(__linux__)

Why do you need this #if here?

> +    unsigned long end  = len * 8;
> +    unsigned long current = 0;
> +
> +    if (!qemu_balloon_is_inhibited() && (!kvm_enabled() ||
> +                                         kvm_has_sync_mmu())) {
> +        while (current < end) {
> +            unsigned long one = find_next_bit(bitmap, end, current);
> +
> +            if (one < end) {
> +                unsigned long zero = find_next_zero_bit(bitmap, end, one + 1);
> +                unsigned long page_length;
> +
> +                if (zero >= end) {
> +                    page_length = end - one;
> +                } else {
> +                    page_length = zero - one;
> +                }
> +
> +                if (page_length) {
> +                    do_balloon_bulk_pages(base_pfn + one, page_shift,
> +                                          page_length, deflate);
> +                }
> +                current = one + page_length;
> +            } else {
> +                current = one;
> +            }
> +        }
> +    }
> +#endif
> +}

 Thomas


^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [QEMU 1/7] balloon: speed up inflating & deflating process
  2016-06-14 11:37     ` [Qemu-devel] " Thomas Huth
@ 2016-06-14 14:22       ` Li, Liang Z
  -1 siblings, 0 replies; 60+ messages in thread
From: Li, Liang Z @ 2016-06-14 14:22 UTC (permalink / raw)
  To: Thomas Huth, qemu-devel
  Cc: kvm, mst, lcapitulino, pbonzini, quintela, amit.shah, dgilbert

> Subject: Re: [QEMU 1/7] balloon: speed up inflating & deflating process
> 
> On 13.06.2016 12:16, Liang Li wrote:
> > The implementation of the current virtio-balloon is not very
> > efficient, Bellow is test result of time spends on inflating the
> > balloon to 3GB of a 4GB idle guest:
> >
> > a. allocating pages (6.5%, 103ms)
> > b. sending PFNs to host (68.3%, 787ms) c. address translation (6.1%,
> > 96ms) d. madvise (19%, 300ms)
> >
> > It takes about 1577ms for the whole inflating process to complete. The
> > test shows that the bottle neck is the stage b and stage d.
> >
> > If using a bitmap to send the page info instead of the PFNs, we can
> > reduce the overhead spends on stage b quite a lot. Furthermore, it's
> > possible to do the address translation and do the madvise with a bulk
> > of pages, instead of the current page per page way, so the overhead of
> > stage c and stage d can also be reduced a lot.
> >
> > This patch is the QEMU side implementation which is intended to speed
> > up the inflating & deflating process by adding a new feature to the
> > virtio-balloon device. And now, inflating the balloon to 3GB of a 4GB
> > idle guest only takes 210ms, it's about 8 times as fast as before.
> >
> > TODO: optimize stage a by allocating/freeing a chunk of pages instead
> > of a single page at a time.
> >
> > Signed-off-by: Liang Li <liang.z.li@intel.com>
> > ---
> >  hw/virtio/virtio-balloon.c                      | 159 ++++++++++++++++++++----
> >  include/standard-headers/linux/virtio_balloon.h |   1 +
> >  2 files changed, 139 insertions(+), 21 deletions(-)
> >
> > diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> > index 8c15e09..8cf74c2 100644
> > --- a/hw/virtio/virtio-balloon.c
> > +++ b/hw/virtio/virtio-balloon.c
> > @@ -47,6 +47,76 @@ static void balloon_page(void *addr, int deflate)
> > #endif  }
> >
> > +static void do_balloon_bulk_pages(ram_addr_t base_pfn, int page_shift,
> > +                                  unsigned long len, bool deflate) {
> > +    ram_addr_t size, processed, chunk, base;
> > +    void *addr;
> > +    MemoryRegionSection section = {.mr = NULL};
> > +
> > +    size = (len << page_shift);
> > +    base = (base_pfn << page_shift);
> > +
> > +    for (processed = 0; processed < size; processed += chunk) {
> > +        chunk = size - processed;
> > +        while (chunk >= TARGET_PAGE_SIZE) {
> > +            section = memory_region_find(get_system_memory(),
> > +                                         base + processed, chunk);
> > +            if (!section.mr) {
> > +                chunk = QEMU_ALIGN_DOWN(chunk / 2, TARGET_PAGE_SIZE);
> > +            } else {
> > +                break;
> > +            }
> > +        }
> > +
> > +        if (section.mr &&
> > +            (int128_nz(section.size) && memory_region_is_ram(section.mr))) {
> > +            addr = section.offset_within_region +
> > +                   memory_region_get_ram_ptr(section.mr);
> > +            qemu_madvise(addr, chunk,
> > +                         deflate ? QEMU_MADV_WILLNEED :
> QEMU_MADV_DONTNEED);
> > +        } else {
> > +            fprintf(stderr, "can't find the chunk, skip\n");
> 
> Please try to avoid new fprintf(stderr, ...) in the QEMU sources.
> Use error_report(...) or in this case maybe rather
> qemu_log_mask(LOG_GUEST_ERROR, ...) instead, and try to use a more
> reasonable error message (e.g. that it is clear that the error happened in the
> balloon code).
> 

Indeed, the error message is not good; I will change it in the next version.

> > +            chunk = TARGET_PAGE_SIZE;
> > +        }
> > +    }
> > +}
> > +
> > +static void balloon_bulk_pages(ram_addr_t base_pfn, unsigned long
> *bitmap,
> > +                               unsigned long len, int page_shift,
> > +bool deflate) { #if defined(__linux__)
> 
> Why do you need this #if here?
> 

Oh, it was wrong to add the '#if' here; I will remove it.

Thanks a lot!

Liang

^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [QEMU 1/7] balloon: speed up inflating & deflating process
  2016-06-14 14:22       ` [Qemu-devel] " Li, Liang Z
@ 2016-06-14 14:41         ` Li, Liang Z
  -1 siblings, 0 replies; 60+ messages in thread
From: Li, Liang Z @ 2016-06-14 14:41 UTC (permalink / raw)
  To: Li, Liang Z, Thomas Huth, qemu-devel
  Cc: kvm, mst, lcapitulino, pbonzini, quintela, amit.shah, dgilbert

> > On 13.06.2016 12:16, Liang Li wrote:
> > > The implementation of the current virtio-balloon is not very
> > > efficient. Below is the test result of time spent on inflating
> > > the balloon to 3GB of a 4GB idle guest:
> > >
> > > a. allocating pages (6.5%, 103ms)
> > > b. sending PFNs to host (68.3%, 787ms)
> > > c. address translation (6.1%, 96ms)
> > > d. madvise (19%, 300ms)
> > >
> > > It takes about 1577ms for the whole inflating process to complete.
> > > The test shows that the bottleneck is stage b and stage d.
> > >
> > > If using a bitmap to send the page info instead of the PFNs, we can
> > > reduce the overhead spent on stage b quite a lot. Furthermore, it's
> > > possible to do the address translation and the madvise with a bulk
> > > of pages, instead of the current page-by-page way, so the overhead
> > > of stage c and stage d can also be reduced a lot.
> > >
> > > This patch is the QEMU side implementation which is intended to
> > > speed up the inflating & deflating process by adding a new feature
> > > to the virtio-balloon device. And now, inflating the balloon to 3GB
> > > of a 4GB idle guest takes only 210ms, about 8 times as fast as before.
> > >
> > > TODO: optimize stage a by allocating/freeing a chunk of pages
> > > instead of a single page at a time.
> > >
> > > Signed-off-by: Liang Li <liang.z.li@intel.com>
> > > ---
> > >  hw/virtio/virtio-balloon.c                      | 159 ++++++++++++++++++++----
> > >  include/standard-headers/linux/virtio_balloon.h |   1 +
> > >  2 files changed, 139 insertions(+), 21 deletions(-)
> > >
> > > diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> > > index 8c15e09..8cf74c2 100644
> > > --- a/hw/virtio/virtio-balloon.c
> > > +++ b/hw/virtio/virtio-balloon.c
> > > @@ -47,6 +47,76 @@ static void balloon_page(void *addr, int deflate)
> > > #endif  }
> > >
> > > +static void do_balloon_bulk_pages(ram_addr_t base_pfn, int
> page_shift,
> > > +                                  unsigned long len, bool deflate) {
> > > +    ram_addr_t size, processed, chunk, base;
> > > +    void *addr;
> > > +    MemoryRegionSection section = {.mr = NULL};
> > > +
> > > +    size = (len << page_shift);
> > > +    base = (base_pfn << page_shift);
> > > +
> > > +    for (processed = 0; processed < size; processed += chunk) {
> > > +        chunk = size - processed;
> > > +        while (chunk >= TARGET_PAGE_SIZE) {
> > > +            section = memory_region_find(get_system_memory(),
> > > +                                         base + processed, chunk);
> > > +            if (!section.mr) {
> > > +                chunk = QEMU_ALIGN_DOWN(chunk / 2, TARGET_PAGE_SIZE);
> > > +            } else {
> > > +                break;
> > > +            }
> > > +        }
> > > +
> > > +        if (section.mr &&
> > > +            (int128_nz(section.size) && memory_region_is_ram(section.mr)))
> {
> > > +            addr = section.offset_within_region +
> > > +                   memory_region_get_ram_ptr(section.mr);
> > > +            qemu_madvise(addr, chunk,
> > > +                         deflate ? QEMU_MADV_WILLNEED :
> > QEMU_MADV_DONTNEED);
> > > +        } else {
> > > +            fprintf(stderr, "can't find the chunk, skip\n");
> >
> > Please try to avoid new fprintf(stderr, ...) in the QEMU sources.
> > Use error_report(...) or in this case maybe rather
> > qemu_log_mask(LOG_GUEST_ERROR, ...) instead, and try to use a more
> > reasonable error message (e.g. that it is clear that the error
> > happened in the balloon code).
> >
> 
> Indeed, the error message is no good, will change in next version.
> 
> > > +            chunk = TARGET_PAGE_SIZE;
> > > +        }
> > > +    }
> > > +}
> > > +
> > > +static void balloon_bulk_pages(ram_addr_t base_pfn, unsigned long
> > *bitmap,
> > > +                               unsigned long len, int page_shift,
> > > +bool deflate) { #if defined(__linux__)
> >
> > Why do you need this #if here?
> >
> 
> Ooh,  it is wrong to add the '#if' here, will remove.
No, it is needed; it just follows the code in balloon_page().
Only Linux supports madvise().

Liang


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [QEMU 1/7] balloon: speed up inflating & deflating process
  2016-06-14 14:41         ` [Qemu-devel] " Li, Liang Z
@ 2016-06-14 15:33           ` Thomas Huth
  -1 siblings, 0 replies; 60+ messages in thread
From: Thomas Huth @ 2016-06-14 15:33 UTC (permalink / raw)
  To: Li, Liang Z, qemu-devel
  Cc: kvm, mst, lcapitulino, pbonzini, quintela, amit.shah, dgilbert

On 14.06.2016 16:41, Li, Liang Z wrote:
>>> On 13.06.2016 12:16, Liang Li wrote:
>>>> The implementation of the current virtio-balloon is not very
>>>> efficient. Below is the test result of time spent on inflating
>>>> the balloon to 3GB of a 4GB idle guest:
>>>>
>>>> a. allocating pages (6.5%, 103ms)
>>>> b. sending PFNs to host (68.3%, 787ms)
>>>> c. address translation (6.1%, 96ms)
>>>> d. madvise (19%, 300ms)
>>>>
>>>> It takes about 1577ms for the whole inflating process to complete.
>>>> The test shows that the bottleneck is stage b and stage d.
>>>>
>>>> If using a bitmap to send the page info instead of the PFNs, we can
>>>> reduce the overhead spent on stage b quite a lot. Furthermore, it's
>>>> possible to do the address translation and the madvise with a bulk
>>>> of pages, instead of the current page-by-page way, so the overhead
>>>> of stage c and stage d can also be reduced a lot.
>>>>
>>>> This patch is the QEMU side implementation which is intended to
>>>> speed up the inflating & deflating process by adding a new feature
>>>> to the virtio-balloon device. And now, inflating the balloon to 3GB
>>>> of a 4GB idle guest takes only 210ms, about 8 times as fast as before.
[...]
>>>> +            chunk = TARGET_PAGE_SIZE;
>>>> +        }
>>>> +    }
>>>> +}
>>>> +
>>>> +static void balloon_bulk_pages(ram_addr_t base_pfn, unsigned long
>>> *bitmap,
>>>> +                               unsigned long len, int page_shift,
>>>> +bool deflate) { #if defined(__linux__)
>>>
>>> Why do you need this #if here?
>>>
>> Ooh,  it is wrong to add the '#if' here, will remove.
>>
> No, it is needed; it just follows the code in balloon_page().
> Only Linux supports madvise().

I think it is not needed anymore today and the #if in balloon_page could
be removed, too: As far as I can see, the #if there is from the early
days, when there was no wrapper around madvise() yet. But nowadays,
we've got the qemu_madvise() wrapper which takes care of either using
madvise(), posix_madvise() or doing nothing, so the virtio-balloon code
should be able to work without the #if now.

 Thomas


^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [QEMU 1/7] balloon: speed up inflating & deflating process
  2016-06-14 15:33           ` [Qemu-devel] " Thomas Huth
@ 2016-06-17  0:54             ` Li, Liang Z
  -1 siblings, 0 replies; 60+ messages in thread
From: Li, Liang Z @ 2016-06-17  0:54 UTC (permalink / raw)
  To: Thomas Huth, qemu-devel
  Cc: kvm, mst, lcapitulino, pbonzini, quintela, amit.shah, dgilbert

> >>>> +            chunk = TARGET_PAGE_SIZE;
> >>>> +        }
> >>>> +    }
> >>>> +}
> >>>> +
> >>>> +static void balloon_bulk_pages(ram_addr_t base_pfn, unsigned long
> >>> *bitmap,
> >>>> +                               unsigned long len, int page_shift,
> >>>> +bool deflate) { #if defined(__linux__)
> >>>
> >>> Why do you need this #if here?
> >>>
> >> Ooh,  it is wrong to add the '#if' here, will remove.
> >>
> > No, it is needed; it just follows the code in balloon_page().
> > Only Linux supports madvise().
> 
> I think it is not needed anymore today and the #if in balloon_page could be
> removed, too: As far as I can see, the #if there is from the early days, when
> there was no wrapper around madvise() yet. But nowadays, we've got the
> qemu_madvise() wrapper which takes care of either using madvise(),
> posix_madvise() or doing nothing, so the virtio-balloon code should be able
> to work without the #if now.
> 
>  Thomas

You are right! I will remove both of them.

Thanks!
Liang




^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [Qemu-devel] [QEMU 3/7] Add the hmp and qmp interface for dropping cache
  2016-06-13 15:09       ` Dr. David Alan Gilbert
@ 2016-06-17  1:35           ` Li, Liang Z
  2016-06-17  1:35           ` Li, Liang Z
  1 sibling, 0 replies; 60+ messages in thread
From: Li, Liang Z @ 2016-06-17  1:35 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Daniel P. Berrange, qemu-devel, kvm, quintela, mst, lcapitulino,
	amit.shah, pbonzini

> > > > +{ 'command': 'balloon_drop_cache', 'data': {'value':
> > > > +'DropCacheType'} }
> > >
> > > Also, as noted in the man page quote above, it is recommended to
> > > call sync() to minimise dirty pages. Should we have a way to request
> > > a sync as part of this monitor command?
> > >
> > > More generally, it feels like this is taking us down a path towards
> > > actively managing the guest kernel VM from the host. Is this really
> > > a path we want to be going down, given that it's going to take us
> > > into increasingly non-portable concepts which are potentially
> > > different for each guest OS kernel? Is this drop-caches feature at
> > > all applicable to Windows, OS X or *BSD guest OS impls of the balloon
> > > driver? If it is applicable, are the 3 fixed constants you've
> >
> > No.
> >
> > > defined at all useful to those other OS ?
> > >
> >
> > Maybe they are not.
> > I agree that they are too Linux specific, and I did more than needed.
> > Actually, I just want to drop the clean cache; doing more than that is
> > too heavy and not good for performance.
> >
> > > I'm wary of us taking a design path which is so Linux specific it
> > > isn't useful elsewhere. IOW, just because we can do this, doesn't
> > > mean we should do this...
> > >
> >
> > Agree.
> 
> I can see an argument for giving the guest a hint about what's going on and
> letting the guest decide what it's going to do - so telling the guest that a
> migration is happening and you'd like it to make the host's life easy seems
> reasonable and it doesn't make any guest OS assumptions.
> 

That's much better. Thanks!

Liang
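[Editor's note: for context on what the proposed command would trigger guest-side on Linux, here is a hedged sketch. The function name is illustrative and the path is a parameter purely so it can be exercised in a test; the real file is /proc/sys/vm/drop_caches and writing it requires root. As the thread concludes, this mechanism is Linux-only.]

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Guest-side sketch of "drop cache" on Linux (illustrative; not part of
 * the QEMU patch). proc(5) recommends calling sync() first so dirty
 * pages are written back and more of the page cache becomes clean and
 * droppable. 'path' is normally "/proc/sys/vm/drop_caches". */
static int drop_guest_page_cache(const char *path)
{
    sync();                          /* minimize dirty (undroppable) pages */
    int fd = open(path, O_WRONLY);
    if (fd < 0) {
        return -1;
    }
    ssize_t n = write(fd, "1", 1);   /* "1" = page cache only; "2" adds
                                      * dentries/inodes; "3" = both */
    close(fd);
    return n == 1 ? 0 : -1;
}
```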


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [QEMU 1/7] balloon: speed up inflating & deflating process
  2016-06-13 10:16   ` [Qemu-devel] " Liang Li
@ 2016-06-19  4:12     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2016-06-19  4:12 UTC (permalink / raw)
  To: Liang Li
  Cc: qemu-devel, kvm, lcapitulino, pbonzini, quintela, amit.shah, dgilbert

On Mon, Jun 13, 2016 at 06:16:43PM +0800, Liang Li wrote:
> The implementation of the current virtio-balloon is not very
> efficient. Below is the test result of time spent on inflating the
> balloon to 3GB of a 4GB idle guest:
> 
> a. allocating pages (6.5%, 103ms)
> b. sending PFNs to host (68.3%, 787ms)
> c. address translation (6.1%, 96ms)
> d. madvise (19%, 300ms)
> 
> It takes about 1577ms for the whole inflating process to complete. The
> test shows that the bottleneck is stage b and stage d.
> 
> If using a bitmap to send the page info instead of the PFNs, we can
> reduce the overhead spent on stage b quite a lot. Furthermore, it's
> possible to do the address translation and the madvise with a bulk
> of pages, instead of the current page-by-page way, so the overhead of
> stage c and stage d can also be reduced a lot.
> 
> This patch is the QEMU side implementation which is intended to speed
> up the inflating & deflating process by adding a new feature to the
> virtio-balloon device. And now, inflating the balloon to 3GB of a 4GB
> idle guest takes only 210ms, about 8 times as fast as before.
> 
> TODO: optimize stage a by allocating/freeing a chunk of pages instead
> of a single page at a time.
> 
> Signed-off-by: Liang Li <liang.z.li@intel.com>
> ---
>  hw/virtio/virtio-balloon.c                      | 159 ++++++++++++++++++++----
>  include/standard-headers/linux/virtio_balloon.h |   1 +
>  2 files changed, 139 insertions(+), 21 deletions(-)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 8c15e09..8cf74c2 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -47,6 +47,76 @@ static void balloon_page(void *addr, int deflate)
>  #endif
>  }
>  
> +static void do_balloon_bulk_pages(ram_addr_t base_pfn, int page_shift,
> +                                  unsigned long len, bool deflate)
> +{
> +    ram_addr_t size, processed, chunk, base;
> +    void *addr;
> +    MemoryRegionSection section = {.mr = NULL};
> +
> +    size = (len << page_shift);
> +    base = (base_pfn << page_shift);
> +
> +    for (processed = 0; processed < size; processed += chunk) {
> +        chunk = size - processed;
> +        while (chunk >= TARGET_PAGE_SIZE) {
> +            section = memory_region_find(get_system_memory(),
> +                                         base + processed, chunk);
> +            if (!section.mr) {
> +                chunk = QEMU_ALIGN_DOWN(chunk / 2, TARGET_PAGE_SIZE);
> +            } else {
> +                break;
> +            }
> +        }
> +
> +        if (section.mr &&
> +            (int128_nz(section.size) && memory_region_is_ram(section.mr))) {
> +            addr = section.offset_within_region +
> +                   memory_region_get_ram_ptr(section.mr);
> +            qemu_madvise(addr, chunk,
> +                         deflate ? QEMU_MADV_WILLNEED : QEMU_MADV_DONTNEED);
> +        } else {
> +            fprintf(stderr, "can't find the chunk, skip\n");
> +            chunk = TARGET_PAGE_SIZE;
> +        }
> +    }
> +}
> +
> +static void balloon_bulk_pages(ram_addr_t base_pfn, unsigned long *bitmap,
> +                               unsigned long len, int page_shift, bool deflate)
> +{
> +#if defined(__linux__)
> +    unsigned long end  = len * 8;
> +    unsigned long current = 0;
> +
> +    if (!qemu_balloon_is_inhibited() && (!kvm_enabled() ||
> +                                         kvm_has_sync_mmu())) {
> +        while (current < end) {
> +            unsigned long one = find_next_bit(bitmap, end, current);
> +
> +            if (one < end) {
> +                unsigned long zero = find_next_zero_bit(bitmap, end, one + 1);
> +                unsigned long page_length;
> +
> +                if (zero >= end) {
> +                    page_length = end - one;
> +                } else {
> +                    page_length = zero - one;
> +                }
> +
> +                if (page_length) {
> +                    do_balloon_bulk_pages(base_pfn + one, page_shift,
> +                                          page_length, deflate);
> +                }
> +                current = one + page_length;
> +            } else {
> +                current = one;
> +            }
> +        }
> +    }
> +#endif
> +}
> +
>  static const char *balloon_stat_names[] = {
>     [VIRTIO_BALLOON_S_SWAP_IN] = "stat-swap-in",
>     [VIRTIO_BALLOON_S_SWAP_OUT] = "stat-swap-out",
> @@ -78,6 +148,12 @@ static bool balloon_stats_supported(const VirtIOBalloon *s)
>      return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_STATS_VQ);
>  }
>  
> +static bool balloon_page_bitmap_supported(const VirtIOBalloon *s)
> +{
> +    VirtIODevice *vdev = VIRTIO_DEVICE(s);
> +    return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_PAGE_BITMAP);
> +}
> +
>  static bool balloon_stats_enabled(const VirtIOBalloon *s)
>  {
>      return s->stats_poll_interval > 0;
> @@ -224,27 +300,66 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>              return;
>          }
>  
> -        while (iov_to_buf(elem->out_sg, elem->out_num, offset, &pfn, 4) == 4) {
> -            ram_addr_t pa;
> -            ram_addr_t addr;
> -            int p = virtio_ldl_p(vdev, &pfn);
> -
> -            pa = (ram_addr_t) p << VIRTIO_BALLOON_PFN_SHIFT;
> -            offset += 4;
> -
> -            /* FIXME: remove get_system_memory(), but how? */
> -            section = memory_region_find(get_system_memory(), pa, 1);
> -            if (!int128_nz(section.size) || !memory_region_is_ram(section.mr))
> -                continue;
> -
> -            trace_virtio_balloon_handle_output(memory_region_name(section.mr),
> -                                               pa);
> -            /* Using memory_region_get_ram_ptr is bending the rules a bit, but
> -               should be OK because we only want a single page.  */
> -            addr = section.offset_within_region;
> -            balloon_page(memory_region_get_ram_ptr(section.mr) + addr,
> -                         !!(vq == s->dvq));
> -            memory_region_unref(section.mr);
> +        if (balloon_page_bitmap_supported(s)) {
> +            uint64_t base_pfn, tmp64, bmap_len;
> +            uint32_t tmp32, page_shift, id;
> +            unsigned long *bitmap;
> +
> +            iov_to_buf(elem->out_sg, elem->out_num, offset,
> +                       &tmp32, sizeof(uint32_t));
> +            id = virtio_ldl_p(vdev, &tmp32);
> +            offset += sizeof(uint32_t);
> +            /* to suppress build warning */
> +            id = id;
> +
> +            iov_to_buf(elem->out_sg, elem->out_num, offset,
> +                       &tmp32, sizeof(uint32_t));
> +            page_shift = virtio_ldl_p(vdev, &tmp32);
> +            offset += sizeof(uint32_t);
> +
> +            iov_to_buf(elem->out_sg, elem->out_num, offset,
> +                       &tmp64, sizeof(uint64_t));
> +            base_pfn = virtio_ldq_p(vdev, &tmp64);
> +            offset += sizeof(uint64_t);
> +
> +            iov_to_buf(elem->out_sg, elem->out_num, offset,
> +                       &tmp64, sizeof(uint64_t));
> +            bmap_len = virtio_ldq_p(vdev, &tmp64);
> +            offset += sizeof(uint64_t);
> +
> +            bitmap = bitmap_new(bmap_len * BITS_PER_BYTE);
> +            iov_to_buf(elem->out_sg, elem->out_num, offset,
> +                       bitmap, bmap_len);
> +            offset += bmap_len;
> +
> +            balloon_bulk_pages(base_pfn, bitmap, bmap_len,
> +                               page_shift, !!(vq == s->dvq));
> +            g_free(bitmap);
> +        } else {
> +            while (iov_to_buf(elem->out_sg, elem->out_num, offset,
> +                              &pfn, 4) == 4) {
> +                ram_addr_t pa;
> +                ram_addr_t addr;
> +                int p = virtio_ldl_p(vdev, &pfn);
> +
> +                pa = (ram_addr_t) p << VIRTIO_BALLOON_PFN_SHIFT;
> +                offset += 4;
> +
> +                /* FIXME: remove get_system_memory(), but how? */
> +                section = memory_region_find(get_system_memory(), pa, 1);
> +                if (!int128_nz(section.size) ||
> +                    !memory_region_is_ram(section.mr))
> +                    continue;
> +
> +                trace_virtio_balloon_handle_output(memory_region_name(
> +                                                            section.mr), pa);
> +                /* Using memory_region_get_ram_ptr is bending the rules a bit,
> +                 * but should be OK because we only want a single page.  */
> +                addr = section.offset_within_region;
> +                balloon_page(memory_region_get_ram_ptr(section.mr) + addr,
> +                             !!(vq == s->dvq));
> +                memory_region_unref(section.mr);
> +            }
>          }
>  
>          virtqueue_push(vq, elem, offset);
> @@ -374,6 +489,7 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
>      VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
>      f |= dev->host_features;
>      virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ);
> +    virtio_add_feature(&f, VIRTIO_BALLOON_F_PAGE_BITMAP);
>      return f;
>  }
>

Please add the features to virtio_balloon_properties.
You also need to handle compatibility by disabling them for
old machine types.

  
> @@ -388,6 +504,7 @@ static void virtio_balloon_to_target(void *opaque, ram_addr_t target)
>  {
>      VirtIOBalloon *dev = VIRTIO_BALLOON(opaque);
>      VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> +
>      ram_addr_t vm_ram_size = get_current_ram_size();
>  
>      if (target > vm_ram_size) {
> diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h
> index 9d06ccd..7c9686c 100644
> --- a/include/standard-headers/linux/virtio_balloon.h
> +++ b/include/standard-headers/linux/virtio_balloon.h
> @@ -34,6 +34,7 @@
>  #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before reclaiming pages */
>  #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
>  #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
> +#define VIRTIO_BALLOON_F_PAGE_BITMAP  3 /* Use page bitmap to send page info */
>  
>  /* Size of a PFN in the balloon interface. */
>  #define VIRTIO_BALLOON_PFN_SHIFT 12

We want to keep this in sync with Linux.
Let's get a minimal patch extending this header merged in Linux, then
update this one.
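[Editor's note: for reference, the bitmap request that the new branch in the quoted patch parses field-by-field has the wire layout below. The struct name is illustrative; the patch reads the fields with iov_to_buf() at increasing offsets and converts them from guest endianness with virtio_ldl_p()/virtio_ldq_p() rather than using a struct.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Wire layout of the page-bitmap request parsed in
 * virtio_balloon_handle_output() above (illustrative name). */
struct balloon_bmap_hdr {
    uint32_t id;          /* request id (read but unused by this patch) */
    uint32_t page_shift;  /* guest page size is 1 << page_shift */
    uint64_t base_pfn;    /* first pfn described by the bitmap */
    uint64_t bmap_len;    /* length of the bitmap, in bytes */
    /* followed by bmap_len bytes of bitmap: bit i set => pfn base_pfn + i */
};
```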

> -- 
> 1.9.1
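[Editor's note: the speedup under review comes from turning the bitmap into contiguous runs so one madvise() covers many pages. Below is a minimal, self-contained sketch of that run extraction, using plain-C stand-ins (bmap_test_bit, next_pfn_run are illustrative names) for the find_next_bit()/find_next_zero_bit() helpers the patch uses.]

```c
#include <assert.h>

#define BITS_PER_LONG (8 * (int)sizeof(unsigned long))

/* Stand-in for test_bit(); the patch uses QEMU's find_next_bit()/
 * find_next_zero_bit() bitmap helpers instead. */
static int bmap_test_bit(const unsigned long *map, unsigned long i)
{
    return (int)((map[i / BITS_PER_LONG] >> (i % BITS_PER_LONG)) & 1UL);
}

/* Find the next run of set bits at or after *pos in a bitmap of nbits
 * bits. Returns 1 and fills *start/*len if a run exists, else 0. Each
 * run corresponds to one bulk madvise() over len pages in the patch,
 * instead of one call per page. */
static int next_pfn_run(const unsigned long *map, unsigned long nbits,
                        unsigned long *pos, unsigned long *start,
                        unsigned long *len)
{
    unsigned long i = *pos;
    while (i < nbits && !bmap_test_bit(map, i)) {
        i++;                         /* skip the gap of clear bits */
    }
    if (i >= nbits) {
        return 0;                    /* no more runs */
    }
    *start = i;
    while (i < nbits && bmap_test_bit(map, i)) {
        i++;                         /* extend the run of set bits */
    }
    *len = i - *start;
    *pos = i;
    return 1;
}
```

Iterating next_pfn_run() and issuing one madvise() per (base_pfn + start, len << page_shift) range is what collapses the per-page work of stages c and d into a handful of calls.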

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [QEMU 2/7] virtio-balloon: add drop cache support
  2016-06-13 10:16   ` [Qemu-devel] " Liang Li
@ 2016-06-19  4:14     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2016-06-19  4:14 UTC (permalink / raw)
  To: Liang Li
  Cc: qemu-devel, kvm, lcapitulino, pbonzini, quintela, amit.shah, dgilbert

On Mon, Jun 13, 2016 at 06:16:44PM +0800, Liang Li wrote:
> virtio-balloon can make use of the amount of free memory to determine
> the amount of memory to be filled in the balloon, but the amount of
> free memory is affected by the page cache, which can be reclaimed.
> Dropping the cache before querying the amount of free memory helps
> reflect the exact amount of memory that can be reclaimed.

Can't we just extend stats to report "reclaimable" memory?

> This patch adds a new feature to the balloon device to support this
> operation: the hypervisor can request the VM to drop its cache, so as
> to reclaim more memory.
> 
> Signed-off-by: Liang Li <liang.z.li@intel.com>
> ---
>  balloon.c                                       | 10 ++-
>  hw/virtio/virtio-balloon.c                      | 85 ++++++++++++++++++++++++-
>  include/hw/virtio/virtio-balloon.h              | 19 +++++-
>  include/standard-headers/linux/virtio_balloon.h |  1 +
>  include/sysemu/balloon.h                        |  5 +-
>  5 files changed, 115 insertions(+), 5 deletions(-)
> 
> diff --git a/balloon.c b/balloon.c
> index f2ef50c..0fb34bf 100644
> --- a/balloon.c
> +++ b/balloon.c
> @@ -36,6 +36,7 @@
>  
>  static QEMUBalloonEvent *balloon_event_fn;
>  static QEMUBalloonStatus *balloon_stat_fn;
> +static QEMUBalloonDropCache *balloon_drop_cache_fn;
>  static void *balloon_opaque;
>  static bool balloon_inhibited;
>  
> @@ -65,9 +66,12 @@ static bool have_balloon(Error **errp)
>  }
>  
>  int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
> -                             QEMUBalloonStatus *stat_func, void *opaque)
> +                             QEMUBalloonStatus *stat_func,
> +                             QEMUBalloonDropCache *drop_cache_func,
> +                             void *opaque)
>  {
> -    if (balloon_event_fn || balloon_stat_fn || balloon_opaque) {
> +    if (balloon_event_fn || balloon_stat_fn || balloon_drop_cache_fn
> +        || balloon_opaque) {
>          /* We're already registered one balloon handler.  How many can
>           * a guest really have?
>           */
> @@ -75,6 +79,7 @@ int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
>      }
>      balloon_event_fn = event_func;
>      balloon_stat_fn = stat_func;
> +    balloon_drop_cache_fn = drop_cache_func;
>      balloon_opaque = opaque;
>      return 0;
>  }
> @@ -86,6 +91,7 @@ void qemu_remove_balloon_handler(void *opaque)
>      }
>      balloon_event_fn = NULL;
>      balloon_stat_fn = NULL;
> +    balloon_drop_cache_fn = NULL;
>      balloon_opaque = NULL;
>  }
>  
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 8cf74c2..4757ba5 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -36,6 +36,10 @@
>  
>  #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
>  
> +enum balloon_req_id {
> +       BALLOON_DROP_CACHE,
> +};
> +
>  static void balloon_page(void *addr, int deflate)
>  {
>  #if defined(__linux__)
> @@ -154,6 +158,12 @@ static bool balloon_page_bitmap_supported(const VirtIOBalloon *s)
>      return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_PAGE_BITMAP);
>  }
>  
> +static bool balloon_misc_supported(const VirtIOBalloon *s)
> +{
> +    VirtIODevice *vdev = VIRTIO_DEVICE(s);
> +    return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_MISC);
> +}
> +
>  static bool balloon_stats_enabled(const VirtIOBalloon *s)
>  {
>      return s->stats_poll_interval > 0;
> @@ -420,6 +430,39 @@ out:
>      }
>  }
>  
> +static void virtio_balloon_handle_resp(VirtIODevice *vdev, VirtQueue *vq)
> +{
> +    VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
> +    VirtQueueElement *elem;
> +    size_t offset = 0;
> +    uint32_t tmp32, id = 0;
> +
> +    elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
> +    if (!elem) {
> +        s->req_status = REQ_ERROR;
> +        return;
> +    }
> +
> +    s->misc_vq_elem = elem;
> +
> +    if (!elem->out_num) {
> +        return;
> +    }
> +
> +    iov_to_buf(elem->out_sg, elem->out_num, offset,
> +               &tmp32, sizeof(uint32_t));
> +    id = virtio_ldl_p(vdev, &tmp32);
> +    offset += sizeof(uint32_t);
> +    switch (id) {
> +    case BALLOON_DROP_CACHE:
> +        s->req_status = REQ_DONE;
> +        break;
> +    default:
> +        break;
> +    }
> +
> +}
> +
>  static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
>  {
>      VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
> @@ -490,6 +533,7 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
>      f |= dev->host_features;
>      virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ);
>      virtio_add_feature(&f, VIRTIO_BALLOON_F_PAGE_BITMAP);
> +    virtio_add_feature(&f, VIRTIO_BALLOON_F_MISC);
>      return f;
>  }
>  
> @@ -500,6 +544,36 @@ static void virtio_balloon_stat(void *opaque, BalloonInfo *info)
>                                               VIRTIO_BALLOON_PFN_SHIFT);
>  }
>  
> +static int virtio_balloon_drop_cache(void *opaque, unsigned long type)
> +{
> +    VirtIOBalloon *s = opaque;
> +    VirtIODevice *vdev = VIRTIO_DEVICE(s);
> +    VirtQueueElement *elem = s->misc_vq_elem;
> +    int len;
> +
> +    if (!balloon_misc_supported(s)) {
> +        return REQ_UNSUPPORT;
> +    }
> +
> +    if (elem == NULL || !elem->in_num) {
> +        elem = virtqueue_pop(s->mvq, sizeof(VirtQueueElement));
> +        if (!elem) {
> +            return REQ_ERROR;
> +        }
> +        s->misc_vq_elem = elem;
> +    }
> +    s->misc_req.id = BALLOON_DROP_CACHE;
> +    s->misc_req.param = type;
> +    len = iov_from_buf(elem->in_sg, elem->in_num, 0, &s->misc_req,
> +                       sizeof(s->misc_req));
> +    virtqueue_push(s->mvq, elem, len);
> +    virtio_notify(vdev, s->mvq);
> +    g_free(s->misc_vq_elem);
> +    s->misc_vq_elem = NULL;
> +
> +    return REQ_DONE;
> +}
> +
>  static void virtio_balloon_to_target(void *opaque, ram_addr_t target)
>  {
>      VirtIOBalloon *dev = VIRTIO_BALLOON(opaque);
> @@ -562,7 +636,8 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
>                  sizeof(struct virtio_balloon_config));
>  
>      ret = qemu_add_balloon_handler(virtio_balloon_to_target,
> -                                   virtio_balloon_stat, s);
> +                                   virtio_balloon_stat,
> +                                   virtio_balloon_drop_cache, s);
>  
>      if (ret < 0) {
>          error_setg(errp, "Only one balloon device is supported");
> @@ -573,8 +648,10 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
>      s->ivq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
>      s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
>      s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats);
> +    s->mvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_resp);
>  
>      reset_stats(s);
> +    s->req_status = REQ_INIT;
>  
>      register_savevm(dev, "virtio-balloon", -1, 1,
>                      virtio_balloon_save, virtio_balloon_load, s);
> @@ -599,6 +676,12 @@ static void virtio_balloon_device_reset(VirtIODevice *vdev)
>          g_free(s->stats_vq_elem);
>          s->stats_vq_elem = NULL;
>      }
> +
> +    if (s->misc_vq_elem != NULL) {
> +        g_free(s->misc_vq_elem);
> +        s->misc_vq_elem = NULL;
> +    }
> +    s->req_status = REQ_INIT;
>  }
>  
>  static void virtio_balloon_instance_init(Object *obj)
> diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
> index 35f62ac..a21bb45 100644
> --- a/include/hw/virtio/virtio-balloon.h
> +++ b/include/hw/virtio/virtio-balloon.h
> @@ -23,6 +23,20 @@
>  #define VIRTIO_BALLOON(obj) \
>          OBJECT_CHECK(VirtIOBalloon, (obj), TYPE_VIRTIO_BALLOON)
>  
> +typedef enum {
> +    REQ_INIT,
> +    REQ_ON_GOING,
> +    REQ_DONE,
> +    REQ_ERROR,
> +    REQ_INVALID_PARAM,
> +    REQ_UNSUPPORT,
> +} BalloonReqStatus;
> +
> +typedef struct GetFreePageReq {
> +    uint32_t id;
> +    uint32_t param;
> +} MiscReq;
> +
>  typedef struct virtio_balloon_stat VirtIOBalloonStat;
>  
>  typedef struct virtio_balloon_stat_modern {
> @@ -33,16 +47,19 @@ typedef struct virtio_balloon_stat_modern {
>  
>  typedef struct VirtIOBalloon {
>      VirtIODevice parent_obj;
> -    VirtQueue *ivq, *dvq, *svq;
> +    VirtQueue *ivq, *dvq, *svq, *mvq;
>      uint32_t num_pages;
>      uint32_t actual;
>      uint64_t stats[VIRTIO_BALLOON_S_NR];
>      VirtQueueElement *stats_vq_elem;
> +    VirtQueueElement *misc_vq_elem;
>      size_t stats_vq_offset;
>      QEMUTimer *stats_timer;
>      int64_t stats_last_update;
>      int64_t stats_poll_interval;
>      uint32_t host_features;
> +    MiscReq misc_req;
> +    BalloonReqStatus req_status;
>  } VirtIOBalloon;
>  
>  #endif
> diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h
> index 7c9686c..c8b254f 100644
> --- a/include/standard-headers/linux/virtio_balloon.h
> +++ b/include/standard-headers/linux/virtio_balloon.h
> @@ -35,6 +35,7 @@
>  #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
>  #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
>  #define VIRTIO_BALLOON_F_PAGE_BITMAP  3 /* Use page bitmap to send page info */
> +#define VIRTIO_BALLOON_F_MISC    4 /* Send request and get misc info */
>  
>  /* Size of a PFN in the balloon interface. */
>  #define VIRTIO_BALLOON_PFN_SHIFT 12
> diff --git a/include/sysemu/balloon.h b/include/sysemu/balloon.h
> index 3f976b4..0e85f2b 100644
> --- a/include/sysemu/balloon.h
> +++ b/include/sysemu/balloon.h
> @@ -18,9 +18,12 @@
>  
>  typedef void (QEMUBalloonEvent)(void *opaque, ram_addr_t target);
>  typedef void (QEMUBalloonStatus)(void *opaque, BalloonInfo *info);
> +typedef int (QEMUBalloonDropCache)(void *opaque, unsigned long ctrl);
>  
>  int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
> -			     QEMUBalloonStatus *stat_func, void *opaque);
> +                             QEMUBalloonStatus *stat_func,
> +                             QEMUBalloonDropCache *drop_cache_func,
> +                             void *opaque);
>  void qemu_remove_balloon_handler(void *opaque);
>  bool qemu_balloon_is_inhibited(void);
>  void qemu_balloon_inhibit(bool state);
> -- 
> 1.9.1

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [QEMU 4/7] balloon: get free page info from guest
  2016-06-13 10:16   ` [Qemu-devel] " Liang Li
@ 2016-06-19  4:24     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2016-06-19  4:24 UTC (permalink / raw)
  To: Liang Li
  Cc: qemu-devel, kvm, lcapitulino, pbonzini, quintela, amit.shah, dgilbert

On Mon, Jun 13, 2016 at 06:16:46PM +0800, Liang Li wrote:
> Add a new feature to get the free page information from the guest;
> the free page information is saved in a bitmap. Please note that
> 'free page' only means these pages were free before the request:
> some of the pages may no longer be free while the free page bitmap
> is being sent to QEMU.
> 
> Signed-off-by: Liang Li <liang.z.li@intel.com>


I don't much like this request interface.
The meaning of free page is rather fuzzy too - so at what
point are they free?


My suggestion would be:
	report free page request ID to guest
	include request ID when guest sends free page list

the definition is then:
	page was free sometime after host set this value of request
	ID and before it received response with the same ID
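A minimal sketch of this request-ID scheme (the names and types below are illustrative only, not actual QEMU or virtio-balloon API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative sketch of the request-ID scheme suggested above; none of
 * these names exist in QEMU.  The host stamps each free-page request with
 * a fresh ID, and a guest response is trusted only if it echoes the
 * latest ID: a matching reply means each reported page was free at some
 * point between the request being sent and the response being received. */

struct free_page_tracker {
    uint32_t latest_req_id;     /* ID of the most recent request sent */
};

/* Stamp and remember a new request ID to include in the request. */
static uint32_t send_free_page_request(struct free_page_tracker *t)
{
    return ++t->latest_req_id;
}

/* A stale reply (older ID) tells us nothing about the current state. */
static bool response_is_current(const struct free_page_tracker *t,
                                uint32_t resp_id)
{
    return resp_id == t->latest_req_id;
}
```

A reply carrying an older ID may describe pages that have since been reused, so it is simply discarded rather than applied.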





> ---
>  balloon.c                          | 24 +++++++++++-
>  hw/virtio/virtio-balloon.c         | 75 +++++++++++++++++++++++++++++++++++++-
>  include/hw/virtio/virtio-balloon.h |  4 ++
>  include/sysemu/balloon.h           |  8 ++++
>  4 files changed, 108 insertions(+), 3 deletions(-)
> 
> diff --git a/balloon.c b/balloon.c
> index 3d96111..c74c472 100644
> --- a/balloon.c
> +++ b/balloon.c
> @@ -37,6 +37,7 @@
>  static QEMUBalloonEvent *balloon_event_fn;
>  static QEMUBalloonStatus *balloon_stat_fn;
>  static QEMUBalloonDropCache *balloon_drop_cache_fn;
> +static QEMUBalloonGetFreePage *balloon_get_free_page_fn;
>  static void *balloon_opaque;
>  static bool balloon_inhibited;
>  
> @@ -68,10 +69,11 @@ static bool have_balloon(Error **errp)
>  int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
>                               QEMUBalloonStatus *stat_func,
>                               QEMUBalloonDropCache *drop_cache_func,
> +                             QEMUBalloonGetFreePage *get_free_page_func,
>                               void *opaque)
>  {
>      if (balloon_event_fn || balloon_stat_fn || balloon_drop_cache_fn
> -        || balloon_opaque) {
> +        || balloon_get_free_page_fn || balloon_opaque) {
>          /* We're already registered one balloon handler.  How many can
>           * a guest really have?
>           */
> @@ -80,6 +82,7 @@ int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
>      balloon_event_fn = event_func;
>      balloon_stat_fn = stat_func;
>      balloon_drop_cache_fn = drop_cache_func;
> +    balloon_get_free_page_fn = get_free_page_func;
>      balloon_opaque = opaque;
>      return 0;
>  }
> @@ -92,6 +95,7 @@ void qemu_remove_balloon_handler(void *opaque)
>      balloon_event_fn = NULL;
>      balloon_stat_fn = NULL;
>      balloon_drop_cache_fn = NULL;
> +    balloon_get_free_page_fn = NULL;
>      balloon_opaque = NULL;
>  }
>  
> @@ -141,3 +145,21 @@ void qmp_balloon_drop_cache(DropCacheType type, Error **errp)
>  
>      balloon_drop_cache_fn(balloon_opaque, type);
>  }
> +
> +bool balloon_free_pages_support(void)
> +{
> +    return balloon_get_free_page_fn ? true : false;
> +}
> +
> +BalloonReqStatus balloon_get_free_pages(unsigned long *bitmap, unsigned long len)
> +{
> +    if (!balloon_get_free_page_fn) {
> +        return REQ_UNSUPPORT;
> +    }
> +
> +    if (!bitmap) {
> +        return REQ_INVALID_PARAM;
> +    }
> +
> +    return balloon_get_free_page_fn(balloon_opaque, bitmap, len);
> +}
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 4757ba5..30ba074 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -38,6 +38,7 @@
>  
>  enum balloon_req_id {
>         BALLOON_DROP_CACHE,
> +       BALLOON_GET_FREE_PAGES,
>  };
>  
>  static void balloon_page(void *addr, int deflate)
> @@ -435,7 +436,8 @@ static void virtio_balloon_handle_resp(VirtIODevice *vdev, VirtQueue *vq)
>      VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
>      VirtQueueElement *elem;
>      size_t offset = 0;
> -    uint32_t tmp32, id = 0;
> +    uint32_t tmp32, id = 0, page_shift;
> +    uint64_t base_pfn, tmp64, bmap_len;
>  
>      elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
>      if (!elem) {
> @@ -457,6 +459,32 @@ static void virtio_balloon_handle_resp(VirtIODevice *vdev, VirtQueue *vq)
>      case BALLOON_DROP_CACHE:
>          s->req_status = REQ_DONE;
>          break;
> +    case BALLOON_GET_FREE_PAGES:
> +        iov_to_buf(elem->out_sg, elem->out_num, offset,
> +                   &tmp32, sizeof(uint32_t));
> +        page_shift = virtio_ldl_p(vdev, &tmp32);
> +        offset += sizeof(uint32_t);
> +        s->page_shift = page_shift;
> +
> +        iov_to_buf(elem->out_sg, elem->out_num, offset,
> +                   &tmp64, sizeof(uint64_t));
> +        base_pfn = virtio_ldq_p(vdev, &tmp64);
> +        offset += sizeof(uint64_t);
> +        s->base_pfn = base_pfn;
> +
> +        iov_to_buf(elem->out_sg, elem->out_num, offset,
> +                   &tmp64, sizeof(uint64_t));
> +        bmap_len = virtio_ldq_p(vdev, &tmp64);
> +        offset += sizeof(uint64_t);
> +        if (s->bmap_len < bmap_len) {
> +             s->req_status = REQ_INVALID_PARAM;
> +             return;
> +        }
> +
> +        iov_to_buf(elem->out_sg, elem->out_num, offset,
> +                   s->free_page_bmap, bmap_len);
> +        s->req_status = REQ_DONE;
> +       break;
>      default:
>          break;
>      }
> @@ -574,6 +602,48 @@ static int virtio_balloon_drop_cache(void *opaque, unsigned long type)
>      return REQ_DONE;
>  }
>  
> +static BalloonReqStatus virtio_balloon_free_pages(void *opaque,
> +                                                  unsigned long *bitmap,
> +                                                  unsigned long bmap_len)
> +{
> +    VirtIOBalloon *s = opaque;
> +    VirtIODevice *vdev = VIRTIO_DEVICE(s);
> +    VirtQueueElement *elem = s->misc_vq_elem;
> +    int len;
> +
> +    if (!balloon_misc_supported(s)) {
> +        return REQ_UNSUPPORT;
> +    }
> +
> +    if (s->req_status == REQ_INIT) {
> +        s->free_page_bmap = bitmap;
> +        if (elem == NULL || !elem->in_num) {
> +            elem = virtqueue_pop(s->mvq, sizeof(VirtQueueElement));
> +            if (!elem) {
> +                return REQ_ERROR;
> +            }
> +            s->misc_vq_elem = elem;
> +        }
> +        s->misc_req.id = BALLOON_GET_FREE_PAGES;
> +        s->misc_req.param = 0;
> +        s->bmap_len = bmap_len;
> +        len = iov_from_buf(elem->in_sg, elem->in_num, 0, &s->misc_req,
> +                           sizeof(s->misc_req));
> +        virtqueue_push(s->mvq, elem, len);
> +        virtio_notify(vdev, s->mvq);
> +        g_free(s->misc_vq_elem);
> +        s->misc_vq_elem = NULL;
> +        s->req_status = REQ_ON_GOING;
> +        return REQ_ERROR;
> +    } else if (s->req_status == REQ_ON_GOING) {
> +        return REQ_ON_GOING;
> +    } else if (s->req_status == REQ_DONE) {
> +        s->req_status = REQ_INIT;
> +    }
> +
> +    return REQ_DONE;
> +}
> +
>  static void virtio_balloon_to_target(void *opaque, ram_addr_t target)
>  {
>      VirtIOBalloon *dev = VIRTIO_BALLOON(opaque);
> @@ -637,7 +707,8 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
>  
>      ret = qemu_add_balloon_handler(virtio_balloon_to_target,
>                                     virtio_balloon_stat,
> -                                   virtio_balloon_drop_cache, s);
> +                                   virtio_balloon_drop_cache,
> +                                   virtio_balloon_free_pages, s);
>  
>      if (ret < 0) {
>          error_setg(errp, "Only one balloon device is supported");
> diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
> index a21bb45..6382bcf 100644
> --- a/include/hw/virtio/virtio-balloon.h
> +++ b/include/hw/virtio/virtio-balloon.h
> @@ -60,6 +60,10 @@ typedef struct VirtIOBalloon {
>      uint32_t host_features;
>      MiscReq misc_req;
>      BalloonReqStatus req_status;
> +    uint64_t *free_page_bmap;
> +    uint64_t bmap_len;
> +    uint64_t base_pfn;
> +    uint32_t page_shift;
>  } VirtIOBalloon;
>  
>  #endif
> diff --git a/include/sysemu/balloon.h b/include/sysemu/balloon.h
> index 0e85f2b..6c362e8 100644
> --- a/include/sysemu/balloon.h
> +++ b/include/sysemu/balloon.h
> @@ -15,17 +15,25 @@
>  #define _QEMU_BALLOON_H
>  
>  #include "qapi-types.h"
> +#include "hw/virtio/virtio-balloon.h"
>  
>  typedef void (QEMUBalloonEvent)(void *opaque, ram_addr_t target);
>  typedef void (QEMUBalloonStatus)(void *opaque, BalloonInfo *info);
>  typedef int (QEMUBalloonDropCache)(void *opaque, unsigned long ctrl);
> +typedef BalloonReqStatus (QEMUBalloonGetFreePage)(void *opaque,
> +                                                  unsigned long *bitmap,
> +                                                  unsigned long len);
>  
>  int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
>                               QEMUBalloonStatus *stat_func,
>                               QEMUBalloonDropCache *drop_cache_func,
> +                             QEMUBalloonGetFreePage *get_free_page_func,
>                               void *opaque);
>  void qemu_remove_balloon_handler(void *opaque);
>  bool qemu_balloon_is_inhibited(void);
>  void qemu_balloon_inhibit(bool state);
> +bool balloon_free_pages_support(void);
> +BalloonReqStatus balloon_get_free_pages(unsigned long *bitmap,
> +                                        unsigned long len);
>  
>  #endif
> -- 
> 1.9.1

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [QEMU 6/7] kvm: Add two new arch specific functions
  2016-06-13 10:16   ` [Qemu-devel] " Liang Li
@ 2016-06-19  4:27     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2016-06-19  4:27 UTC (permalink / raw)
  To: Liang Li
  Cc: qemu-devel, kvm, lcapitulino, pbonzini, quintela, amit.shah, dgilbert

On Mon, Jun 13, 2016 at 06:16:48PM +0800, Liang Li wrote:
> Add a new function to get the vm's max pfn and a new function
> to filter out the holes to get a tight free page bitmap.
> They are implemented on X86, and all the arches should implement
> them for live migration optimization.
> 
> Signed-off-by: Liang Li <liang.z.li@intel.com>
> ---
>  include/sysemu/kvm.h |  2 ++
>  target-arm/kvm.c     | 14 ++++++++++++++
>  target-i386/kvm.c    | 35 +++++++++++++++++++++++++++++++++++
>  target-mips/kvm.c    | 14 ++++++++++++++
>  target-ppc/kvm.c     | 14 ++++++++++++++
>  target-s390x/kvm.c   | 14 ++++++++++++++
>  6 files changed, 93 insertions(+)
> 
> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> index ad6f837..50915f9 100644
> --- a/include/sysemu/kvm.h
> +++ b/include/sysemu/kvm.h
> @@ -230,6 +230,8 @@ int kvm_remove_breakpoint(CPUState *cpu, target_ulong addr,
>                            target_ulong len, int type);
>  void kvm_remove_all_breakpoints(CPUState *cpu);
>  int kvm_update_guest_debug(CPUState *cpu, unsigned long reinject_trap);
> +unsigned long *tighten_guest_free_page_bmap(unsigned long *bmap);
> +unsigned long get_guest_max_pfn(void);
>  #ifndef _WIN32
>  int kvm_set_signal_mask(CPUState *cpu, const sigset_t *sigset);
>  #endif
> diff --git a/target-arm/kvm.c b/target-arm/kvm.c
> index 83da447..6464542 100644
> --- a/target-arm/kvm.c
> +++ b/target-arm/kvm.c
> @@ -627,3 +627,17 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
>  {
>      return (data - 32) & 0xffff;
>  }
> +
> +unsigned long get_guest_max_pfn(void)
> +{
> +    /* To be done */
> +
> +    return 0;
> +}
> +
> +unsigned long *tighten_guest_free_page_bmap(unsigned long *bmap)
> +{
> +    /* To be done */
> +
> +    return bmap;
> +}
> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> index abf50e6..0b394cb 100644
> --- a/target-i386/kvm.c
> +++ b/target-i386/kvm.c
> @@ -3327,3 +3327,38 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
>  {
>      abort();
>  }
> +
> +unsigned long get_guest_max_pfn(void)
> +{
> +    PCMachineState *pcms = PC_MACHINE(current_machine);
> +    ram_addr_t above_4g_mem = pcms->above_4g_mem_size;
> +    unsigned long max_pfn;
> +
> +    if (above_4g_mem) {
> +        max_pfn = ((1ULL << 32) + above_4g_mem) >> TARGET_PAGE_BITS;
> +    } else {
> +        max_pfn = pcms->below_4g_mem_size >> TARGET_PAGE_BITS;
> +    }
> +
> +    return max_pfn;
> +}

Why is this in kvm?

> +
> +unsigned long *tighten_guest_free_page_bmap(unsigned long *bmap)
> +{
> +    PCMachineState *pcms = PC_MACHINE(current_machine);
> +    ram_addr_t above_4g_mem = pcms->above_4g_mem_size;
> +
> +    if (above_4g_mem) {
> +        unsigned long *src, *dst, len, pos;
> +        ram_addr_t below_4g_mem = pcms->below_4g_mem_size;
> +        src = bmap + ((1ULL << 32) >> TARGET_PAGE_BITS) / BITS_PER_LONG;
> +        dst = bmap + (below_4g_mem >> TARGET_PAGE_BITS) / BITS_PER_LONG;
> +        bitmap_move(dst, src, above_4g_mem >> TARGET_PAGE_BITS);
> +
> +        pos = (above_4g_mem + below_4g_mem) >> TARGET_PAGE_BITS;
> +        len = ((1ULL << 32) - below_4g_mem) >> TARGET_PAGE_BITS;
> +        bitmap_clear(bmap, pos, len);
> +    }
> +
> +    return bmap;
> +}

what does this do? External APIs should have documentation.
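For what it's worth, a toy model may make the intent clearer (one char per page instead of one bit, made-up page counts, hypothetical names; this is not the QEMU code): on x86 a guest-PFN-indexed bitmap contains a gap for the PCI hole below 4G, and the function appears to move the above-4G part down so the bitmap becomes contiguous.

```c
#include <stddef.h>
#include <string.h>

/* Toy model of what tighten_guest_free_page_bmap() appears to do on x86.
 * Here one char stands for one page (the real code moves bits) and the
 * sizes are made-up page counts.  RAM above 4G shows up in a
 * guest-PFN-indexed bitmap after the PCI hole, so its entries are moved
 * down to sit right after the below-4G RAM, and the stale region left
 * dangling at the end is cleared. */
static void tighten(char *bmap, size_t below_4g_pages, size_t hole_pages,
                    size_t above_4g_pages)
{
    /* slide the above-4G range down over the hole */
    memmove(bmap + below_4g_pages,
            bmap + below_4g_pages + hole_pages, above_4g_pages);
    /* zero the tail left behind by the move */
    memset(bmap + below_4g_pages + above_4g_pages, 0, hole_pages);
}
```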

> diff --git a/target-mips/kvm.c b/target-mips/kvm.c
> index a854e4d..89a54e5 100644
> --- a/target-mips/kvm.c
> +++ b/target-mips/kvm.c
> @@ -1048,3 +1048,17 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
>  {
>      abort();
>  }
> +
> +unsigned long get_guest_max_pfn(void)
> +{
> +    /* To be done */
> +
> +    return 0;
> +}
> +
> +unsigned long *tighten_guest_free_page_bmap(unsigned long *bmap)
> +{
> +    /* To be done */
> +
> +    return bmap;
> +}
> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index 24d6032..e222b31 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -2579,3 +2579,17 @@ int kvmppc_enable_hwrng(void)
>  
>      return kvmppc_enable_hcall(kvm_state, H_RANDOM);
>  }
> +
> +unsigned long get_guest_max_pfn(void)
> +{
> +    /* To be done */
> +
> +    return 0;
> +}
> +
> +unsigned long *tighten_guest_free_page_bmap(unsigned long *bmap)
> +{
> +    /* To be done */
> +
> +    return bmap;
> +}
> diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
> index 8f46fd0..893755b 100644
> --- a/target-s390x/kvm.c
> +++ b/target-s390x/kvm.c
> @@ -2271,3 +2271,17 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
>  {
>      abort();
>  }
> +
> +unsigned long get_guest_max_pfn(void)
> +{
> +    /* To be done */
> +
> +    return 0;
> +}
> +
> +unsigned long *tighten_guest_free_page_bmap(unsigned long *bmap)
> +{
> +    /* To be done */
> +
> +    return bmap;
> +}
> -- 
> 1.9.1

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [QEMU 7/7] migration: skip free pages during live migration
  2016-06-13 10:16   ` [Qemu-devel] " Liang Li
@ 2016-06-19  4:43     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2016-06-19  4:43 UTC (permalink / raw)
  To: Liang Li
  Cc: qemu-devel, kvm, lcapitulino, pbonzini, quintela, amit.shah, dgilbert

On Mon, Jun 13, 2016 at 06:16:49PM +0800, Liang Li wrote:
> After sending out the request for free pages, the live migration
> process starts without waiting for the free page bitmap to be
> ready. If the free page bitmap is not ready when doing the first
> migration_bitmap_sync() after ram_save_setup(), the free page
> bitmap will be ignored; this means the free pages will not be
> filtered out in this case.
> The current implementation does not work with postcopy; if
> postcopy is enabled, we simply ignore the free pages. This will
> be made to work later.
> 
> Signed-off-by: Liang Li <liang.z.li@intel.com>

Tying migration to balloon in this way seems rather ugly.
So with request ID, the logic would basically be

	- add memory listener with high priority
	- before sync bitmap, increment request id
	- when we get response, if it has latest request id,
	  clear qemu migration bitmap
	  otherwise, ignore
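A minimal sketch of that apply-or-ignore step (illustrative names only, not QEMU API):

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative sketch of the logic outlined above; these names do not
 * exist in QEMU.  Before each bitmap sync the request ID is bumped, so a
 * free-page response is applied to the migration bitmap only when it
 * carries the latest ID; a stale response may describe pages that have
 * been dirtied since and is ignored. */

static uint32_t latest_req_id;

static void before_bitmap_sync(void)
{
    latest_req_id++;            /* a new sync round invalidates old replies */
}

/* Returns 1 if the response was applied, 0 if it was stale. */
static int apply_free_page_resp(uint32_t resp_id,
                                unsigned long *migration_bmap,
                                const unsigned long *free_bmap,
                                size_t nwords)
{
    if (resp_id != latest_req_id) {
        return 0;               /* stale: ignore */
    }
    for (size_t i = 0; i < nwords; i++) {
        migration_bmap[i] &= ~free_bmap[i];  /* free pages need not be sent */
    }
    return 1;
}
```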


> ---
>  migration/ram.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 93 insertions(+)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 844ea46..5f1c3ff 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -43,6 +43,8 @@
>  #include "trace.h"
>  #include "exec/ram_addr.h"
>  #include "qemu/rcu_queue.h"
> +#include "sysemu/balloon.h"
> +#include "sysemu/kvm.h"
>  
>  #ifdef DEBUG_MIGRATION_RAM
>  #define DPRINTF(fmt, ...) \
> @@ -228,6 +230,7 @@ static QemuMutex migration_bitmap_mutex;
>  static uint64_t migration_dirty_pages;
>  static uint32_t last_version;
>  static bool ram_bulk_stage;
> +static bool ignore_freepage_rsp;
>  
>  /* used by the search for pages to send */
>  struct PageSearchStatus {
> @@ -244,6 +247,7 @@ static struct BitmapRcu {
>      struct rcu_head rcu;
>      /* Main migration bitmap */
>      unsigned long *bmap;
> +    unsigned long *free_page_bmap;
>      /* bitmap of pages that haven't been sent even once
>       * only maintained and used in postcopy at the moment
>       * where it's used to send the dirtymap at the start
> @@ -639,6 +643,7 @@ static void migration_bitmap_sync(void)
>      rcu_read_unlock();
>      qemu_mutex_unlock(&migration_bitmap_mutex);
>  
> +    ignore_freepage_rsp = true;
>      trace_migration_bitmap_sync_end(migration_dirty_pages
>                                      - num_dirty_pages_init);
>      num_dirty_pages_period += migration_dirty_pages - num_dirty_pages_init;
> @@ -1417,6 +1422,7 @@ static void migration_bitmap_free(struct BitmapRcu *bmap)
>  {
>      g_free(bmap->bmap);
>      g_free(bmap->unsentmap);
> +    g_free(bmap->free_page_bmap);
>      g_free(bmap);
>  }
>  
> @@ -1487,6 +1493,85 @@ void migration_bitmap_extend(ram_addr_t old, ram_addr_t new)
>      }
>  }
>  
> +static void filter_out_guest_free_page(unsigned long *free_page_bmap,
> +                                       long nbits)
> +{
> +    long i, page_count = 0, len;
> +    unsigned long *bitmap;
> +
> +    tighten_guest_free_page_bmap(free_page_bmap);
> +    qemu_mutex_lock(&migration_bitmap_mutex);
> +    bitmap = atomic_rcu_read(&migration_bitmap_rcu)->bmap;
> +    slow_bitmap_complement(bitmap, free_page_bmap, nbits);
> +
> +    len = (last_ram_offset() >> TARGET_PAGE_BITS) / BITS_PER_LONG;
> +    for (i = 0; i < len; i++) {
> +        page_count += hweight_long(bitmap[i]);
> +    }
> +
> +    migration_dirty_pages = page_count;
> +    qemu_mutex_unlock(&migration_bitmap_mutex);
> +}
> +
> +static void ram_request_free_page(unsigned long *bmap, unsigned long max_pfn)
> +{
> +    BalloonReqStatus status;
> +
> +    status = balloon_get_free_pages(bmap, max_pfn);
> +    switch (status) {
> +    case REQ_DONE:
> +        ignore_freepage_rsp = false;
> +        break;
> +    case REQ_ERROR:
> > +        error_report("Error happened when requesting free pages");
> +        break;
> +    default:
> +        error_report("unexpected response status: %d", status);
> +        break;
> +    }
> +}
> +
> +static void ram_handle_free_page(void)
> +{
> +    unsigned long nbits;
> +    RAMBlock *pc_ram_block;
> +    BalloonReqStatus status;
> +
> +    status = balloon_get_free_pages(migration_bitmap_rcu->free_page_bmap,
> +                                    get_guest_max_pfn());
> +    switch (status) {
> +    case REQ_DONE:
> +        rcu_read_lock();
> +        pc_ram_block = QLIST_FIRST_RCU(&ram_list.blocks);
> +        nbits = pc_ram_block->used_length >> TARGET_PAGE_BITS;
> +        filter_out_guest_free_page(migration_bitmap_rcu->free_page_bmap, nbits);
> +        rcu_read_unlock();
> +
> +        qemu_mutex_lock_iothread();
> +        migration_bitmap_sync();
> +        qemu_mutex_unlock_iothread();
> +        /*
> > +         * bulk stage (migration_bitmap_find_and_reset_dirty) assumes that
> > +         * every page is dirty; that's no longer true at this point.
> +         */
> +        ram_bulk_stage = false;
> +        last_seen_block = NULL;
> +        last_sent_block = NULL;
> +        last_offset = 0;
> +        break;
> +    case REQ_ERROR:
> +        ignore_freepage_rsp = true;
> +        error_report("failed to get free page");
> +        break;
> +    case REQ_INVALID_PARAM:
> +        ignore_freepage_rsp = true;
> +        error_report("buffer overflow");
> +        break;
> +    default:
> +        break;
> +    }
> +}
> +
>  /*
>   * 'expected' is the value you expect the bitmap mostly to be full
>   * of; it won't bother printing lines that are all this value.
> @@ -1950,6 +2035,11 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>      qemu_mutex_unlock_ramlist();
>      qemu_mutex_unlock_iothread();
>  
> +    if (balloon_free_pages_support() && !migrate_postcopy_ram()) {
> +        unsigned long max_pfn = get_guest_max_pfn();
> +        migration_bitmap_rcu->free_page_bmap = bitmap_new(max_pfn);
> +        ram_request_free_page(migration_bitmap_rcu->free_page_bmap, max_pfn);
> +    }
>      qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
>  
>      QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> @@ -1990,6 +2080,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>      while ((ret = qemu_file_rate_limit(f)) == 0) {
>          int pages;
>  
> +        if (!ignore_freepage_rsp) {
> +            ram_handle_free_page();
> +        }
>          pages = ram_find_and_save_block(f, false, &bytes_transferred);
> >          /* no more pages to send */
>          if (pages == 0) {
> -- 
> 1.9.1

* RE: [QEMU 1/7] balloon: speed up inflating & deflating process
  2016-06-19  4:12     ` [Qemu-devel] " Michael S. Tsirkin
@ 2016-06-20  1:37       ` Li, Liang Z
  -1 siblings, 0 replies; 60+ messages in thread
From: Li, Liang Z @ 2016-06-20  1:37 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu-devel, kvm, lcapitulino, pbonzini, quintela, amit.shah, dgilbert

> >
> >          virtqueue_push(vq, elem, offset);
> > @@ -374,6 +489,7 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
> >      VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
> >      f |= dev->host_features;
> >      virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ);
> > +    virtio_add_feature(&f, VIRTIO_BALLOON_F_PAGE_BITMAP);
> >      return f;
> >  }
> >
> 
> Pls add features to virtio_balloon_properties.
> You also need to handle compatibility by disabling for old machine types.
> 

I forgot that; I will add it in the next version.
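
What "disabling for old machine types" amounts to can be sketched like this; the struct and the boolean are mock stand-ins for the compat property that QEMU's `DEFINE_PROP_BIT` machinery would provide, not actual QEMU code:

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_BALLOON_F_PAGE_BITMAP 3   /* feature bit from the patch */

/*
 * Mock device: the bool stands in for a compat property; old machine
 * types would set it to false so the guest-visible feature set stays
 * stable across migration between QEMU versions.
 */
struct balloon_dev {
    bool page_bitmap_enabled;
    uint64_t host_features;
};

/* Offer the new feature only when the property allows it. */
static uint64_t get_features(const struct balloon_dev *dev, uint64_t f)
{
    f |= dev->host_features;
    if (dev->page_bitmap_enabled) {
        f |= 1ULL << VIRTIO_BALLOON_F_PAGE_BITMAP;
    }
    return f;
}
```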

> > --- a/include/standard-headers/linux/virtio_balloon.h
> > +++ b/include/standard-headers/linux/virtio_balloon.h
> > @@ -34,6 +34,7 @@
> >  #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before reclaiming pages */
> >  #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
> >  #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
> > +#define VIRTIO_BALLOON_F_PAGE_BITMAP  3 /* Use page bitmap to send page info */
> >
> >  /* Size of a PFN in the balloon interface. */
> >  #define VIRTIO_BALLOON_PFN_SHIFT 12
> 
> We want to keep this in sync with Linux.
> Let's get a minimal patch to extend this header merged in linux, then update
> this one.

OK. Can this be done independently of the virtio-balloon spec? As I understand it,
it will not get merged before the spec is finalized?

Thanks!
Liang

* Re: [QEMU 2/7] virtio-balloon: add drop cache support
  2016-06-19  4:14     ` [Qemu-devel] " Michael S. Tsirkin
@ 2016-06-20  2:09       ` Li, Liang Z
  -1 siblings, 0 replies; 60+ messages in thread
From: Li, Liang Z @ 2016-06-20  2:09 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, quintela, qemu-devel, lcapitulino, amit.shah, pbonzini, dgilbert

> On Mon, Jun 13, 2016 at 06:16:44PM +0800, Liang Li wrote:
> > virtio-balloon can make use of the amount of free memory to determine
> > the amount of memory to be filled in the balloon, but the amount of
> > free memory will be affected by the page cache, which can be reclaimed.
> > Dropping the cache before getting the amount of free memory is very
> > helpful to reflect the exact amount of memory that can be reclaimed.
> 
> Can't we just extend stats to report "reclaimable" memory?
> 

Yes, I noticed that VIRTIO_BALLOON_S_AVAIL is for this purpose.

I summarized the possible solutions from others:
a. Drop the cache via the guest agent instead of an explicit QMP command. (Paolo)
b. Use a parameter as a hint to tell the guest that live migration is about to happen, and let the guest do what it can to make the host's life easier. (David)

What's your opinion on these two solutions?
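
For illustration, if the guest reports available memory (free plus reclaimable cache, which is what a stat like VIRTIO_BALLOON_S_AVAIL conveys), sizing the balloon reduces to simple arithmetic and no cache drop is needed; all names and numbers below are hypothetical:

```c
#include <stdint.h>

/*
 * Illustrative arithmetic only.  "avail" already counts reclaimable
 * page cache, so the host can pick an inflation target directly,
 * keeping some headroom so the guest is not squeezed dry.
 */
static uint64_t balloon_target_pages(uint64_t avail_pages,
                                     uint64_t headroom_pages)
{
    if (avail_pages <= headroom_pages) {
        return 0;                         /* guest has nothing to spare */
    }
    return avail_pages - headroom_pages;  /* pages the balloon may claim */
}
```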

Thanks!
Liang

* RE: [QEMU 4/7] balloon: get free page info from guest
  2016-06-19  4:24     ` [Qemu-devel] " Michael S. Tsirkin
@ 2016-06-20  2:48       ` Li, Liang Z
  -1 siblings, 0 replies; 60+ messages in thread
From: Li, Liang Z @ 2016-06-20  2:48 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu-devel, kvm, lcapitulino, pbonzini, quintela, amit.shah, dgilbert

> On Mon, Jun 13, 2016 at 06:16:46PM +0800, Liang Li wrote:
> > Add a new feature to get the free page information from guest, the
> > free page information is saved in a bitmap. Please note that 'free
> > page' only means these pages are free before the request, some of the
> > pages will become no free during the process of sending the free page
> > bitmap to QEMU.
> >
> > Signed-off-by: Liang Li <liang.z.li@intel.com>
> 
> 
> I don't much like this request interface.
> The meaning of free page is rather fuzzy too - so at what point are they free?
> 
> 
> My suggestion would be:
> 	report free page request ID to guest
> 	include request ID when guest sends free page list
> 
> the definition is then:
> 	page was free sometime after host set this value of request
> 	ID and before it received response with the same ID

That's better. I will change it in the next version.
There is another issue, similar to the one we solved to speed up the inflating/deflating process:
should we use a large page bitmap or a small one? I used a large one in this patch.

If we choose a small page bitmap, we have to traverse the free page list many times,
and the meaning of 'free page' becomes even fuzzier.

But if we use a large bitmap, people may ask: why a small one there and a large one here?
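
Whichever size is chosen, the host-side filtering step itself is cheap: stripped of QEMU's helpers, the patch's filter_out_guest_free_page() reduces to an and-not plus a population count. A standalone sketch (plain-C stand-ins, not the patch's exact code):

```c
/*
 * Clear every migration-bitmap bit the guest marked free, then recount
 * the dirty pages.  The and-not loop stands in for the patch's
 * slow_bitmap_complement(), __builtin_popcountl for hweight_long();
 * len is in longs, as in the patch.
 */
static long filter_free_pages(unsigned long *dirty,
                              const unsigned long *free_bmap, long len)
{
    long i, dirty_pages = 0;

    for (i = 0; i < len; i++) {
        dirty[i] &= ~free_bmap[i];                    /* free => not dirty */
        dirty_pages += __builtin_popcountl(dirty[i]); /* count remaining   */
    }
    return dirty_pages;
}
```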

Thanks!

Liang 




* Re: [QEMU 7/7] migration: skip free pages during live migration
  2016-06-19  4:43     ` [Qemu-devel] " Michael S. Tsirkin
@ 2016-06-20  2:52       ` Li, Liang Z
  -1 siblings, 0 replies; 60+ messages in thread
From: Li, Liang Z @ 2016-06-20  2:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, quintela, qemu-devel, lcapitulino, amit.shah, pbonzini, dgilbert

> On Mon, Jun 13, 2016 at 06:16:49PM +0800, Liang Li wrote:
> > After sending out the request for free pages, live migration process
> > will start without waiting for the free page bitmap is ready. If the
> > free page bitmap is not ready when doing the 1st
> > migration_bitmap_sync() after ram_save_setup(), the free page bitmap
> > will be ignored, this means the free pages will not be filtered out in
> > this case.
> > The current implementation can not work with post copy, if post copy
> > is enabled, we simply ignore the free pages. Will make it work later.
> >
> > Signed-off-by: Liang Li <liang.z.li@intel.com>
> 
> Tying migration to balloon in this way seems rather ugly.
> So with request ID, the logic would basically be
> 
> 	- add memory listener with high priority
> 	- before sync bitmap, increment request id
> 	- when we get response, if it has latest request id,
> 	  clear qemu migration bitmap
> 	  otherwise, ignore

Using the request ID is good.
Could you elaborate on the meaning of 'add memory listener with high priority'?

Thanks!
Liang

* RE: [QEMU 6/7] kvm: Add two new arch specific functions
  2016-06-19  4:27     ` [Qemu-devel] " Michael S. Tsirkin
@ 2016-06-20  3:16       ` Li, Liang Z
  -1 siblings, 0 replies; 60+ messages in thread
From: Li, Liang Z @ 2016-06-20  3:16 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu-devel, kvm, lcapitulino, pbonzini, quintela, amit.shah, dgilbert

> > --- a/target-arm/kvm.c
> > +++ b/target-arm/kvm.c
> > @@ -627,3 +627,17 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)  {
> >      return (data - 32) & 0xffff;
> >  }
> > +
> > +unsigned long get_guest_max_pfn(void) {
> > +    /* To be done */
> > +
> > +    return 0;
> > +}
> > +
> > +unsigned long *tighten_guest_free_page_bmap(unsigned long *bmap) {
> > +    /* To be done */
> > +
> > +    return bmap;
> > +}
> > diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> > index abf50e6..0b394cb 100644
> > --- a/target-i386/kvm.c
> > +++ b/target-i386/kvm.c
> > @@ -3327,3 +3327,38 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)  {
> >      abort();
> >  }
> > +
> > +unsigned long get_guest_max_pfn(void) {
> > +    PCMachineState *pcms = PC_MACHINE(current_machine);
> > +    ram_addr_t above_4g_mem = pcms->above_4g_mem_size;
> > +    unsigned long max_pfn;
> > +
> > +    if (above_4g_mem) {
> > +        max_pfn = ((1ULL << 32) + above_4g_mem) >> TARGET_PAGE_BITS;
> > +    } else {
> > +        max_pfn = pcms->below_4g_mem_size >> TARGET_PAGE_BITS;
> > +    }
> > +
> > +    return max_pfn;
> > +}
> 
> Why is this in kvm?

I can't find a better place. Do you have any suggestions?
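
For reference, the PFN computation the quoted x86 code performs, reduced to standalone arithmetic (a sketch; the page size and the below/above-4G split are taken from the quoted code, sizes are in bytes):

```c
#include <stdint.h>

#define TARGET_PAGE_BITS 12   /* 4 KiB pages, as on x86 */

/*
 * RAM above 4 GiB is mapped starting at the 4 GiB boundary, so when
 * such RAM exists the highest PFN must also span the PCI hole between
 * the end of below-4G RAM and 4 GiB -- hence (4 GiB + above_4g) rather
 * than (below_4g + above_4g).
 */
static uint64_t guest_max_pfn(uint64_t below_4g_mem, uint64_t above_4g_mem)
{
    if (above_4g_mem) {
        return ((1ULL << 32) + above_4g_mem) >> TARGET_PAGE_BITS;
    }
    return below_4g_mem >> TARGET_PAGE_BITS;
}
```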

> > +        pos = (above_4g_mem + below_4g_mem) >> TARGET_PAGE_BITS;
> > +        len = ((1ULL << 32) - below_4g_mem) >> TARGET_PAGE_BITS;
> > +        bitmap_clear(bmap, pos, len);
> > +    }
> > +
> > +    return bmap;
> > +}
> 
> what does this do? External APIs should have documentation.

I will add the documentation. Thanks!

Liang

end of thread, other threads:[~2016-06-20  3:16 UTC | newest]

Thread overview: 60+ messages
2016-06-13 10:16 [QEMU 0/7] Fast balloon and fast live migration Liang Li
2016-06-13 10:16 ` [Qemu-devel] " Liang Li
2016-06-13 10:16 ` [QEMU 1/7] balloon: speed up inflating & deflating process Liang Li
2016-06-13 10:16   ` [Qemu-devel] " Liang Li
2016-06-14 11:37   ` Thomas Huth
2016-06-14 11:37     ` [Qemu-devel] " Thomas Huth
2016-06-14 14:22     ` Li, Liang Z
2016-06-14 14:22       ` [Qemu-devel] " Li, Liang Z
2016-06-14 14:41       ` Li, Liang Z
2016-06-14 14:41         ` [Qemu-devel] " Li, Liang Z
2016-06-14 15:33         ` Thomas Huth
2016-06-14 15:33           ` [Qemu-devel] " Thomas Huth
2016-06-17  0:54           ` Li, Liang Z
2016-06-17  0:54             ` [Qemu-devel] " Li, Liang Z
2016-06-19  4:12   ` Michael S. Tsirkin
2016-06-19  4:12     ` [Qemu-devel] " Michael S. Tsirkin
2016-06-20  1:37     ` Li, Liang Z
2016-06-20  1:37       ` [Qemu-devel] " Li, Liang Z
2016-06-13 10:16 ` [QEMU 2/7] virtio-balloon: add drop cache support Liang Li
2016-06-13 10:16   ` [Qemu-devel] " Liang Li
2016-06-19  4:14   ` Michael S. Tsirkin
2016-06-19  4:14     ` [Qemu-devel] " Michael S. Tsirkin
2016-06-20  2:09     ` Li, Liang Z
2016-06-20  2:09       ` [Qemu-devel] " Li, Liang Z
2016-06-13 10:16 ` [QEMU 3/7] Add the hmp and qmp interface for dropping cache Liang Li
2016-06-13 10:16   ` [Qemu-devel] " Liang Li
2016-06-13 10:50   ` Daniel P. Berrange
2016-06-13 11:06     ` Daniel P. Berrange
2016-06-13 14:12       ` Li, Liang Z
2016-06-13 14:12         ` Li, Liang Z
2016-06-13 11:41     ` Paolo Bonzini
2016-06-13 14:14       ` Li, Liang Z
2016-06-13 14:14         ` Li, Liang Z
2016-06-13 13:50     ` Li, Liang Z
2016-06-13 13:50       ` Li, Liang Z
2016-06-13 15:09       ` Dr. David Alan Gilbert
2016-06-14  1:15         ` Li, Liang Z
2016-06-14  1:15           ` Li, Liang Z
2016-06-17  1:35         ` Li, Liang Z
2016-06-17  1:35           ` Li, Liang Z
2016-06-13 10:16 ` [QEMU 4/7] balloon: get free page info from guest Liang Li
2016-06-13 10:16   ` [Qemu-devel] " Liang Li
2016-06-19  4:24   ` Michael S. Tsirkin
2016-06-19  4:24     ` [Qemu-devel] " Michael S. Tsirkin
2016-06-20  2:48     ` Li, Liang Z
2016-06-20  2:48       ` [Qemu-devel] " Li, Liang Z
2016-06-13 10:16 ` [QEMU 5/7] bitmap: Add a new bitmap_move function Liang Li
2016-06-13 10:16   ` [Qemu-devel] " Liang Li
2016-06-13 10:16 ` [QEMU 6/7] kvm: Add two new arch specific functions Liang Li
2016-06-13 10:16   ` [Qemu-devel] " Liang Li
2016-06-19  4:27   ` Michael S. Tsirkin
2016-06-19  4:27     ` [Qemu-devel] " Michael S. Tsirkin
2016-06-20  3:16     ` Li, Liang Z
2016-06-20  3:16       ` [Qemu-devel] " Li, Liang Z
2016-06-13 10:16 ` [QEMU 7/7] migration: skip free pages during live migration Liang Li
2016-06-13 10:16   ` [Qemu-devel] " Liang Li
2016-06-19  4:43   ` Michael S. Tsirkin
2016-06-19  4:43     ` [Qemu-devel] " Michael S. Tsirkin
2016-06-20  2:52     ` Li, Liang Z
2016-06-20  2:52       ` [Qemu-devel] " Li, Liang Z
