qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v1 0/3] virtio-balloon: PartialBalloonedPage rework
@ 2019-07-19 16:01 David Hildenbrand
  2019-07-19 16:01 ` [Qemu-devel] [PATCH v1 1/3] virtio-balloon: simplify deflate with pbp David Hildenbrand
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: David Hildenbrand @ 2019-07-19 16:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: Igor Mammedov, David Gibson, Michael S . Tsirkin,
	Stefan Hajnoczi, David Hildenbrand

Michael pointed out that stroing and using the address of a RAMBlock
might not be safe. So let's rework the pbp handling, cleaning up the
code.

Did a sanity test with hugepage backing on x86.64.

We might want to have this in 4.1. I'll let Michael decide.

Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>

David Hildenbrand (3):
  virtio-balloon: simplify deflate with pbp
  virtio-balloon: Better names for offset variables in inflate/deflate
    code
  virtio-balloon: Rework pbp tracking data

 hw/virtio/virtio-balloon.c         | 90 ++++++++++++++----------------
 include/hw/virtio/virtio-balloon.h | 15 ++++-
 2 files changed, 53 insertions(+), 52 deletions(-)

-- 
2.21.0



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Qemu-devel] [PATCH v1 1/3] virtio-balloon: simplify deflate with pbp
  2019-07-19 16:01 [Qemu-devel] [PATCH v1 0/3] virtio-balloon: PartialBalloonedPage rework David Hildenbrand
@ 2019-07-19 16:01 ` David Hildenbrand
  2019-07-22  2:27   ` David Gibson
  2019-07-19 16:01 ` [Qemu-devel] [PATCH v1 2/3] virtio-balloon: Better names for offset variables in inflate/deflate code David Hildenbrand
  2019-07-19 16:01 ` [Qemu-devel] [PATCH v1 3/3] virtio-balloon: Rework pbp tracking data David Hildenbrand
  2 siblings, 1 reply; 9+ messages in thread
From: David Hildenbrand @ 2019-07-19 16:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: Igor Mammedov, David Gibson, Michael S . Tsirkin,
	Stefan Hajnoczi, David Hildenbrand

Let's simplify this - the case we are optimizing for is very hard to
trigger and not worth the effort. If we're switching from inflation to
deflation, let's reset the pbp.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-balloon.c | 24 ++++--------------------
 1 file changed, 4 insertions(+), 20 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 9de3c030bf..287d5d4c97 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -121,7 +121,7 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
     void *addr = memory_region_get_ram_ptr(mr) + offset;
     RAMBlock *rb;
     size_t rb_page_size;
-    ram_addr_t ram_offset, host_page_base;
+    ram_addr_t ram_offset;
     void *host_addr;
     int ret;
 
@@ -129,26 +129,10 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
      * host address? */
     rb = qemu_ram_block_from_host(addr, false, &ram_offset);
     rb_page_size = qemu_ram_pagesize(rb);
-    host_page_base = ram_offset & ~(rb_page_size - 1);
-
-    if (balloon->pbp
-        && rb == balloon->pbp->rb
-        && host_page_base == balloon->pbp->base) {
-        int subpages = rb_page_size / BALLOON_PAGE_SIZE;
 
-        /*
-         * This means the guest has asked to discard some of the 4kiB
-         * subpages of a host page, but then changed its mind and
-         * asked to keep them after all.  It's exceedingly unlikely
-         * for a guest to do this in practice, but handle it anyway,
-         * since getting it wrong could mean discarding memory the
-         * guest is still using. */
-        clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
-                  balloon->pbp->bitmap);
-
-        if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
-            virtio_balloon_reset_pbp(balloon);
-        }
+    if (balloon->pbp) {
+        /* Let's play safe and always reset the pbp on deflation requests. */
+        virtio_balloon_reset_pbp(balloon);
     }
 
     host_addr = (void *)((uintptr_t)addr & ~(rb_page_size - 1));
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [Qemu-devel] [PATCH v1 2/3] virtio-balloon: Better names for offset variables in inflate/deflate code
  2019-07-19 16:01 [Qemu-devel] [PATCH v1 0/3] virtio-balloon: PartialBalloonedPage rework David Hildenbrand
  2019-07-19 16:01 ` [Qemu-devel] [PATCH v1 1/3] virtio-balloon: simplify deflate with pbp David Hildenbrand
@ 2019-07-19 16:01 ` David Hildenbrand
  2019-07-22  2:29   ` David Gibson
  2019-07-19 16:01 ` [Qemu-devel] [PATCH v1 3/3] virtio-balloon: Rework pbp tracking data David Hildenbrand
  2 siblings, 1 reply; 9+ messages in thread
From: David Hildenbrand @ 2019-07-19 16:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: Igor Mammedov, David Gibson, Michael S . Tsirkin,
	Stefan Hajnoczi, David Hildenbrand

"host_page_base" is really confusing, let's make this clearer, also
rename the other offsets to indicate to which base they apply.

offset -> mr_offset
ram_offset -> rb_offset
host_page_base -> rb_aligned_offset

While at it, use QEMU_ALIGN_DOWN() instead of a handcrafted computation
and move the computation to the place where it is needed.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-balloon.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 287d5d4c97..29cee344b2 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -47,24 +47,23 @@ static void virtio_balloon_reset_pbp(VirtIOBalloon *balloon)
 }
 
 static void balloon_inflate_page(VirtIOBalloon *balloon,
-                                 MemoryRegion *mr, hwaddr offset)
+                                 MemoryRegion *mr, hwaddr mr_offset)
 {
-    void *addr = memory_region_get_ram_ptr(mr) + offset;
+    void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
+    ram_addr_t rb_offset, rb_aligned_offset;
     RAMBlock *rb;
     size_t rb_page_size;
     int subpages;
-    ram_addr_t ram_offset, host_page_base;
 
     /* XXX is there a better way to get to the RAMBlock than via a
      * host address? */
-    rb = qemu_ram_block_from_host(addr, false, &ram_offset);
+    rb = qemu_ram_block_from_host(addr, false, &rb_offset);
     rb_page_size = qemu_ram_pagesize(rb);
-    host_page_base = ram_offset & ~(rb_page_size - 1);
 
     if (rb_page_size == BALLOON_PAGE_SIZE) {
         /* Easy case */
 
-        ram_block_discard_range(rb, ram_offset, rb_page_size);
+        ram_block_discard_range(rb, rb_offset, rb_page_size);
         /* We ignore errors from ram_block_discard_range(), because it
          * has already reported them, and failing to discard a balloon
          * page is not fatal */
@@ -80,11 +79,12 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
     warn_report_once(
 "Balloon used with backing page size > 4kiB, this may not be reliable");
 
+    rb_aligned_offset = QEMU_ALIGN_DOWN(rb_offset, rb_page_size);
     subpages = rb_page_size / BALLOON_PAGE_SIZE;
 
     if (balloon->pbp
         && (rb != balloon->pbp->rb
-            || host_page_base != balloon->pbp->base)) {
+            || rb_aligned_offset != balloon->pbp->base)) {
         /* We've partially ballooned part of a host page, but now
          * we're trying to balloon part of a different one.  Too hard,
          * give up on the old partial page */
@@ -96,10 +96,10 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
         size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
         balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
         balloon->pbp->rb = rb;
-        balloon->pbp->base = host_page_base;
+        balloon->pbp->base = rb_aligned_offset;
     }
 
-    set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
+    set_bit((rb_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
             balloon->pbp->bitmap);
 
     if (bitmap_full(balloon->pbp->bitmap, subpages)) {
@@ -116,18 +116,18 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
 }
 
 static void balloon_deflate_page(VirtIOBalloon *balloon,
-                                 MemoryRegion *mr, hwaddr offset)
+                                 MemoryRegion *mr, hwaddr mr_offset)
 {
-    void *addr = memory_region_get_ram_ptr(mr) + offset;
+    void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
+    ram_addr_t rb_offset;
     RAMBlock *rb;
     size_t rb_page_size;
-    ram_addr_t ram_offset;
     void *host_addr;
     int ret;
 
     /* XXX is there a better way to get to the RAMBlock than via a
      * host address? */
-    rb = qemu_ram_block_from_host(addr, false, &ram_offset);
+    rb = qemu_ram_block_from_host(addr, false, &rb_offset);
     rb_page_size = qemu_ram_pagesize(rb);
 
     if (balloon->pbp) {
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [Qemu-devel] [PATCH v1 3/3] virtio-balloon: Rework pbp tracking data
  2019-07-19 16:01 [Qemu-devel] [PATCH v1 0/3] virtio-balloon: PartialBalloonedPage rework David Hildenbrand
  2019-07-19 16:01 ` [Qemu-devel] [PATCH v1 1/3] virtio-balloon: simplify deflate with pbp David Hildenbrand
  2019-07-19 16:01 ` [Qemu-devel] [PATCH v1 2/3] virtio-balloon: Better names for offset variables in inflate/deflate code David Hildenbrand
@ 2019-07-19 16:01 ` David Hildenbrand
  2019-07-22  3:04   ` David Gibson
  2 siblings, 1 reply; 9+ messages in thread
From: David Hildenbrand @ 2019-07-19 16:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: Igor Mammedov, David Gibson, Michael S . Tsirkin,
	Stefan Hajnoczi, David Hildenbrand

Using the address of a RAMBlock to test for a matching pbp is not really
safe. Instead, let's use the guest physical address of the base page
along with the page size (via the number of subpages).

While at it, move "struct PartiallyBalloonedPage" to virtio-balloon.h
now (previously most probably avoided to te the ramblock), renaming it.

Also, let's only dynamically allocating the bitmap. This makes the code
easier to read and maintain - we can reuse bitmap_new().

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-balloon.c         | 52 +++++++++++++++++-------------
 include/hw/virtio/virtio-balloon.h | 15 +++++++--
 2 files changed, 42 insertions(+), 25 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 29cee344b2..8e5f9459c2 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -34,23 +34,31 @@
 
 #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
 
-struct PartiallyBalloonedPage {
-    RAMBlock *rb;
-    ram_addr_t base;
-    unsigned long bitmap[];
-};
-
 static void virtio_balloon_reset_pbp(VirtIOBalloon *balloon)
 {
-    g_free(balloon->pbp);
-    balloon->pbp = NULL;
+    balloon->pbp.base_gpa = 0;
+    balloon->pbp.subpages = 0;
+    g_free(balloon->pbp.bitmap);
+    balloon->pbp.bitmap = NULL;
+}
+
+static bool virtio_balloon_pbp_valid(VirtIOBalloon *balloon)
+{
+    return balloon->pbp.bitmap;
+}
+
+static bool virtio_balloon_pbp_matches(VirtIOBalloon *balloon,
+                                       ram_addr_t base_gpa, long subpages)
+{
+    return balloon->pbp.subpages == subpages &&
+           balloon->pbp.base_gpa == base_gpa;
 }
 
 static void balloon_inflate_page(VirtIOBalloon *balloon,
                                  MemoryRegion *mr, hwaddr mr_offset)
 {
     void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
-    ram_addr_t rb_offset, rb_aligned_offset;
+    ram_addr_t rb_offset, rb_aligned_offset, base_gpa;
     RAMBlock *rb;
     size_t rb_page_size;
     int subpages;
@@ -81,32 +89,32 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
 
     rb_aligned_offset = QEMU_ALIGN_DOWN(rb_offset, rb_page_size);
     subpages = rb_page_size / BALLOON_PAGE_SIZE;
+    base_gpa = memory_region_get_ram_addr(mr) + mr_offset -
+               (rb_offset - rb_aligned_offset);
 
-    if (balloon->pbp
-        && (rb != balloon->pbp->rb
-            || rb_aligned_offset != balloon->pbp->base)) {
+    if (virtio_balloon_pbp_valid(balloon) &&
+        !virtio_balloon_pbp_matches(balloon, base_gpa, subpages)) {
         /* We've partially ballooned part of a host page, but now
          * we're trying to balloon part of a different one.  Too hard,
          * give up on the old partial page */
         virtio_balloon_reset_pbp(balloon);
     }
 
-    if (!balloon->pbp) {
+    if (!virtio_balloon_pbp_valid(balloon)) {
         /* Starting on a new host page */
-        size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
-        balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
-        balloon->pbp->rb = rb;
-        balloon->pbp->base = rb_aligned_offset;
+        balloon->pbp.base_gpa = base_gpa;
+        balloon->pbp.subpages = subpages;
+        balloon->pbp.bitmap = bitmap_new(subpages);
     }
 
-    set_bit((rb_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
-            balloon->pbp->bitmap);
+    set_bit((rb_offset - rb_aligned_offset) / BALLOON_PAGE_SIZE,
+            balloon->pbp.bitmap);
 
-    if (bitmap_full(balloon->pbp->bitmap, subpages)) {
+    if (bitmap_full(balloon->pbp.bitmap, subpages)) {
         /* We've accumulated a full host page, we can actually discard
          * it now */
 
-        ram_block_discard_range(rb, balloon->pbp->base, rb_page_size);
+        ram_block_discard_range(rb, rb_aligned_offset, rb_page_size);
         /* We ignore errors from ram_block_discard_range(), because it
          * has already reported them, and failing to discard a balloon
          * page is not fatal */
@@ -130,7 +138,7 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
     rb = qemu_ram_block_from_host(addr, false, &rb_offset);
     rb_page_size = qemu_ram_pagesize(rb);
 
-    if (balloon->pbp) {
+    if (virtio_balloon_pbp_valid(balloon)) {
         /* Let's play safe and always reset the pbp on deflation requests. */
         virtio_balloon_reset_pbp(balloon);
     }
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
index 5a99293a45..0190d78d71 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -33,8 +33,6 @@ typedef struct virtio_balloon_stat_modern {
        uint64_t val;
 } VirtIOBalloonStatModern;
 
-typedef struct PartiallyBalloonedPage PartiallyBalloonedPage;
-
 enum virtio_balloon_free_page_report_status {
     FREE_PAGE_REPORT_S_STOP = 0,
     FREE_PAGE_REPORT_S_REQUESTED = 1,
@@ -42,6 +40,12 @@ enum virtio_balloon_free_page_report_status {
     FREE_PAGE_REPORT_S_DONE = 3,
 };
 
+typedef struct VirtIOPartiallyBalloonedPage {
+    ram_addr_t base_gpa;
+    long subpages;
+    unsigned long *bitmap;
+} VirtIOPartiallyBalloonedPage;
+
 typedef struct VirtIOBalloon {
     VirtIODevice parent_obj;
     VirtQueue *ivq, *dvq, *svq, *free_page_vq;
@@ -70,7 +74,12 @@ typedef struct VirtIOBalloon {
     int64_t stats_last_update;
     int64_t stats_poll_interval;
     uint32_t host_features;
-    PartiallyBalloonedPage *pbp;
+
+    /*
+     * Information about a partially ballooned page - does not have to be
+     * migrated and has to be reset on every device reset.
+     */
+    VirtIOPartiallyBalloonedPage pbp;
 
     bool qemu_4_0_config_size;
 } VirtIOBalloon;
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH v1 1/3] virtio-balloon: simplify deflate with pbp
  2019-07-19 16:01 ` [Qemu-devel] [PATCH v1 1/3] virtio-balloon: simplify deflate with pbp David Hildenbrand
@ 2019-07-22  2:27   ` David Gibson
  0 siblings, 0 replies; 9+ messages in thread
From: David Gibson @ 2019-07-22  2:27 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Igor Mammedov, qemu-devel, Stefan Hajnoczi, Michael S . Tsirkin

[-- Attachment #1: Type: text/plain, Size: 2568 bytes --]

On Fri, Jul 19, 2019 at 06:01:18PM +0200, David Hildenbrand wrote:
> Let's simplify this - the case we are optimizing for is very hard to
> trigger and not worth the effort. If we're switching from inflation to
> deflation, let's reset the pbp.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  hw/virtio/virtio-balloon.c | 24 ++++--------------------
>  1 file changed, 4 insertions(+), 20 deletions(-)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 9de3c030bf..287d5d4c97 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -121,7 +121,7 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
>      void *addr = memory_region_get_ram_ptr(mr) + offset;
>      RAMBlock *rb;
>      size_t rb_page_size;
> -    ram_addr_t ram_offset, host_page_base;
> +    ram_addr_t ram_offset;
>      void *host_addr;
>      int ret;
>  
> @@ -129,26 +129,10 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
>       * host address? */
>      rb = qemu_ram_block_from_host(addr, false, &ram_offset);
>      rb_page_size = qemu_ram_pagesize(rb);
> -    host_page_base = ram_offset & ~(rb_page_size - 1);
> -
> -    if (balloon->pbp
> -        && rb == balloon->pbp->rb
> -        && host_page_base == balloon->pbp->base) {
> -        int subpages = rb_page_size / BALLOON_PAGE_SIZE;
>  
> -        /*
> -         * This means the guest has asked to discard some of the 4kiB
> -         * subpages of a host page, but then changed its mind and
> -         * asked to keep them after all.  It's exceedingly unlikely
> -         * for a guest to do this in practice, but handle it anyway,
> -         * since getting it wrong could mean discarding memory the
> -         * guest is still using. */
> -        clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> -                  balloon->pbp->bitmap);
> -
> -        if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
> -            virtio_balloon_reset_pbp(balloon);
> -        }
> +    if (balloon->pbp) {
> +        /* Let's play safe and always reset the pbp on deflation requests. */
> +        virtio_balloon_reset_pbp(balloon);
>      }
>  
>      host_addr = (void *)((uintptr_t)addr & ~(rb_page_size - 1));

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH v1 2/3] virtio-balloon: Better names for offset variables in inflate/deflate code
  2019-07-19 16:01 ` [Qemu-devel] [PATCH v1 2/3] virtio-balloon: Better names for offset variables in inflate/deflate code David Hildenbrand
@ 2019-07-22  2:29   ` David Gibson
  0 siblings, 0 replies; 9+ messages in thread
From: David Gibson @ 2019-07-22  2:29 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Igor Mammedov, qemu-devel, Stefan Hajnoczi, Michael S . Tsirkin

[-- Attachment #1: Type: text/plain, Size: 4628 bytes --]

On Fri, Jul 19, 2019 at 06:01:19PM +0200, David Hildenbrand wrote:
> "host_page_base" is really confusing, let's make this clearer, also
> rename the other offsets to indicate to which base they apply.
> 
> offset -> mr_offset
> ram_offset -> rb_offset
> host_page_base -> rb_aligned_offset
> 
> While at it, use QEMU_ALIGN_DOWN() instead of a handcrafted computation
> and move the computation to the place where it is needed.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  hw/virtio/virtio-balloon.c | 26 +++++++++++++-------------
>  1 file changed, 13 insertions(+), 13 deletions(-)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 287d5d4c97..29cee344b2 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -47,24 +47,23 @@ static void virtio_balloon_reset_pbp(VirtIOBalloon *balloon)
>  }
>  
>  static void balloon_inflate_page(VirtIOBalloon *balloon,
> -                                 MemoryRegion *mr, hwaddr offset)
> +                                 MemoryRegion *mr, hwaddr mr_offset)
>  {
> -    void *addr = memory_region_get_ram_ptr(mr) + offset;
> +    void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
> +    ram_addr_t rb_offset, rb_aligned_offset;
>      RAMBlock *rb;
>      size_t rb_page_size;
>      int subpages;
> -    ram_addr_t ram_offset, host_page_base;
>  
>      /* XXX is there a better way to get to the RAMBlock than via a
>       * host address? */
> -    rb = qemu_ram_block_from_host(addr, false, &ram_offset);
> +    rb = qemu_ram_block_from_host(addr, false, &rb_offset);
>      rb_page_size = qemu_ram_pagesize(rb);
> -    host_page_base = ram_offset & ~(rb_page_size - 1);
>  
>      if (rb_page_size == BALLOON_PAGE_SIZE) {
>          /* Easy case */
>  
> -        ram_block_discard_range(rb, ram_offset, rb_page_size);
> +        ram_block_discard_range(rb, rb_offset, rb_page_size);
>          /* We ignore errors from ram_block_discard_range(), because it
>           * has already reported them, and failing to discard a balloon
>           * page is not fatal */
> @@ -80,11 +79,12 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
>      warn_report_once(
>  "Balloon used with backing page size > 4kiB, this may not be reliable");
>  
> +    rb_aligned_offset = QEMU_ALIGN_DOWN(rb_offset, rb_page_size);
>      subpages = rb_page_size / BALLOON_PAGE_SIZE;
>  
>      if (balloon->pbp
>          && (rb != balloon->pbp->rb
> -            || host_page_base != balloon->pbp->base)) {
> +            || rb_aligned_offset != balloon->pbp->base)) {
>          /* We've partially ballooned part of a host page, but now
>           * we're trying to balloon part of a different one.  Too hard,
>           * give up on the old partial page */
> @@ -96,10 +96,10 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
>          size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
>          balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
>          balloon->pbp->rb = rb;
> -        balloon->pbp->base = host_page_base;
> +        balloon->pbp->base = rb_aligned_offset;
>      }
>  
> -    set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> +    set_bit((rb_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>              balloon->pbp->bitmap);
>  
>      if (bitmap_full(balloon->pbp->bitmap, subpages)) {
> @@ -116,18 +116,18 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
>  }
>  
>  static void balloon_deflate_page(VirtIOBalloon *balloon,
> -                                 MemoryRegion *mr, hwaddr offset)
> +                                 MemoryRegion *mr, hwaddr mr_offset)
>  {
> -    void *addr = memory_region_get_ram_ptr(mr) + offset;
> +    void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
> +    ram_addr_t rb_offset;
>      RAMBlock *rb;
>      size_t rb_page_size;
> -    ram_addr_t ram_offset;
>      void *host_addr;
>      int ret;
>  
>      /* XXX is there a better way to get to the RAMBlock than via a
>       * host address? */
> -    rb = qemu_ram_block_from_host(addr, false, &ram_offset);
> +    rb = qemu_ram_block_from_host(addr, false, &rb_offset);
>      rb_page_size = qemu_ram_pagesize(rb);
>  
>      if (balloon->pbp) {

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH v1 3/3] virtio-balloon: Rework pbp tracking data
  2019-07-19 16:01 ` [Qemu-devel] [PATCH v1 3/3] virtio-balloon: Rework pbp tracking data David Hildenbrand
@ 2019-07-22  3:04   ` David Gibson
  2019-07-22  7:43     ` David Hildenbrand
  0 siblings, 1 reply; 9+ messages in thread
From: David Gibson @ 2019-07-22  3:04 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Igor Mammedov, qemu-devel, Stefan Hajnoczi, Michael S . Tsirkin

[-- Attachment #1: Type: text/plain, Size: 7011 bytes --]

On Fri, Jul 19, 2019 at 06:01:20PM +0200, David Hildenbrand wrote:
> Using the address of a RAMBlock to test for a matching pbp is not really
> safe. Instead, let's use the guest physical address of the base page
> along with the page size (via the number of subpages).
> 
> While at it, move "struct PartiallyBalloonedPage" to virtio-balloon.h
> now (previously most probably avoided to te the ramblock), renaming it.
> 
> Also, let's only dynamically allocating the bitmap. This makes the code
> easier to read and maintain - we can reuse bitmap_new().
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>

This mostly looks good, but one thing bothers me..

> ---
>  hw/virtio/virtio-balloon.c         | 52 +++++++++++++++++-------------
>  include/hw/virtio/virtio-balloon.h | 15 +++++++--
>  2 files changed, 42 insertions(+), 25 deletions(-)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 29cee344b2..8e5f9459c2 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -34,23 +34,31 @@
>  
>  #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
>  
> -struct PartiallyBalloonedPage {
> -    RAMBlock *rb;
> -    ram_addr_t base;
> -    unsigned long bitmap[];
> -};
> -
>  static void virtio_balloon_reset_pbp(VirtIOBalloon *balloon)
>  {
> -    g_free(balloon->pbp);
> -    balloon->pbp = NULL;
> +    balloon->pbp.base_gpa = 0;
> +    balloon->pbp.subpages = 0;
> +    g_free(balloon->pbp.bitmap);
> +    balloon->pbp.bitmap = NULL;
> +}
> +
> +static bool virtio_balloon_pbp_valid(VirtIOBalloon *balloon)
> +{
> +    return balloon->pbp.bitmap;
> +}
> +
> +static bool virtio_balloon_pbp_matches(VirtIOBalloon *balloon,
> +                                       ram_addr_t base_gpa, long subpages)
> +{
> +    return balloon->pbp.subpages == subpages &&
> +           balloon->pbp.base_gpa == base_gpa;

.. so, I'm trying to think what base_gpa matching, but subpages not
matching means.  I think that can only happen if a pbp is created,
then the ramblock it was in is unplugged, then another ramblock is
plugged in covering the same address and with a different (host) page
size.  Which is unlikely but possible, IIUC.  However, also possible
and marginally less unlikely is to plug a new block of the same page
size, in which case this *will* match, but presumably the pbp should
still be discarded.

>  }
>  
>  static void balloon_inflate_page(VirtIOBalloon *balloon,
>                                   MemoryRegion *mr, hwaddr mr_offset)
>  {
>      void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
> -    ram_addr_t rb_offset, rb_aligned_offset;
> +    ram_addr_t rb_offset, rb_aligned_offset, base_gpa;
>      RAMBlock *rb;
>      size_t rb_page_size;
>      int subpages;
> @@ -81,32 +89,32 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
>  
>      rb_aligned_offset = QEMU_ALIGN_DOWN(rb_offset, rb_page_size);
>      subpages = rb_page_size / BALLOON_PAGE_SIZE;
> +    base_gpa = memory_region_get_ram_addr(mr) + mr_offset -
> +               (rb_offset - rb_aligned_offset);
>  
> -    if (balloon->pbp
> -        && (rb != balloon->pbp->rb
> -            || rb_aligned_offset != balloon->pbp->base)) {
> +    if (virtio_balloon_pbp_valid(balloon) &&
> +        !virtio_balloon_pbp_matches(balloon, base_gpa, subpages)) {
>          /* We've partially ballooned part of a host page, but now
>           * we're trying to balloon part of a different one.  Too hard,
>           * give up on the old partial page */
>          virtio_balloon_reset_pbp(balloon);
>      }
>  
> -    if (!balloon->pbp) {
> +    if (!virtio_balloon_pbp_valid(balloon)) {
>          /* Starting on a new host page */
> -        size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
> -        balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
> -        balloon->pbp->rb = rb;
> -        balloon->pbp->base = rb_aligned_offset;
> +        balloon->pbp.base_gpa = base_gpa;
> +        balloon->pbp.subpages = subpages;
> +        balloon->pbp.bitmap = bitmap_new(subpages);
>      }
>  
> -    set_bit((rb_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> -            balloon->pbp->bitmap);
> +    set_bit((rb_offset - rb_aligned_offset) / BALLOON_PAGE_SIZE,
> +            balloon->pbp.bitmap);
>  
> -    if (bitmap_full(balloon->pbp->bitmap, subpages)) {
> +    if (bitmap_full(balloon->pbp.bitmap, subpages)) {
>          /* We've accumulated a full host page, we can actually discard
>           * it now */
>  
> -        ram_block_discard_range(rb, balloon->pbp->base, rb_page_size);
> +        ram_block_discard_range(rb, rb_aligned_offset, rb_page_size);
>          /* We ignore errors from ram_block_discard_range(), because it
>           * has already reported them, and failing to discard a balloon
>           * page is not fatal */
> @@ -130,7 +138,7 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
>      rb = qemu_ram_block_from_host(addr, false, &rb_offset);
>      rb_page_size = qemu_ram_pagesize(rb);
>  
> -    if (balloon->pbp) {
> +    if (virtio_balloon_pbp_valid(balloon)) {
>          /* Let's play safe and always reset the pbp on deflation requests. */
>          virtio_balloon_reset_pbp(balloon);
>      }
> diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
> index 5a99293a45..0190d78d71 100644
> --- a/include/hw/virtio/virtio-balloon.h
> +++ b/include/hw/virtio/virtio-balloon.h
> @@ -33,8 +33,6 @@ typedef struct virtio_balloon_stat_modern {
>         uint64_t val;
>  } VirtIOBalloonStatModern;
>  
> -typedef struct PartiallyBalloonedPage PartiallyBalloonedPage;
> -
>  enum virtio_balloon_free_page_report_status {
>      FREE_PAGE_REPORT_S_STOP = 0,
>      FREE_PAGE_REPORT_S_REQUESTED = 1,
> @@ -42,6 +40,12 @@ enum virtio_balloon_free_page_report_status {
>      FREE_PAGE_REPORT_S_DONE = 3,
>  };
>  
> +typedef struct VirtIOPartiallyBalloonedPage {
> +    ram_addr_t base_gpa;
> +    long subpages;
> +    unsigned long *bitmap;
> +} VirtIOPartiallyBalloonedPage;
> +
>  typedef struct VirtIOBalloon {
>      VirtIODevice parent_obj;
>      VirtQueue *ivq, *dvq, *svq, *free_page_vq;
> @@ -70,7 +74,12 @@ typedef struct VirtIOBalloon {
>      int64_t stats_last_update;
>      int64_t stats_poll_interval;
>      uint32_t host_features;
> -    PartiallyBalloonedPage *pbp;
> +
> +    /*
> +     * Information about a partially ballooned page - does not have to be
> +     * migrated and has to be reset on every device reset.
> +     */
> +    VirtIOPartiallyBalloonedPage pbp;
>  
>      bool qemu_4_0_config_size;
>  } VirtIOBalloon;

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH v1 3/3] virtio-balloon: Rework pbp tracking data
  2019-07-22  3:04   ` David Gibson
@ 2019-07-22  7:43     ` David Hildenbrand
  2019-07-22  8:14       ` Michael S. Tsirkin
  0 siblings, 1 reply; 9+ messages in thread
From: David Hildenbrand @ 2019-07-22  7:43 UTC (permalink / raw)
  To: David Gibson
  Cc: Igor Mammedov, qemu-devel, Stefan Hajnoczi, Michael S . Tsirkin

On 22.07.19 05:04, David Gibson wrote:
> On Fri, Jul 19, 2019 at 06:01:20PM +0200, David Hildenbrand wrote:
>> Using the address of a RAMBlock to test for a matching pbp is not really
>> safe. Instead, let's use the guest physical address of the base page
>> along with the page size (via the number of subpages).
>>
>> While at it, move "struct PartiallyBalloonedPage" to virtio-balloon.h
>> now (previously most probably avoided to te the ramblock), renaming it.
>>
>> Also, let's only dynamically allocating the bitmap. This makes the code
>> easier to read and maintain - we can reuse bitmap_new().
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> This mostly looks good, but one thing bothers me..
> 
>> ---
>>  hw/virtio/virtio-balloon.c         | 52 +++++++++++++++++-------------
>>  include/hw/virtio/virtio-balloon.h | 15 +++++++--
>>  2 files changed, 42 insertions(+), 25 deletions(-)
>>
>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
>> index 29cee344b2..8e5f9459c2 100644
>> --- a/hw/virtio/virtio-balloon.c
>> +++ b/hw/virtio/virtio-balloon.c
>> @@ -34,23 +34,31 @@
>>  
>>  #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
>>  
>> -struct PartiallyBalloonedPage {
>> -    RAMBlock *rb;
>> -    ram_addr_t base;
>> -    unsigned long bitmap[];
>> -};
>> -
>>  static void virtio_balloon_reset_pbp(VirtIOBalloon *balloon)
>>  {
>> -    g_free(balloon->pbp);
>> -    balloon->pbp = NULL;
>> +    balloon->pbp.base_gpa = 0;
>> +    balloon->pbp.subpages = 0;
>> +    g_free(balloon->pbp.bitmap);
>> +    balloon->pbp.bitmap = NULL;
>> +}
>> +
>> +static bool virtio_balloon_pbp_valid(VirtIOBalloon *balloon)
>> +{
>> +    return balloon->pbp.bitmap;
>> +}
>> +
>> +static bool virtio_balloon_pbp_matches(VirtIOBalloon *balloon,
>> +                                       ram_addr_t base_gpa, long subpages)
>> +{
>> +    return balloon->pbp.subpages == subpages &&
>> +           balloon->pbp.base_gpa == base_gpa;
> 
> .. so, I'm trying to think what base_gpa matching, but subpages not
> matching means.  I think that can only happen if a pbp is created,
> then the ramblock it was in is unplugged, then another ramblock is
> plugged in covering the same address and with a different (host) page
> size.  Which is unlikely but possible, IIUC.  However, also possible
> and marginally less unlikely is to plug a new block of the same page
> size, in which case this *will* match, but presumably the pbp should
> still be discarded.

Why should it be discarded? The guest didn't deflate, so if we drop the
backing page, it works as expected. If the guest deflated, the pbp would
be discarded.

Please note that this will not happen in real life (Linux):

Before we unplug a DIMM, the guest has to offline that memory. Offlining
means evacuating all pages that are not free (Buddy). Balloon-inflated
pages are treated like allocated pages, so the balloon pages will have
to be moved to a different location (inflate new + deflate old). At this
point, we had a deflate on the page and dropped the pbp.

If the guest would be reusing memory (after unplug/replug) and leave
parts of the memory inflated, it would be expected that something goes
wrong - especially, also the balloon stats would be most probably wrong.

Using ramblock notifiers, we could discard the pbp in case a new block
is added/removed, however I am not convinced that this is really needed.



However, there is (yet) another related issue with PBP. In QEMU, we
don't set

#define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */

Which means it would right now be valid for the guest to reuse pages
tracked in the PBP without deflating (although Linux always tells the
host). This could result in stale PBP data.

We really should be setting VIRTIO_BALLOON_F_MUST_TELL_HOST.

-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH v1 3/3] virtio-balloon: Rework pbp tracking data
  2019-07-22  7:43     ` David Hildenbrand
@ 2019-07-22  8:14       ` Michael S. Tsirkin
  0 siblings, 0 replies; 9+ messages in thread
From: Michael S. Tsirkin @ 2019-07-22  8:14 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Igor Mammedov, qemu-devel, Stefan Hajnoczi, David Gibson

On Mon, Jul 22, 2019 at 09:43:43AM +0200, David Hildenbrand wrote:
> On 22.07.19 05:04, David Gibson wrote:
> > On Fri, Jul 19, 2019 at 06:01:20PM +0200, David Hildenbrand wrote:
> >> Using the address of a RAMBlock to test for a matching pbp is not really
> >> safe. Instead, let's use the guest physical address of the base page
> >> along with the page size (via the number of subpages).
> >>
> >> While at it, move "struct PartiallyBalloonedPage" to virtio-balloon.h
> >> now (previously most probably avoided to te the ramblock), renaming it.
> >>
> >> Also, let's only dynamically allocating the bitmap. This makes the code
> >> easier to read and maintain - we can reuse bitmap_new().
> >>
> >> Signed-off-by: David Hildenbrand <david@redhat.com>
> > 
> > This mostly looks good, but one thing bothers me..
> > 
> >> ---
> >>  hw/virtio/virtio-balloon.c         | 52 +++++++++++++++++-------------
> >>  include/hw/virtio/virtio-balloon.h | 15 +++++++--
> >>  2 files changed, 42 insertions(+), 25 deletions(-)
> >>
> >> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> >> index 29cee344b2..8e5f9459c2 100644
> >> --- a/hw/virtio/virtio-balloon.c
> >> +++ b/hw/virtio/virtio-balloon.c
> >> @@ -34,23 +34,31 @@
> >>  
> >>  #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
> >>  
> >> -struct PartiallyBalloonedPage {
> >> -    RAMBlock *rb;
> >> -    ram_addr_t base;
> >> -    unsigned long bitmap[];
> >> -};
> >> -
> >>  static void virtio_balloon_reset_pbp(VirtIOBalloon *balloon)
> >>  {
> >> -    g_free(balloon->pbp);
> >> -    balloon->pbp = NULL;
> >> +    balloon->pbp.base_gpa = 0;
> >> +    balloon->pbp.subpages = 0;
> >> +    g_free(balloon->pbp.bitmap);
> >> +    balloon->pbp.bitmap = NULL;
> >> +}
> >> +
> >> +static bool virtio_balloon_pbp_valid(VirtIOBalloon *balloon)
> >> +{
> >> +    return balloon->pbp.bitmap;
> >> +}
> >> +
> >> +static bool virtio_balloon_pbp_matches(VirtIOBalloon *balloon,
> >> +                                       ram_addr_t base_gpa, long subpages)
> >> +{
> >> +    return balloon->pbp.subpages == subpages &&
> >> +           balloon->pbp.base_gpa == base_gpa;
> > 
> > .. so, I'm trying to think what base_gpa matching, but subpages not
> > matching means.  I think that can only happen if a pbp is created,
> > then the ramblock it was in is unplugged, then another ramblock is
> > plugged in covering the same address and with a different (host) page
> > size.  Which is unlikely but possible, IIUC.  However, also possible
> > and marginally less unlikely is to plug a new block of the same page
> > size, in which case this *will* match, but presumably the pbp should
> > still be discarded.
> 
> Why should it be discarded? The guest didn't deflate, so if we drop the
> backing page, it works as expected. If the guest deflated, the pbp would
> be discarded.
> 
> Please note that this will not happen in real life (Linux):
> 
> Before we unplug a DIMM, the guest has to offline that memory. Offlining
> means evacuating all pages that are not free (Buddy). Balloon-inflated
> pages are treated like allocated pages, so the balloon pages will have
> to be moved to a different location (inflate new + deflate old). At this
> point, we had a deflate on the page and dropped the pbp.
> 
> If the guest would be reusing memory (after unplug/replug) and leave
> parts of the memory inflated, it would be expected that something goes
> wrong - especially, also the balloon stats would be most probably wrong.
> 
> Using ramblock notifiers, we could discard the pbp in case a new block
> is added/removed, however I am not convinced that this is really needed.

I think I agree here.

> 
> 
> However, there is (yet) another related issue with PBP. In QEMU, we
> don't set
> 
> #define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */
> 
> Which means it would right now be valid for the guest to reuse pages
> tracked in the PBP without deflating (although Linux always tells the
> host). This could result in stale PBP data.
> 
> We really should be setting VIRTIO_BALLOON_F_MUST_TELL_HOST.

Ouch. More than that, anything using PBP is already corrupting
guest memory with legacy guests (before 3.0).
Well given it's also corrupting host memory, I'm not sure
whether we should care ...


> -- 
> 
> Thanks,
> 
> David / dhildenb


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-07-22  8:15 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-19 16:01 [Qemu-devel] [PATCH v1 0/3] virtio-balloon: PartialBalloonedPage rework David Hildenbrand
2019-07-19 16:01 ` [Qemu-devel] [PATCH v1 1/3] virtio-balloon: simplify deflate with pbp David Hildenbrand
2019-07-22  2:27   ` David Gibson
2019-07-19 16:01 ` [Qemu-devel] [PATCH v1 2/3] virtio-balloon: Better names for offset variables in inflate/deflate code David Hildenbrand
2019-07-22  2:29   ` David Gibson
2019-07-19 16:01 ` [Qemu-devel] [PATCH v1 3/3] virtio-balloon: Rework pbp tracking data David Hildenbrand
2019-07-22  3:04   ` David Gibson
2019-07-22  7:43     ` David Hildenbrand
2019-07-22  8:14       ` Michael S. Tsirkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).