From: David Gibson <david@gibson.dropbear.id.au>
To: mst@redhat.com, qemu-devel@nongnu.org
Cc: qemu-ppc@nongnu.org, David Gibson <david@gibson.dropbear.id.au>
Subject: [Qemu-devel] [PATCH 4/5] virtio-balloon: Use ram_block_discard_range() instead of raw madvise()
Date: Thu, 14 Feb 2019 15:39:15 +1100
Message-ID: <20190214043916.22128-5-david@gibson.dropbear.id.au>
In-Reply-To: <20190214043916.22128-1-david@gibson.dropbear.id.au>

Currently, virtio-balloon uses madvise() with MADV_DONTNEED to actually
discard RAM pages inserted into the balloon.  This is basically a
Linux-only interface (MADV_DONTNEED exists on some other platforms, but
doesn't always have the same semantics).  It also doesn't work on hugepages
and has some other limitations.
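
For reference, the Linux behaviour relied on here can be illustrated with a
small standalone sketch.  It is not part of the patch and does not use any
QEMU code; the helper names are made up for illustration, and it assumes
Linux semantics for MADV_DONTNEED on a private anonymous mapping (the next
access sees zero-filled pages) and for an unaligned start address (EINVAL).

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS and madvise() on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map one private anonymous page, or return NULL. */
static unsigned char *map_one_page(size_t *page_size)
{
    *page_size = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, *page_size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Returns 1 if the page reads back as zero after MADV_DONTNEED (Linux). */
static int dontneed_zeroes_page(void)
{
    size_t page;
    unsigned char *p = map_one_page(&page);
    if (!p) {
        return 0;
    }
    memset(p, 0xAA, page);                     /* dirty the page */
    int rc = madvise(p, page, MADV_DONTNEED);  /* discard its contents */
    int ok = (rc == 0 && p[0] == 0 && p[page - 1] == 0);
    munmap(p, page);
    return ok;
}

/* Returns 1 if madvise() rejects an unaligned start address with EINVAL. */
static int dontneed_unaligned_fails(void)
{
    size_t page;
    unsigned char *p = map_one_page(&page);
    if (!p) {
        return 0;
    }
    errno = 0;
    int rc = madvise(p + 1, page, MADV_DONTNEED);
    int ok = (rc == -1 && errno == EINVAL);
    munmap(p, page);
    return ok;
}
```

On a Linux host both helpers are expected to return 1.  Other platforms may
behave differently, which is the portability problem described above.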

It turns out that postcopy also needs to discard chunks of memory, and uses
a better interface for it: ram_block_discard_range().  It doesn't cover
every case, but it covers more than going direct to madvise() and this
gives us a single place to update for more possibilities in future.

There are some subtleties here to maintain the current balloon behaviour:

* For now, we just ignore requests to balloon in a hugepage backed region.
  That matches current behaviour, because MADV_DONTNEED on a hugepage would
  simply fail, and we ignore the error.

* If host page size is > BALLOON_PAGE_SIZE we can frequently be called on
  addresses that are not host page aligned.  These would also fail in
  madvise(), and we ignored that error too.  ram_block_discard_range()
  calls error_report() for unaligned addresses, so we explicitly check
  for that case to avoid spamming the logs.

* We now call ram_block_discard_range() with the *host* page size, whereas
  we previously called madvise() with BALLOON_PAGE_SIZE.  Surprisingly,
  this also matches existing behaviour.  Although the kernel fails madvise
  on unaligned addresses, it will round unaligned sizes *up* to the host
  page size.  Yes, this means that if BALLOON_PAGE_SIZE < host page size
  we can incorrectly discard more memory than the guest asked us to.  I'm
  planning to address that soon.
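
The alignment arithmetic behind the last two points can be sketched as
follows.  This is a standalone illustration, not code from the patch: the
helper names are hypothetical, and round_up_to_page() only models the way
the kernel rounds the madvise() length up to a whole page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if page_size is a non-zero power of two. */
static bool is_power_of_two(size_t page_size)
{
    return page_size != 0 && (page_size & (page_size - 1)) == 0;
}

/*
 * Hypothetical helper mirroring the "ram_offset & (rb_page_size - 1)"
 * test in the patch: true if offset is a multiple of page_size.
 */
static bool is_page_aligned(uint64_t offset, size_t page_size)
{
    assert(is_power_of_two(page_size));
    return (offset & (page_size - 1)) == 0;
}

/*
 * Hypothetical helper: round len up to a multiple of page_size, modelling
 * what happens to a BALLOON_PAGE_SIZE length on a larger host page size.
 */
static size_t round_up_to_page(size_t len, size_t page_size)
{
    assert(is_power_of_two(page_size));
    return (len + page_size - 1) & ~(page_size - 1);
}
```

For example, with a 64kiB host page size and 4kiB balloon pages, only one
balloon page address in sixteen passes is_page_aligned(), and
round_up_to_page(4096, 65536) gives 65536, so a 4kiB request ends up
covering the whole 64kiB host page.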

Errors other than the ones discussed above will now be reported by
ram_block_discard_range(), rather than silently ignored, which means we
have a much better chance of seeing when something is going wrong.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/virtio/virtio-balloon.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index bf93148486..e4cd8d566b 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -37,8 +37,29 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
                                  MemoryRegion *mr, hwaddr offset)
 {
     void *addr = memory_region_get_ram_ptr(mr) + offset;
+    RAMBlock *rb;
+    size_t rb_page_size;
+    ram_addr_t ram_offset;
 
-    qemu_madvise(addr, BALLOON_PAGE_SIZE, QEMU_MADV_DONTNEED);
+    /* XXX is there a better way to get to the RAMBlock than via a
+     * host address? */
+    rb = qemu_ram_block_from_host(addr, false, &ram_offset);
+    rb_page_size = qemu_ram_pagesize(rb);
+
+    /* Silently ignore hugepage RAM blocks */
+    if (rb_page_size != getpagesize()) {
+        return;
+    }
+
+    /* Silently ignore unaligned requests */
+    if (ram_offset & (rb_page_size - 1)) {
+        return;
+    }
+
+    ram_block_discard_range(rb, ram_offset, rb_page_size);
+    /* We ignore errors from ram_block_discard_range(), because it has
+     * already reported them, and failing to discard a balloon page is
+     * not fatal */
 }
 
 static const char *balloon_stat_names[] = {
-- 
2.20.1
