From: Alexander Duyck <alexander.duyck@gmail.com>
To: virtio-dev@lists.oasis-open.org, kvm@vger.kernel.org,
	mst@redhat.com, david@redhat.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, akpm@linux-foundation.org
Cc: yang.zhang.wz@gmail.com, pagupta@redhat.com,
	konrad.wilk@oracle.com, nitesh@redhat.com, riel@surriel.com,
	willy@infradead.org, lcapitulino@redhat.com,
	dave.hansen@intel.com, wei.w.wang@intel.com, aarcange@redhat.com,
	pbonzini@redhat.com, dan.j.williams@intel.com, mhocko@kernel.org,
	mgorman@techsingularity.net, alexander.h.duyck@linux.intel.com,
	vbabka@suse.cz, osalvador@suse.de
Subject: [PATCH v17 QEMU 4/3 RFC] memory: Add support for MADV_FREE as mechanism to lazy discard pages
Date: Tue, 11 Feb 2020 14:53:18 -0800
Message-ID: <20200211225220.30596.80416.stgit@localhost.localdomain>
In-Reply-To: <20200211224416.29318.44077.stgit@localhost.localdomain>

From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

Add support for the MADV_FREE advice argument when discarding pages.
Specifically, add an option to perform a lazy discard for use with free
page reporting. This allows us to avoid expensive page zeroing when the
system is not under memory pressure.
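
To make the difference between the two advice values concrete, here is a
minimal sketch, assuming Linux and a private anonymous mapping. The helper
names (range_is_zero, map_dirty, demo_dontneed_reads_zero,
demo_lazy_free_keeps_new_write) are hypothetical and not part of QEMU, and
the fallback to MADV_DONTNEED when MADV_FREE is rejected at run time is an
addition for the demo only, not something this patch does.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: returns 1 if every byte in the range is zero. */
static int range_is_zero(const unsigned char *p, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (p[i]) {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical helper: map len bytes of anonymous memory and dirty them. */
static unsigned char *map_dirty(size_t len)
{
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return NULL;
    }
    memset(p, 0xaa, len);
    return p;
}

/*
 * Eager discard: after MADV_DONTNEED a private anonymous range is
 * zero-filled on the next access.
 * Returns 1 if the range reads as zero, 0 if not, -1 on failure.
 */
int demo_dontneed_reads_zero(size_t len)
{
    unsigned char *p = map_dirty(len);
    if (!p) {
        return -1;
    }
    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }
    int zero = range_is_zero(p, len);
    munmap(p, len);
    return zero;
}

/*
 * Lazy discard: after MADV_FREE the kernel may reclaim the pages later.
 * A new write is always retained; the untouched bytes read as either the
 * old contents or zero, depending on whether reclaim happened.
 * Returns 1 if that holds, 0 if not, -1 on failure.
 */
int demo_lazy_free_keeps_new_write(size_t len)
{
    unsigned char *p = map_dirty(len);
    if (!p || len < 2) {
        if (p) {
            munmap(p, len);
        }
        return -1;
    }
    int ret;
#ifdef MADV_FREE
    ret = madvise(p, len, MADV_FREE);
    if (ret != 0) {
        /* Demo-only fallback for kernels that reject MADV_FREE. */
        ret = madvise(p, len, MADV_DONTNEED);
    }
#else
    ret = madvise(p, len, MADV_DONTNEED);
#endif
    if (ret != 0) {
        munmap(p, len);
        return -1;
    }
    p[0] = 0x55;
    int ok = (p[0] == 0x55) && (p[1] == 0xaa || p[1] == 0x00);
    munmap(p, len);
    return ok;
}
```

Calling either function with a page-multiple length such as 16384 exercises
the two paths. The lazy variant is what makes free page reporting cheap when
the host has no reason to reclaim the memory.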

To enable this I extended ram_block_discard_range with an extra "lazy"
parameter, renamed it to __ram_block_discard_range, and wrapped it in a
function that keeps the original name and passes lazy as false. I then
added a second wrapper, ram_block_free_range, that passes lazy as true,
and updated the page reporting code to use it.
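
The wrapper arrangement can be sketched outside of QEMU as follows. This is
an illustration under stated assumptions, not the code in this patch: it
operates on a raw host pointer instead of a RAMBlock, it always treats
need_fallocate as false, and the names pick_advice, host_discard_range_common,
host_discard_range and host_free_range are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Choose the madvise() advice. Lazy freeing is only used when no
 * fallocate-based discard is needed and the platform defines MADV_FREE;
 * otherwise fall back to the eager MADV_DONTNEED.
 */
static int pick_advice(bool lazy, bool need_fallocate)
{
#ifdef MADV_FREE
    return (lazy && !need_fallocate) ? MADV_FREE : MADV_DONTNEED;
#else
    (void)lazy;
    (void)need_fallocate;
    return MADV_DONTNEED;
#endif
}

/* Common implementation; the two public wrappers differ only in "lazy". */
static int host_discard_range_common(void *addr, size_t len, bool lazy)
{
    /* Assumption for this sketch: anonymous memory, so no fallocate. */
    if (madvise(addr, len, pick_advice(lazy, false)) != 0) {
        return -errno;
    }
    return 0;
}

/* Eager discard: keeps the original behavior for existing callers. */
int host_discard_range(void *addr, size_t len)
{
    return host_discard_range_common(addr, len, false);
}

/* Lazy discard: intended for the free page reporting path. */
int host_free_range(void *addr, size_t len)
{
    return host_discard_range_common(addr, len, true);
}
```

Keeping the original name as a thin wrapper means existing callers, which
rely on the pages reading as zero and being unmapped on return, are not
affected by the new option.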

Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 exec.c                     |   39 +++++++++++++++++++++++++++------------
 hw/virtio/virtio-balloon.c |    2 +-
 include/exec/cpu-common.h  |    1 +
 3 files changed, 29 insertions(+), 13 deletions(-)

diff --git a/exec.c b/exec.c
index 67e520d18ea5..2266574eb06e 100644
--- a/exec.c
+++ b/exec.c
@@ -3881,15 +3881,8 @@ int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque)
     return ret;
 }
 
-/*
- * Unmap pages of memory from start to start+length such that
- * they a) read as 0, b) Trigger whatever fault mechanism
- * the OS provides for postcopy.
- * The pages must be unmapped by the end of the function.
- * Returns: 0 on success, none-0 on failure
- *
- */
-int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length)
+static int __ram_block_discard_range(RAMBlock *rb, uint64_t start,
+                                     size_t length, bool lazy)
 {
     int ret = -1;
 
@@ -3941,13 +3934,18 @@ int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length)
 #endif
         }
         if (need_madvise) {
-            /* For normal RAM this causes it to be unmapped,
+#ifdef CONFIG_MADVISE
+#ifdef MADV_FREE
+            int advice = (lazy && !need_fallocate) ? MADV_FREE : MADV_DONTNEED;
+#else
+            int advice = MADV_DONTNEED;
+#endif
+            /* For normal RAM this causes it to be lazy freed or unmapped,
              * for shared memory it causes the local mapping to disappear
              * and to fall back on the file contents (which we just
              * fallocate'd away).
              */
-#if defined(CONFIG_MADVISE)
-            ret =  madvise(host_startaddr, length, MADV_DONTNEED);
+            ret =  madvise(host_startaddr, length, advice);
             if (ret) {
                 ret = -errno;
                 error_report("ram_block_discard_range: Failed to discard range "
@@ -3975,6 +3973,23 @@ err:
     return ret;
 }
 
+/*
+ * Unmap pages of memory from start to start+length such that
+ * they a) read as 0, b) Trigger whatever fault mechanism
+ * the OS provides for postcopy.
+ * The pages must be unmapped by the end of the function.
+ * Returns: 0 on success, non-zero on failure
+ *
+ */
+int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length)
+{
+    return __ram_block_discard_range(rb, start, length, false);
+}
+
+int ram_block_free_range(RAMBlock *rb, uint64_t start, size_t length)
+{
+    return __ram_block_discard_range(rb, start, length, true);
+}
 bool ramblock_is_pmem(RAMBlock *rb)
 {
     return rb->flags & RAM_PMEM;
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 5faafd2f62ac..7df92af73792 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -346,7 +346,7 @@ static void virtio_balloon_handle_report(VirtIODevice *vdev, VirtQueue *vq)
             if ((ram_offset | size) & (rb_page_size - 1))
                 continue;
 
-            ram_block_discard_range(rb, ram_offset, size);
+            ram_block_free_range(rb, ram_offset, size);
         }
 
         virtqueue_push(vq, elem, 0);
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 81753bbb3431..2bbd26784c63 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -104,6 +104,7 @@ typedef int (RAMBlockIterFunc)(RAMBlock *rb, void *opaque);
 
 int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque);
 int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length);
+int ram_block_free_range(RAMBlock *rb, uint64_t start, size_t length);
 
 #endif
 

