LKML Archive on lore.kernel.org
From: Alexander Duyck <alexander.duyck@gmail.com>
To: nitesh@redhat.com, kvm@vger.kernel.org, david@redhat.com,
	mst@redhat.com, dave.hansen@intel.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org
Cc: yang.zhang.wz@gmail.com, pagupta@redhat.com, riel@surriel.com,
	konrad.wilk@oracle.com, lcapitulino@redhat.com,
	wei.w.wang@intel.com, aarcange@redhat.com, pbonzini@redhat.com,
	dan.j.williams@intel.com, alexander.h.duyck@linux.intel.com
Subject: [PATCH v2 0/5] mm / virtio: Provide support for page hinting
Date: Wed, 24 Jul 2019 09:54:02 -0700
Message-ID: <20190724165158.6685.87228.stgit@localhost.localdomain>

This series provides an asynchronous means of hinting to a hypervisor
that a guest page is no longer in use and can have the data associated
with it dropped. To do this I have implemented functionality that allows
for what I am referring to as page hinting.

The functionality for this is fairly simple. When enabled, it will allocate
statistics to track the number of hinted pages in a given free area. When
the number of free pages exceeds this value plus a high water value,
currently 32, it will begin performing page hinting, which consists of
pulling pages off of the free list and placing them into a scatterlist. The
scatterlist is then given to the page hinting device, which will perform
the required action to make the pages "hinted". In the case of
virtio-balloon this results in the pages being madvised as MADV_DONTNEED,
and as such they are forced out of the guest. After this they are placed
back on the free list, and if they are not merged an additional bit is set
indicating that they are a hinted buddy page instead of a standard
buddy page. The cycle then repeats with additional non-hinted pages being
pulled until the free areas all consist of hinted pages.
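
The cycle described above can be sketched as a small userspace model. This
is only an illustration of the threshold and batching logic, not the code
from the patches: struct free_area_sim, HINT_BATCH, and the function names
are hypothetical, and the only value taken from this letter is the high
water value of 32.

```c
#include <assert.h>
#include <stdio.h>

#define HINTING_HWM 32	/* high water value quoted in the cover letter */
#define HINT_BATCH  16	/* hypothetical scatterlist capacity (assumption) */

/* Hypothetical per-free-area counters; not the kernel's structures. */
struct free_area_sim {
	unsigned long nr_free;		/* all free pages, hinted or not */
	unsigned long nr_hinted;	/* subset already hinted to the host */
};

/* Hinting starts once free pages exceed hinted pages plus the HWM. */
static int hinting_needed(const struct free_area_sim *a)
{
	return a->nr_free > a->nr_hinted + HINTING_HWM;
}

/* One cycle: pull unhinted pages, "hint" them, put them back as hinted. */
static unsigned long hint_one_batch(struct free_area_sim *a)
{
	unsigned long unhinted = a->nr_free - a->nr_hinted;
	unsigned long batch = unhinted < HINT_BATCH ? unhinted : HINT_BATCH;

	/* Pages leave the free list while they sit in the scatterlist. */
	a->nr_free -= batch;
	/* ... the hinting device would process the scatterlist here ... */
	/* Pages return to the free list, now counted as hinted. */
	a->nr_free += batch;
	a->nr_hinted += batch;
	assert(a->nr_hinted <= a->nr_free);
	return batch;
}

/* Repeat cycles until the free area consists only of hinted pages. */
static unsigned long run_hinting(struct free_area_sim *a)
{
	unsigned long cycles = 0;

	if (!hinting_needed(a))
		return 0;
	while (a->nr_hinted < a->nr_free) {
		hint_one_batch(a);
		cycles++;
	}
	return cycles;
}
```

In this model the counters are the only state. The real implementation also
has to deal with pages being allocated or merged while a batch is in flight,
which the sketch deliberately ignores.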

I am leaving a number of things hard-coded such as limiting the lowest
order processed to PAGEBLOCK_ORDER, and have left it up to the guest to
determine what the limit is on how many pages it wants to allocate to
process the hints.

My primary testing has just been to verify the memory is being freed after
allocation by running memhog 79g on an 80g guest and watching the total
free memory via /proc/meminfo on the host. With this I have verified most
of the memory is freed after each iteration. As for performance, I have
been mainly focusing on the will-it-scale/page_fault1 test running with
16 vcpus. With that I have seen at most a 2% difference between the base
kernel without these patches and the patches with virtio-balloon disabled.
With the patches and virtio-balloon enabled with hinting, the results
largely depend on the host kernel. On a 3.10 RHEL kernel I saw up to a 2%
drop in performance as I approached 16 threads; however, on the latest
linux-next kernel I saw roughly a 4% to 5% improvement in performance for
all tests with 8 or more threads. I believe the difference seen is due to
the overhead for faulting pages back into the guest and zeroing of memory.
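
For anyone wanting to script the host-side check rather than watch it by
hand, something like the following could pull a field out of
/proc/meminfo-style text. This is a sketch and not part of the series;
meminfo_kb is a hypothetical helper name, and it assumes the usual
"Key:   value kB" line layout.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the value in kB for `key` (e.g. "MemFree") from meminfo-style
 * text, or -1 if the key is absent or its value cannot be parsed.
 */
static long meminfo_kb(const char *text, const char *key)
{
	size_t klen = strlen(key);
	const char *line = text;

	assert(klen > 0);
	while (line && *line) {
		/* Require "key:" at line start so "MemFree" != "MemFreeX". */
		if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
			long kb;

			if (sscanf(line + klen + 1, "%ld", &kb) == 1)
				return kb;
			return -1;
		}
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

Reading /proc/meminfo into a buffer on the host before and after each memhog
iteration and comparing meminfo_kb(buf, "MemFree") would give the same signal
as the manual check described above.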

Patch 4 is a bit on the large side at about 600 lines of change; however,
I really didn't see a good way to break it up since each piece feeds into
the next. So I couldn't add the statistics by themselves as it didn't
really make sense to add them without something that will either read or
increment/decrement them, or add the Hinted state without something that
would set/unset it. As such I just ended up adding the entire thing as
one patch. It makes it a bit bigger but avoids the issues in the previous
set where I was referencing things before they had been added.

Changes from the RFC:
https://lore.kernel.org/lkml/20190530215223.13974.22445.stgit@localhost.localdomain/
Moved aeration requested flag out of aerator and into zone->flags.
Moved boundary out of free_area and into local variables for aeration.
Moved aeration cycle out of interrupt and into workqueue.
Left nr_free as total pages instead of splitting it between raw and aerated.
Combined size and physical address values in virtio ring into one 64b value.

Changes from v1:
https://lore.kernel.org/lkml/20190619222922.1231.27432.stgit@localhost.localdomain/
Dropped "waste page treatment" in favor of "page hinting"
Renamed files and functions from "aeration" to "page_hinting"
Moved from page->lru list to scatterlist
Replaced wait on refcnt in shutdown with RCU and cancel_delayed_work_sync
Virtio now uses scatterlist directly instead of intermediate array
Moved stats out of free_area, now in separate area and pointed to from zone
Merged patch 5 into patch 4 to improve reviewability
Updated various code comments throughout

---

Alexander Duyck (5):
      mm: Adjust shuffle code to allow for future coalescing
      mm: Move set/get_pcppage_migratetype to mmzone.h
      mm: Use zone and order instead of free area in free_list manipulators
      mm: Introduce Hinted pages
      virtio-balloon: Add support for providing page hints to host


 drivers/virtio/Kconfig              |    1 
 drivers/virtio/virtio_balloon.c     |   47 ++++++
 include/linux/mmzone.h              |  116 ++++++++------
 include/linux/page-flags.h          |    8 +
 include/linux/page_hinting.h        |  139 ++++++++++++++++
 include/uapi/linux/virtio_balloon.h |    1 
 mm/Kconfig                          |    5 +
 mm/Makefile                         |    1 
 mm/internal.h                       |   18 ++
 mm/memory_hotplug.c                 |    1 
 mm/page_alloc.c                     |  238 ++++++++++++++++++++--------
 mm/page_hinting.c                   |  298 +++++++++++++++++++++++++++++++++++
 mm/shuffle.c                        |   24 ---
 mm/shuffle.h                        |   32 ++++
 14 files changed, 796 insertions(+), 133 deletions(-)
 create mode 100644 include/linux/page_hinting.h
 create mode 100644 mm/page_hinting.c
