linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexander Duyck <alexander.duyck@gmail.com>
To: nitesh@redhat.com, kvm@vger.kernel.org, david@redhat.com,
	mst@redhat.com, dave.hansen@intel.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: yang.zhang.wz@gmail.com, pagupta@redhat.com, riel@surriel.com,
	konrad.wilk@oracle.com, lcapitulino@redhat.com,
	wei.w.wang@intel.com, aarcange@redhat.com, pbonzini@redhat.com,
	dan.j.williams@intel.com, alexander.h.duyck@linux.intel.com
Subject: [RFC PATCH 00/11] mm / virtio: Provide support for paravirtual waste page treatment
Date: Thu, 30 May 2019 14:53:34 -0700	[thread overview]
Message-ID: <20190530215223.13974.22445.stgit@localhost.localdomain> (raw)

This series provides an asynchronous means of hinting to a hypervisor
that a guest page is no longer in use and can have the data associated
with it dropped. To do this I have implemented functionality that allows
for what I am referring to as "waste page treatment".

I have based many of the terms and functionality off of waste water
treatment, the idea for the similarity occured to me after I had reached
the point of referring to the hints as "bubbles", as the hints used the
same approach as the balloon functionality but would disappear if they
were touched, as a result I started to think of the virtio device as an
aerator. The general idea with all of this is that the guest should be
treating the unused pages so that when they end up heading "downstream"
to either another guest, or back at the host they will not need to be
written to swap.

So for a bit of background for the treatment process, it is based on a
sequencing batch reactor (SBR)[1]. The treatment process itself has five
stages. The first stage is the fill, with this we take the raw pages and
add them to the reactor. The second stage is react, in this stage we hand
the pages off to the Virtio Balloon driver to have hints attached to them
and for those hints to be sent to the hypervisor. The third stage is
settle, in this stage we are waiting for the hypervisor to process the
pages, and we should receive an interrupt when it is completed. The fourth
stage is to decant, or drain the reactor of pages. Finally we have the
idle stage which we will go into if the reference count for the reactor
gets down to 0 after a drain, or if a fill operation fails to obtain any
pages and the reference count has hit 0. Otherwise we return to the first
state and start the cycle over again.

This patch set is still far more intrusive then I would really like for
what it has to do. Currently I am splitting the nr_free_pages into two
values and having to add a pointer and an index to track where we area in
the treatment process for a given free_area. I'm also not sure I have
covered all possible corner cases where pages can get into the free_area
or move from one migratetype to another.

Also I am still leaving a number of things hard-coded such as limiting the
lowest order processed to PAGEBLOCK_ORDER, and have left it up to the
guest to determine what size of reactor it wants to allocate to process
the hints.

Another consideration I am still debating is if I really want to process
the aerator_cycle() function in interrupt context or if I should have it
running in a thread somewhere else.

[1]: https://en.wikipedia.org/wiki/Sequencing_batch_reactor

---

Alexander Duyck (11):
      mm: Move MAX_ORDER definition closer to pageblock_order
      mm: Adjust shuffle code to allow for future coalescing
      mm: Add support for Treated Buddy pages
      mm: Split nr_free into nr_free_raw and nr_free_treated
      mm: Propogate Treated bit when splitting
      mm: Add membrane to free area to use as divider between treated and raw pages
      mm: Add support for acquiring first free "raw" or "untreated" page in zone
      mm: Add support for creating memory aeration
      mm: Count isolated pages as "treated"
      virtio-balloon: Add support for aerating memory via bubble hinting
      mm: Add free page notification hook


 arch/x86/include/asm/page.h         |   11 +
 drivers/virtio/Kconfig              |    1 
 drivers/virtio/virtio_balloon.c     |   89 ++++++++++
 include/linux/gfp.h                 |   10 +
 include/linux/memory_aeration.h     |   54 ++++++
 include/linux/mmzone.h              |  100 +++++++++--
 include/linux/page-flags.h          |   32 +++
 include/linux/pageblock-flags.h     |    8 +
 include/uapi/linux/virtio_balloon.h |    1 
 mm/Kconfig                          |    5 +
 mm/Makefile                         |    1 
 mm/aeration.c                       |  324 +++++++++++++++++++++++++++++++++++
 mm/compaction.c                     |    4 
 mm/page_alloc.c                     |  220 ++++++++++++++++++++----
 mm/shuffle.c                        |   24 ---
 mm/shuffle.h                        |   35 ++++
 mm/vmstat.c                         |    5 -
 17 files changed, 838 insertions(+), 86 deletions(-)
 create mode 100644 include/linux/memory_aeration.h
 create mode 100644 mm/aeration.c

--

             reply	other threads:[~2019-05-30 21:53 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-30 21:53 Alexander Duyck [this message]
2019-05-30 21:53 ` [RFC PATCH 01/11] mm: Move MAX_ORDER definition closer to pageblock_order Alexander Duyck
2019-05-30 21:53 ` [RFC PATCH 02/11] mm: Adjust shuffle code to allow for future coalescing Alexander Duyck
2019-05-30 21:53 ` [RFC PATCH 03/11] mm: Add support for Treated Buddy pages Alexander Duyck
2019-05-30 21:54 ` [RFC PATCH 04/11] mm: Split nr_free into nr_free_raw and nr_free_treated Alexander Duyck
2019-05-30 21:54 ` [RFC PATCH 05/11] mm: Propogate Treated bit when splitting Alexander Duyck
2019-05-30 21:54 ` [RFC PATCH 06/11] mm: Add membrane to free area to use as divider between treated and raw pages Alexander Duyck
2019-05-30 21:54 ` [RFC PATCH 07/11] mm: Add support for acquiring first free "raw" or "untreated" page in zone Alexander Duyck
2019-05-30 21:54 ` [RFC PATCH 08/11] mm: Add support for creating memory aeration Alexander Duyck
2019-05-30 21:54 ` [RFC PATCH 09/11] mm: Count isolated pages as "treated" Alexander Duyck
2019-05-30 21:54 ` [RFC PATCH 10/11] virtio-balloon: Add support for aerating memory via bubble hinting Alexander Duyck
2019-05-30 21:54 ` [RFC PATCH 11/11] mm: Add free page notification hook Alexander Duyck
2019-05-30 21:57 ` [RFC QEMU PATCH] QEMU: Provide a interface for hinting based off of the balloon infrastructure Alexander Duyck
2019-05-30 22:52 ` [RFC PATCH 00/11] mm / virtio: Provide support for paravirtual waste page treatment Michael S. Tsirkin
2019-05-31 11:16 ` Nitesh Narayan Lal
2019-05-31 15:51   ` Alexander Duyck
2019-06-03  9:31 ` David Hildenbrand
2019-06-03 15:33   ` Alexander Duyck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190530215223.13974.22445.stgit@localhost.localdomain \
    --to=alexander.duyck@gmail.com \
    --cc=aarcange@redhat.com \
    --cc=alexander.h.duyck@linux.intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=david@redhat.com \
    --cc=konrad.wilk@oracle.com \
    --cc=kvm@vger.kernel.org \
    --cc=lcapitulino@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mst@redhat.com \
    --cc=nitesh@redhat.com \
    --cc=pagupta@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=riel@surriel.com \
    --cc=wei.w.wang@intel.com \
    --cc=yang.zhang.wz@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).