LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
To: Nitesh Narayan Lal <nitesh@redhat.com>,
	Alexander Duyck <alexander.duyck@gmail.com>,
	kvm@vger.kernel.org, mst@redhat.com,
	linux-kernel@vger.kernel.org, willy@infradead.org,
	mhocko@kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org,
	mgorman@techsingularity.net, vbabka@suse.cz
Cc: yang.zhang.wz@gmail.com, konrad.wilk@oracle.com,
	david@redhat.com, pagupta@redhat.com, riel@surriel.com,
	lcapitulino@redhat.com, dave.hansen@intel.com,
	wei.w.wang@intel.com, aarcange@redhat.com, pbonzini@redhat.com,
	dan.j.williams@intel.com, osalvador@suse.de
Subject: Re: [PATCH v12 0/6] mm / virtio: Provide support for unused page reporting
Date: Wed, 23 Oct 2019 15:24:41 -0700
Message-ID: <29f43d5796feed0dec8e8bb98b187d9dac03b900.camel@linux.intel.com> (raw)
In-Reply-To: <c50e102c-f72e-df8a-714f-a33897ddbb9f@redhat.com>

On Wed, 2019-10-23 at 07:35 -0400, Nitesh Narayan Lal wrote:
> On 10/22/19 6:27 PM, Alexander Duyck wrote:
> > This series provides an asynchronous means of reporting unused guest
> > pages to a hypervisor so that the memory associated with those pages can
> > be dropped and reused by other processes and/or guests.
> > 

<snip>

> > 
> I think Michal Hocko suggested us to include a brief detail about the background
> explaining how we ended up with the current approach and what all things we have
> already tried.
> That would help someone reviewing the patch-series for the first time to
> understand it in a better way.

I'm not entirely sure it helps. The problem is that even the "brief"
version will probably be pretty long.

From what I know the first real public discussion of guest memory
overcommit and free page hinting dates back to the 2011 KVM forum and a
presentation by Rik van Riel[0].

Before I got started in the code there was already virtio-balloon free
page hinting[1]. However it was meant to be an all-at-once reporting of
the free pages in the system at a given point in time, and used only for
VM migration. All it does is inflate a balloon until it encounters an OOM
and then it frees the memory back to the guest. One interesting piece that
came out of the work on that patch set was the suggestion by Linus to use
an array based incremental approach[2] which is what I based my later
implementation on.

I believe Nitesh had already been working on his own approach for unused
page hinting for some time at that point. Prior to submitting my RFC there
was already a v7 that had been submitted by Nitesh back in mid 2018[3].
The solution was an array based approach which appeared to instrument
arch_alloc_page and arch_free_page and would prevent allocations while
hinting was occurring.

The first RFC I had written[4] was a synchronous approach that made use of
arch_free_page to make a hypercall that would immediately flag the page as
being unused. However a hypercall per page can be expensive and we ideally
don't want the guest vCPU potentially being hung up while waiting on the
host mmap_sem.

At about this time I believe Nitesh's solution[5] was still trying to keep
an array of pages that were unused and tracking that via arch_free_page.
In the synchronous case it could cause OOM errors, and in the asynchronous
approach it had issues with being overrun and not being able to track
unused pages.

Later I switched to an asynchronous approach[6], originally calling it
"bubble hinting". With the asynchronous approach it is necessary to have a
way to track what pages have been reported and what haven't. I originally
was using the page type to track it as I had a Buddy and a TreatedBuddy,
but ultimately that moved to a "Reported" page flag. In addition I pulled
the counters and pointers out of the free_area/free_list  and instead now
have a stand-alone set of pointers and keep the reported statistics in a
separate dynamic allocation.

Then Nitesh's solution had changed to the bitmap approach[7]. However it
has been pointed out that this solution doesn't deal with sparse memory,
hotplug, and various other issues.

Since then both my approach and Nitesh's approach have been iterating with
mostly minor changes.

[0]: https://www.linux-kvm.org/images/f/ff/2011-forum-memory-overcommit.pdf
[1]: https://lore.kernel.org/lkml/1535333539-32420-1-git-send-email-wei.w.wang@intel.com/
[2]: https://lore.kernel.org/lkml/CA+55aFzqj8wxXnHAdUTiOomipgFONVbqKMjL_tfk7e5ar1FziQ@mail.gmail.com/
[3]: https://www.spinics.net/lists/kvm/msg170113.html
[4]: https://lore.kernel.org/lkml/20190204181118.12095.38300.stgit@localhost.localdomain/
[5]: https://lore.kernel.org/lkml/20190204201854.2328-1-nitesh@redhat.com/
[6]: https://lore.kernel.org/lkml/20190530215223.13974.22445.stgit@localhost.localdomain/
[7]: https://lore.kernel.org/lkml/20190603170306.49099-1-nitesh@redhat.com/


  reply index

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-22 22:27 Alexander Duyck
2019-10-22 22:27 ` [PATCH v12 1/6] mm: Adjust shuffle code to allow for future coalescing Alexander Duyck
2019-10-22 22:28 ` [PATCH v12 2/6] mm: Use zone and order instead of free area in free_list manipulators Alexander Duyck
2019-10-23  8:26   ` David Hildenbrand
2019-10-23 15:16     ` Alexander Duyck
2019-10-24  9:32       ` David Hildenbrand
2019-10-24 15:19         ` Alexander Duyck
2019-10-22 22:28 ` [PATCH v12 3/6] mm: Introduce Reported pages Alexander Duyck
2019-10-22 23:03   ` Andrew Morton
2019-10-22 23:25     ` Alexander Duyck
2019-10-22 22:28 ` [PATCH v12 4/6] mm: Add device side and notifier for unused page reporting Alexander Duyck
2019-10-22 22:28 ` [PATCH v12 5/6] virtio-balloon: Pull page poisoning config out of free page hinting Alexander Duyck
2019-10-22 22:28 ` [PATCH v12 6/6] virtio-balloon: Add support for providing unused page reports to host Alexander Duyck
2019-10-22 22:29 ` [PATCH v12 QEMU 1/3] virtio-ballon: Implement support for page poison tracking feature Alexander Duyck
2019-10-22 22:29 ` [PATCH v12 QEMU 2/3] virtio-balloon: Add bit to notify guest of unused page reporting Alexander Duyck
2019-10-22 22:29 ` [PATCH v12 QEMU 3/3] virtio-balloon: Provide a interface for " Alexander Duyck
2019-10-22 23:01 ` [PATCH v12 0/6] mm / virtio: Provide support " Andrew Morton
2019-10-22 23:43   ` Alexander Duyck
2019-10-23 11:19     ` Nitesh Narayan Lal
2019-10-23 11:35 ` Nitesh Narayan Lal
2019-10-23 22:24   ` Alexander Duyck [this message]
2019-10-28 14:34 ` Nitesh Narayan Lal
2019-10-28 15:24   ` Alexander Duyck

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=29f43d5796feed0dec8e8bb98b187d9dac03b900.camel@linux.intel.com \
    --to=alexander.h.duyck@linux.intel.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexander.duyck@gmail.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=david@redhat.com \
    --cc=konrad.wilk@oracle.com \
    --cc=kvm@vger.kernel.org \
    --cc=lcapitulino@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@kernel.org \
    --cc=mst@redhat.com \
    --cc=nitesh@redhat.com \
    --cc=osalvador@suse.de \
    --cc=pagupta@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=riel@surriel.com \
    --cc=vbabka@suse.cz \
    --cc=wei.w.wang@intel.com \
    --cc=willy@infradead.org \
    --cc=yang.zhang.wz@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git