From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
To: Nitesh Narayan Lal <nitesh@redhat.com>,
	Alexander Duyck <alexander.duyck@gmail.com>, kvm@vger.kernel.org,
	mst@redhat.com, linux-kernel@vger.kernel.org, willy@infradead.org,
	mhocko@kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org,
	mgorman@techsingularity.net, vbabka@suse.cz
Cc: yang.zhang.wz@gmail.com, konrad.wilk@oracle.com, david@redhat.com,
	pagupta@redhat.com, riel@surriel.com, lcapitulino@redhat.com,
	dave.hansen@intel.com, wei.w.wang@intel.com, aarcange@redhat.com,
	pbonzini@redhat.com, dan.j.williams@intel.com, osalvador@suse.de
Subject: Re: [PATCH v12 0/6] mm / virtio: Provide support for unused page reporting
Date: Wed, 23 Oct 2019 15:24:41 -0700
Message-ID: <29f43d5796feed0dec8e8bb98b187d9dac03b900.camel@linux.intel.com>
In-Reply-To: <c50e102c-f72e-df8a-714f-a33897ddbb9f@redhat.com>

On Wed, 2019-10-23 at 07:35 -0400, Nitesh Narayan Lal wrote:
> On 10/22/19 6:27 PM, Alexander Duyck wrote:
> > This series provides an asynchronous means of reporting unused guest
> > pages to a hypervisor so that the memory associated with those pages can
> > be dropped and reused by other processes and/or guests.
> >
>
> <snip>
>
> I think Michal Hocko suggested us to include a brief detail about the background
> explaining how we ended up with the current approach and what all things we have
> already tried.
> That would help someone reviewing the patch-series for the first time to
> understand it in a better way.

I'm not entirely sure it helps. The problem is that even the "brief"
version will probably be pretty long.

From what I know, the first real public discussion of guest memory
overcommit and free page hinting dates back to the 2011 KVM Forum and a
presentation by Rik van Riel[0].

Before I got started on the code there was already virtio-balloon free
page hinting[1]. However, it was meant to be an all-at-once report of the
free pages in the system at a given point in time, and it was used only
for VM migration. All it does is inflate a balloon until it encounters an
OOM, and then it frees the memory back to the guest. One interesting piece
that came out of the work on that patch set was the suggestion by Linus to
use an array-based incremental approach[2], which is what I based my later
implementation on.

I believe Nitesh had already been working on his own approach to unused
page hinting for some time at that point. Before I submitted my RFC,
Nitesh had already submitted a v7 back in mid-2018[3]. That solution was
an array-based approach which appeared to instrument arch_alloc_page and
arch_free_page and would prevent allocations while hinting was occurring.

The first RFC I wrote[4] was a synchronous approach that made use of
arch_free_page to make a hypercall that would immediately flag the page as
unused. However, a hypercall per page can be expensive, and ideally we
don't want the guest vCPU to be hung up while waiting on the host
mmap_sem.

At about this time I believe Nitesh's solution[5] was still trying to keep
an array of unused pages and tracking them via arch_free_page. In the
synchronous case it could cause OOM errors, and in the asynchronous case
it had issues with being overrun and not being able to track unused pages.

Later I switched to an asynchronous approach[6], originally calling it
"bubble hinting". With the asynchronous approach it is necessary to have a
way to track which pages have been reported and which haven't. I
originally used the page type to track this, with a Buddy and a
TreatedBuddy type, but ultimately that moved to a "Reported" page flag. In
addition, I pulled the counters and pointers out of the
free_area/free_list; there is now a stand-alone set of pointers, and the
reported statistics are kept in a separate dynamic allocation.

Nitesh's solution then changed to the bitmap approach[7]. However, it has
been pointed out that this solution doesn't deal with sparse memory,
hotplug, and various other issues.

Since then both my approach and Nitesh's approach have been iterating with
mostly minor changes.

[0]: https://www.linux-kvm.org/images/f/ff/2011-forum-memory-overcommit.pdf
[1]: https://lore.kernel.org/lkml/1535333539-32420-1-git-send-email-wei.w.wang@intel.com/
[2]: https://lore.kernel.org/lkml/CA+55aFzqj8wxXnHAdUTiOomipgFONVbqKMjL_tfk7e5ar1FziQ@mail.gmail.com/
[3]: https://www.spinics.net/lists/kvm/msg170113.html
[4]: https://lore.kernel.org/lkml/20190204181118.12095.38300.stgit@localhost.localdomain/
[5]: https://lore.kernel.org/lkml/20190204201854.2328-1-nitesh@redhat.com/
[6]: https://lore.kernel.org/lkml/20190530215223.13974.22445.stgit@localhost.localdomain/
[7]: https://lore.kernel.org/lkml/20190603170306.49099-1-nitesh@redhat.com/