LKML Archive on lore.kernel.org
 help / color / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, rkrcmar@redhat.com, x86@kernel.org,
	mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
	pbonzini@redhat.com, tglx@linutronix.de,
	akpm@linux-foundation.org
Subject: Re: [RFC PATCH 3/4] kvm: Add guest side support for free memory hints
Date: Mon, 11 Feb 2019 12:36:13 -0500
Message-ID: <20190211122321-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <869a170e9232ffbc8ddbcf3d15535e8c6daedbde.camel@linux.intel.com>

On Mon, Feb 11, 2019 at 08:31:34AM -0800, Alexander Duyck wrote:
> On Sat, 2019-02-09 at 19:49 -0500, Michael S. Tsirkin wrote:
> > On Mon, Feb 04, 2019 at 10:15:52AM -0800, Alexander Duyck wrote:
> > > From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> > > 
> > > Add guest support for providing free memory hints to the KVM hypervisor for
> > > freed pages huge TLB size or larger. I am restricting the size to
> > > huge TLB order and larger because the hypercalls are too expensive to be
> > > performing one per 4K page.
> > 
> > Even 2M pages start to get expensive with a TB guest.
> 
> Agreed.
> 
> > Really it seems we want a virtio ring so we can pass a batch of these.
> > E.g. 256 entries, 2M each - that's more like it.
> 
> The only issue I see with doing that is that we then have to defer the
> freeing. Doing that is going to introduce issues in the guest as we are
> going to have pages going unused for some period of time while we wait
> for the hint to complete, and we cannot just pull said pages back. I'm
> not really a fan of the asynchronous nature of Nitesh's patches for
> this reason.

Well nothing prevents us from doing an extra exit to the hypervisor if
we want. The asynchronous nature is there as an optimization
to allow hypervisor to do its thing on a separate CPU.
Why not proceed doing other things meanwhile?
And if the reason is that we are short on memory, then
maybe we should be less aggressive in hinting?

E.g. if we just have 2 pages:

hint page 1
page 1 hint processed?
	yes - proceed to page 2
	no - wait for interrupt

get interrupt that page 1 hint is processed
hint page 2


If hypervisor happens to be running on same CPU it
can process things synchronously and we never enter
the no branch.





> > > Using the huge TLB order became the obvious
> > > choice for the order to use as it allows us to avoid fragmentation of higher
> > > order memory on the host.
> > > 
> > > I have limited the functionality so that it doesn't work when page
> > > poisoning is enabled. I did this because a write to the page after doing an
> > > MADV_DONTNEED would effectively negate the hint, so it would be wasting
> > > cycles to do so.
> > 
> > Again that's leaking host implementation detail into guest interface.
> > 
> > We are giving guest page hints to host that makes sense,
> > weird interactions with other features due to host
> > implementation details should be handled by host.
> 
> I don't view this as a host implementation detail, this is guest
> feature making use of all pages for debugging. If we are placing poison
> values in the page then I wouldn't consider them an unused page, it is
> being actively used to store the poison value.

Well I guess it's a valid point of view for a kernel hacker, but they are
unused from application's point of view.
However poisoning is transparent to users and most distro users
are not aware of it going on. They just know that debug kernels
are slower.
User loading a debug kernel and immediately breaking overcommit
is an unpleasant experience.

> If we can achieve this
> and free the page back to the host then even better, but until the
> features can coexist we should not use the page hinting while page
> poisoning is enabled.

Existing hinting in balloon allows them to coexist so I think we
need to set the bar just as high for any new variant.

> This is one of the reasons why I was opposed to just disabling page
> poisoning when this feature was enabled in Nitesh's patches. If the
> guest has page poisoning enabled it is doing something with the page.
> It shouldn't be prevented from doing that because the host wants to
> have the option to free the pages.

I agree but I think the decision belongs on the host. I.e.
hint the page but tell the host it needs to be careful
about the poison value. It might also mean we
need to make sure poisoning happens after the hinting, not before.

-- 
MST

  reply index

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-04 18:15 [RFC PATCH 0/4] kvm: Report unused guest pages to host Alexander Duyck
2019-02-04 18:15 ` [RFC PATCH 1/4] madvise: Expose ability to set dontneed from kernel Alexander Duyck
2019-02-04 18:15 ` [RFC PATCH 2/4] kvm: Add host side support for free memory hints Alexander Duyck
2019-02-10  0:44   ` Michael S. Tsirkin
2019-02-11 17:34     ` Alexander Duyck
2019-02-11 17:36       ` Michael S. Tsirkin
2019-02-11 17:41     ` Dave Hansen
2019-02-11 17:48       ` Michael S. Tsirkin
2019-02-11 18:30         ` Alexander Duyck
2019-02-11 19:24           ` Michael S. Tsirkin
2019-02-04 18:15 ` [RFC PATCH 3/4] kvm: Add guest " Alexander Duyck
2019-02-04 19:44   ` Dave Hansen
2019-02-04 20:42     ` Alexander Duyck
2019-02-04 23:00   ` Nadav Amit
2019-02-04 23:37     ` Alexander Duyck
2019-02-05  0:03       ` Nadav Amit
2019-02-05  0:16         ` Alexander Duyck
2019-02-05  1:46           ` Nadav Amit
2019-02-05 18:09             ` Alexander Duyck
2019-02-07 18:21   ` Luiz Capitulino
2019-02-07 18:44     ` Alexander Duyck
2019-02-07 20:02       ` Luiz Capitulino
2019-02-08 21:05       ` Nitesh Narayan Lal
2019-02-08 21:31         ` Alexander Duyck
2019-02-10  0:49   ` Michael S. Tsirkin
2019-02-11 16:31     ` Alexander Duyck
2019-02-11 17:36       ` Michael S. Tsirkin [this message]
2019-02-11 18:10         ` Alexander Duyck
2019-02-11 19:54           ` Michael S. Tsirkin
2019-02-11 21:00             ` Alexander Duyck
2019-02-11 22:52               ` Michael S. Tsirkin
     [not found]                 ` <94462313ccd927d25675f69de459456cf066c1a2.camel@linux.intel.com>
2019-02-12  0:34                   ` Michael S. Tsirkin
2019-02-11 17:48     ` Dave Hansen
2019-02-11 17:58       ` Michael S. Tsirkin
2019-02-11 18:19         ` Dave Hansen
2019-02-11 19:56           ` Michael S. Tsirkin
2019-02-04 18:15 ` [RFC PATCH 4/4] mm: Add merge page notifier Alexander Duyck
2019-02-04 19:40   ` Dave Hansen
2019-02-04 19:51     ` Alexander Duyck
2019-02-10  0:57   ` Michael S. Tsirkin
2019-02-11 13:30     ` Nitesh Narayan Lal
2019-02-11 14:17       ` Michael S. Tsirkin
2019-02-11 16:24         ` Nitesh Narayan Lal
2019-02-11 17:41           ` Michael S. Tsirkin
2019-02-11 18:09             ` Nitesh Narayan Lal
2019-02-11  6:40   ` Aaron Lu
2019-02-11 15:58     ` Alexander Duyck
2019-02-12  2:09       ` Aaron Lu
2019-02-12 17:20         ` Alexander Duyck
2019-02-04 18:19 ` [RFC PATCH QEMU] i386/kvm: Enable paravirtual unused page hint mechanism Alexander Duyck
2019-02-05 17:25 ` [RFC PATCH 0/4] kvm: Report unused guest pages to host Nitesh Narayan Lal
2019-02-05 18:43   ` Alexander Duyck
2019-02-07 14:48 ` Nitesh Narayan Lal
2019-02-07 16:56   ` Alexander Duyck
2019-02-10  0:51 ` Michael S. Tsirkin

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190211122321-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexander.duyck@gmail.com \
    --cc=alexander.h.duyck@linux.intel.com \
    --cc=bp@alien8.de \
    --cc=hpa@zytor.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=rkrcmar@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org linux-kernel@archiver.kernel.org
	public-inbox-index lkml


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox