KVM Archive on lore.kernel.org
From: Alexander Duyck <alexander.duyck@gmail.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Nitesh Narayan Lal <nitesh@redhat.com>,
	kvm list <kvm@vger.kernel.org>,
	David Hildenbrand <david@redhat.com>,
	Dave Hansen <dave.hansen@intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Yang Zhang <yang.zhang.wz@gmail.com>,
	pagupta@redhat.com, Rik van Riel <riel@surriel.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	lcapitulino@redhat.com, wei.w.wang@intel.com,
	Andrea Arcangeli <aarcange@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	dan.j.williams@intel.com,
	Alexander Duyck <alexander.h.duyck@linux.intel.com>
Subject: Re: [PATCH v1 6/6] virtio-balloon: Add support for aerating memory via hinting
Date: Thu, 18 Jul 2019 13:29:14 -0700
Message-ID: <CAKgT0UeRy2eHKnz4CorefBAG8ro+3h4oFX+z1JY2qRm17fcV8w@mail.gmail.com>
In-Reply-To: <20190718113548-mutt-send-email-mst@kernel.org>

On Thu, Jul 18, 2019 at 9:07 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Thu, Jul 18, 2019 at 08:34:37AM -0700, Alexander Duyck wrote:
> > On Wed, Jul 17, 2019 at 10:14 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Wed, Jul 17, 2019 at 09:43:52AM -0700, Alexander Duyck wrote:
> > > > On Wed, Jul 17, 2019 at 3:28 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Tue, Jul 16, 2019 at 02:06:59PM -0700, Alexander Duyck wrote:
> > > > > > On Tue, Jul 16, 2019 at 10:41 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > <snip>
> > > > > >
> > > > > > > > > This is what I am saying. Having watched that patchset being developed,
> > > > > > > > > I think that's simply because processing blocks required mm core
> > > > > > > > > changes, which Wei was not up to pushing through.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > If we did
> > > > > > > > >
> > > > > > > > >         while (1) {
> > > > > > > > >                 alloc_pages
> > > > > > > > >                 add_buf
> > > > > > > > >                 get_buf
> > > > > > > > >                 free_pages
> > > > > > > > >         }
> > > > > > > > >
> > > > > > > > > We'd end up passing the same page to balloon again and again.
> > > > > > > > >
> > > > > > > > > So we end up reserving lots of memory with alloc_pages instead.
> > > > > > > > >
> > > > > > > > > What I am saying is that now that you are developing
> > > > > > > > > infrastructure to iterate over free pages,
> > > > > > > > > FREE_PAGE_HINT should be able to use it too.
> > > > > > > > > Whether that's possible might be a good indication of
> > > > > > > > > whether the new mm APIs make sense.
> > > > > > > >
> > > > > > > > The problem is the infrastructure as implemented isn't designed to do
> > > > > > > > that. I am pretty certain this interface will have issues with being
> > > > > > > > given small blocks to process at a time.
> > > > > > > >
> > > > > > > > Basically the design for the FREE_PAGE_HINT feature doesn't really
> > > > > > > > have the concept of doing things a bit at a time. It is either
> > > > > > > > filling, stopped, or done. From what I can tell it requires a
> > > > > > > > configuration change for the virtio balloon interface to toggle
> > > > > > > > between those states.
> > > > > > >
> > > > > > > Maybe I misunderstand what you are saying.
> > > > > > >
> > > > > > > Filling state can definitely report things
> > > > > > > a bit at a time. It does not assume that
> > > > > > > all of guest free memory can fit in a VQ.
> > > > > >
> > > > > > I think where you and I may differ is that you are okay with just
> > > > > > pulling pages until you hit OOM, or allocation failures. Do I have
> > > > > > that right?
> > > > >
> > > > > This is exactly what the current code does. But that's an implementation
> > > > > detail which came about because we failed to find any other way to
> > > > > iterate over free blocks.
> > > >
> > > > I get that. However my concern is that permeated other areas of the
> > > > implementation that make taking another approach much more difficult
> > > > than it needs to be.
> > >
> > > Implementation would have to change to use an iterator obviously. But I don't see
> > > that it leaked out to a hypervisor interface.
> > >
> > > In fact take a look at virtio_balloon_shrinker_scan
> > > and you will see that it calls shrink_free_pages
> > > without waiting for the device at all.
> >
> > Yes, and in case you missed it earlier I am pretty sure that leads to
> > possible memory corruption. I don't think it was tested enough to be
> > able to say that is safe.
>
> More testing would be good, for sure.
>
> > Specifically we cannot be clearing the dirty flag on pages that are in
> > use. We should only be clearing that flag for pages that are
> > guaranteed to not be in use.
>
> I think that clearing the dirty flag is safe if the flag was originally
> set and the page has been
> write-protected before reporting was requested.
> In that case we know that page has not been changed.
> Right?

I am just going to drop the rest of this thread, as I agree we have
been running ourselves around in circles. The part I had missed was
that there are two bitmaps and that you are using
migration_bitmap_sync_precopy() to align the two.

From what I can tell, this just runs at the same time as the precopy
code and is only really meant to clear the bit before the precopy
pass gets to it.
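
A minimal sketch of that interaction, as a toy model only. It assumes one
bitmap that logs guest writes and a second one that tracks what precopy
still has to send, with a sync step folding the first into the second.
Every toy_* name below is hypothetical and none of this is QEMU code; it
only shows why clearing a bit ahead of precopy is harmless if the page is
written again later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NPAGES 64	/* hypothetical guest size, in pages */

/* Toy stand-ins, not QEMU structures: one bit per guest page. */
struct toy_migration {
	uint64_t dirty_log;	/* pages written since the last sync */
	uint64_t to_send;	/* pages precopy still has to copy */
};

/* Fold the dirty log into the to-send bitmap, then restart the log. */
static void toy_sync(struct toy_migration *m)
{
	m->to_send |= m->dirty_log;
	m->dirty_log = 0;
}

/* Guest reported pfn as free: drop it from the to-send bitmap only. */
static void toy_free_page_hint(struct toy_migration *m, size_t pfn)
{
	assert(pfn < TOY_NPAGES);
	m->to_send &= ~(UINT64_C(1) << pfn);
}

/* Guest wrote to pfn: record it in the dirty log only. */
static void toy_guest_write(struct toy_migration *m, size_t pfn)
{
	assert(pfn < TOY_NPAGES);
	m->dirty_log |= UINT64_C(1) << pfn;
}

/* Would the precopy pass still copy this page? */
static bool toy_needs_send(const struct toy_migration *m, size_t pfn)
{
	assert(pfn < TOY_NPAGES);
	return (m->to_send >> pfn) & 1;
}
```

In this model a hinted page that the guest reuses afterwards shows up in
the dirty log, so the next toy_sync() puts it back in the to-send bitmap.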

So one thing that is still an issue is that my approach would only
work on the first migration. The logic I have implemented assumes
that once we have hinted on a page we don't need to do it again.
However, in order to support migration you would need to reset the
hinting entirely and start over again after doing a migration.
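
To illustrate the "hint once" assumption and the reset it would need, here
is a small sketch. It is not the code from this patch set: the per-page
flag array and the toy_* names are hypothetical, chosen only to show the
behavior described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_NPAGES 8	/* hypothetical number of tracked pages */

/* Hypothetical tracking state: one "already hinted" flag per page. */
struct toy_hinting {
	bool hinted[TOY_NPAGES];
	unsigned int reports;	/* hints actually sent to the host */
};

/* Report a free page unless it was hinted before. True if a hint was sent. */
static bool toy_hint_page(struct toy_hinting *h, size_t pfn)
{
	assert(pfn < TOY_NPAGES);
	if (h->hinted[pfn])
		return false;	/* assumed to be still known to the host */
	h->hinted[pfn] = true;
	h->reports++;
	return true;
}

/* Forget all earlier hints, e.g. once a migration has completed. */
static void toy_reset_hinting(struct toy_hinting *h)
{
	memset(h->hinted, 0, sizeof(h->hinted));
}
```

Without the call to toy_reset_hinting(), a page hinted before the first
migration would never be reported again, which is the limitation above.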


