linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Takuya Yoshikawa <takuya.yoshikawa@gmail.com>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: Avi Kivity <avi@redhat.com>, KVM list <kvm@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	qemu-devel <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] [RFC] Next gen kvm api
Date: Sat, 4 Feb 2012 11:08:17 +0900	[thread overview]
Message-ID: <20120204110817.30e1a4c62754f074a8a5064d@gmail.com> (raw)
In-Reply-To: <4F2B41D6.8020603@codemonkey.ws>

Hope to get comments from live migration developers,

Anthony Liguori <anthony@codemonkey.ws> wrote:

> > Guest memory management
> > -----------------------
> > Instead of managing each memory slot individually, a single API will be
> > provided that replaces the entire guest physical memory map atomically.
> > This matches the implementation (using RCU) and plugs holes in the
> > current API, where you lose the dirty log in the window between the last
> > call to KVM_GET_DIRTY_LOG and the call to KVM_SET_USER_MEMORY_REGION
> > that removes the slot.
> >
> > Slot-based dirty logging will be replaced by range-based and work-based
> > dirty logging; that is "what pages are dirty in this range, which may be
> > smaller than a slot" and "don't return more than N pages".
> >
> > We may want to place the log in user memory instead of kernel memory, to
> > reduce pinned memory and increase flexibility.
> 
> Since we really only support 64-bit hosts, what about just pointing the kernel 
> at a address/size pair and rely on userspace to mmap() the range appropriately?
> 

Seems reasonable but the real problem is not how to set up the memory:
the problem is how to set a bit in user-space.

We need two things:
	- introducing set_bit_user()
	- changing mmu_lock from spin_lock to mutex_lock
	  (mark_page_dirty() can be called with mmu_lock held)

The former is straightforward and I sent a patch last year.
The latter needs a fundamental change:  I heard (from Avi) that we can
change mmu_lock to mutex_lock if mmu_notifier becomes preemptible.

So I was planning to restart this work when Peter's
	"mm: Preemptibility"
	http://lkml.org/lkml/2011/4/1/141
gets finished.

But even if we cannot achieve "without pinned memory" we may also want
to make the user-space know how many pages are getting dirty.

For example think about the last step of live migration.  We stop the
guest and send the remaining pages.  For this we do not need to write
protect them any more, just want to know which ones are dirty.

If user-space can read the bitmap, it does not need to do GET_DIRTY_LOG
because the guest is already stopped, so we can reduce the downtime.

Is this correct?


So I think we can do this in two steps:
	1. just move the bitmap to user-space and (pin it)
	2. un-pin it when the time comes

I can start 1 after "srcu-less dirty logging" gets finished.


	Takuya

  reply	other threads:[~2012-02-04  2:08 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-02 16:09 [RFC] Next gen kvm api Avi Kivity
     [not found] ` <CAB9FdM9M2DWXBxxyG-ez_5igT61x5b7ptw+fKfgaqMBU_JS5aA@mail.gmail.com>
2012-02-02 22:16   ` [Qemu-devel] " Rob Earhart
2012-02-05 13:14   ` Avi Kivity
2012-02-06 17:41     ` Rob Earhart
2012-02-06 19:11       ` Anthony Liguori
2012-02-07 12:03         ` Avi Kivity
2012-02-07 15:17           ` Anthony Liguori
2012-02-07 16:02             ` Avi Kivity
2012-02-07 16:18               ` Jan Kiszka
2012-02-07 16:21                 ` Anthony Liguori
2012-02-07 16:29                   ` Jan Kiszka
2012-02-15 13:41                     ` Avi Kivity
2012-02-07 16:19               ` Anthony Liguori
2012-02-15 13:47                 ` Avi Kivity
2012-02-07 12:01       ` Avi Kivity
2012-02-03  2:09 ` Anthony Liguori
2012-02-04  2:08   ` Takuya Yoshikawa [this message]
2012-02-22 13:06     ` Peter Zijlstra
2012-02-05  9:24   ` Avi Kivity
2012-02-07  1:08   ` Alexander Graf
2012-02-07 12:24     ` Avi Kivity
2012-02-07 12:51       ` Alexander Graf
2012-02-07 13:16         ` Avi Kivity
2012-02-07 13:40           ` Alexander Graf
2012-02-07 14:21             ` Avi Kivity
2012-02-07 14:39               ` Alexander Graf
2012-02-15 11:18                 ` Avi Kivity
2012-02-15 11:57                   ` Alexander Graf
2012-02-15 13:29                     ` Avi Kivity
2012-02-15 13:37                       ` Alexander Graf
2012-02-15 13:57                         ` Avi Kivity
2012-02-15 14:08                           ` Alexander Graf
2012-02-16 19:24                             ` Avi Kivity
2012-02-16 19:34                               ` Alexander Graf
2012-02-16 19:38                                 ` Avi Kivity
2012-02-16 20:41                                   ` Scott Wood
2012-02-17  0:23                                     ` Alexander Graf
2012-02-17 18:27                                       ` Scott Wood
2012-02-18  9:49                                     ` Avi Kivity
2012-02-17  0:19                                   ` Alexander Graf
2012-02-18 10:00                                     ` Avi Kivity
2012-02-18 10:43                                       ` Alexander Graf
2012-02-15 19:17                     ` Scott Wood
2012-02-12  7:10               ` Takuya Yoshikawa
2012-02-15 13:32                 ` Avi Kivity
2012-02-07 15:23             ` Anthony Liguori
2012-02-07 15:28               ` Alexander Graf
2012-02-08 17:20               ` Alan Cox
2012-02-15 13:33               ` Avi Kivity
2012-02-15 22:14             ` Arnd Bergmann
2012-02-10  3:07   ` Jamie Lokier
2012-02-03 18:07 ` Eric Northup
2012-02-03 22:52   ` [Qemu-devel] " Anthony Liguori
2012-02-06 19:46     ` Scott Wood
2012-02-07  6:58       ` Michael Ellerman
2012-02-07 10:04         ` Alexander Graf
2012-02-15 22:21           ` Arnd Bergmann
2012-02-16  1:04             ` Michael Ellerman
2012-02-16 19:28               ` Avi Kivity
2012-02-17  0:09                 ` Michael Ellerman
2012-02-18 10:03                   ` Avi Kivity
2012-02-16 10:26             ` Avi Kivity
2012-02-07 12:28       ` Anthony Liguori
2012-02-07 12:40         ` Avi Kivity
2012-02-07 12:51           ` Anthony Liguori
2012-02-07 13:18             ` Avi Kivity
2012-02-07 15:15               ` Anthony Liguori
2012-02-07 18:28                 ` Chris Wright
2012-02-08 17:02         ` Scott Wood
2012-02-08 17:12           ` Alan Cox
2012-02-05  9:37 ` Gleb Natapov
2012-02-05  9:44   ` Avi Kivity
2012-02-05  9:51     ` Gleb Natapov
2012-02-05  9:56       ` Avi Kivity
2012-02-05 10:58         ` Gleb Natapov
2012-02-05 13:16           ` Avi Kivity
2012-02-05 16:36       ` [Qemu-devel] " Anthony Liguori
2012-02-06  9:34         ` Avi Kivity
2012-02-06 13:33           ` Anthony Liguori
2012-02-06 13:54             ` Avi Kivity
2012-02-06 14:00               ` Anthony Liguori
2012-02-06 14:08                 ` Avi Kivity
2012-02-07 18:12           ` Rusty Russell
2012-02-15 13:39             ` Avi Kivity
2012-02-15 21:59               ` Anthony Liguori
2012-02-16  8:57                 ` Gleb Natapov
2012-02-16 14:46                   ` Anthony Liguori
2012-02-16 19:34                     ` Avi Kivity
2012-02-15 23:08               ` Rusty Russell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120204110817.30e1a4c62754f074a8a5064d@gmail.com \
    --to=takuya.yoshikawa@gmail.com \
    --cc=anthony@codemonkey.ws \
    --cc=avi@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).