From: Joshua Otto <jtotto@uwaterloo.ca>
To: Wei Liu <wei.liu2@citrix.com>, xen-devel@lists.xenproject.org
Cc: andrew.cooper3@citrix.com, hjarmstr@uwaterloo.ca,
	ian.jackson@eu.citrix.com, czylin@uwaterloo.ca,
	imhy.yang@gmail.com
Subject: Re: [PATCH RFC 00/20] Add postcopy live migration support
Date: Thu, 30 Mar 2017 00:13:51 -0400
Message-ID: <20170330041351.GD3038@eagle>
In-Reply-To: <20170328144102.mg424gy3kw2poilo@citrix.com>

On Tue, Mar 28, 2017 at 03:41:02PM +0100, Wei Liu wrote:
> Hi Harley, Chester and Joshua
> 
> This is really nice work. I took a brief look at all the patches, they
> look really high quality.

Thank you!

> 
> We're currently approaching freeze for a Xen release. We've got a lot on
> our plate. I think maintainers will get to this series at some point.

Understood.  We're currently approaching our final exams so that's probably for
the best :)

> 
> From the look of things some patches can go in because they're generally
> useful.
> 
> On Mon, Mar 27, 2017 at 05:06:12AM -0400, Joshua Otto wrote:
> > Hi,
> > 
> > We're a team of three fourth-year undergraduate software engineering students at
> > the University of Waterloo in Canada.  In late 2015 we posted on the list [1] to
> > ask for a project to undertake for our program's capstone design project, and
> > Andrew Cooper pointed us in the direction of the live migration implementation
> > as an area that could use some attention.  We were particularly interested in
> > post-copy live migration (as evaluated by [2] and discussed on the list at [3]),
> > and have been working on an implementation of this on-and-off since then.
> > 
> > We now have a working implementation of this scheme, and are submitting it for
> > comment.  The changes are also available as the 'postcopy' branch of the GitHub
> > repository at [4]
> > 
> > As a brief overview of our approach:
> > - We introduce a mechanism by which libxl can indicate to the libxc stream
> >   helper process that the iterative migration precopy loop should be terminated
> >   and postcopy should begin.
> > - At this point, we suspend the domain, collect the final set of dirty pfns and
> >   write these pfns (and _not_ their contents) into the stream.
> > - At the destination, the xc restore logic registers itself as a pager for the
> >   migrating domain, 'evicts' all of the pfns indicated by the sender as
> >   outstanding, and then resumes the domain at the destination.
> > - As the domain executes, the migration sender continues to push the remaining
> >   outstanding pages to the receiver in the background.  The receiver
> >   monitors both the stream for incoming page data and the paging ring event
> >   channel for page faults triggered by the guest.  Page faults are forwarded on
> >   the back-channel migration stream to the migration sender, which prioritizes
> >   these pages for transmission.
> > 
> > By leveraging the existing paging API, we are able to implement the postcopy
> > scheme without any hypervisor modifications - all of our changes are confined to
> > the userspace toolstack.  However, we inherit from the paging API the
> > requirement that the domains be HVM and that the host have HAP/EPT support.
> > 
> 
> Please consider writing a design document for this feature and stick it
> at the beginning of your series in the future. You can find examples
> under docs/designs.

Absolutely, I'll submit one with v2.
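
In the meantime, here is a rough illustration of the sender-side
prioritization described in the overview quoted above.  It is a minimal Python
sketch, not the libxc code (which is C): the OutstandingPages class and its
method names are hypothetical, and it assumes the sender simply keeps an
ordered set of unsent pfns and moves a pfn to the front when the receiver
reports a fault on it.

```python
from collections import OrderedDict


class OutstandingPages:
    """Hypothetical sender-side bookkeeping for postcopy (illustration only)."""

    def __init__(self, dirty_pfns):
        # Insertion order doubles as the background transmission order.
        self._pending = OrderedDict((pfn, None) for pfn in dirty_pfns)

    def fault(self, pfn):
        """The receiver forwarded a guest fault on pfn: send it next."""
        # Faults on pages that were already sent are ignored.
        if pfn in self._pending:
            # The most recently faulted pfn ends up at the very front.
            self._pending.move_to_end(pfn, last=False)

    def next_batch(self, size):
        """Remove and return up to `size` pfns, faulted ones first."""
        batch = []
        while self._pending and len(batch) < size:
            pfn, _ = self._pending.popitem(last=False)
            batch.append(pfn)
        return batch

    def done(self):
        """True once every outstanding pfn has been handed out."""
        return not self._pending
```

In this sketch a sender loop would alternate between draining fault
notifications from the back-channel into fault() and transmitting the pages
returned by next_batch() until done() is true.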

> 
> The restriction is a bit unfortunate, but we shouldn't block useful work
> because it's incomplete. We just need to make sure that, should someone
> decide to implement similar functionality for PV guests, they are able to
> do so.
> 
> You might want to check whether shadow paging can be used with the paging
> API, so that you can widen the requirement to HVM guest support.
> 
> > We haven't yet had the opportunity to perform a quantitative evaluation of the
> > performance trade-offs between the traditional pre-copy and our post-copy
> > strategies, but intend to.  Informally, we've been testing our implementation by
> > migrating a domain running the x86 memtest program (which is obviously a
> > tremendously write-heavy workload), and have observed a substantial reduction in
> > total time required for migration completion (at the expense of a visually
> > obvious 'slowdown' in the execution of the program).  We've also noticed that,
> > when performing a postcopy without any leading precopy iterations, the time
> > required at the destination to 'evict' all of the outstanding pages is
> > substantial - possibly because there is no batching mechanism by which pages can
> > be evicted - so this area in particular might require further attention.
> > 
> 
> Please do post numbers when you have them. For now, please be patient
> and wait for people to comment.

Will do.  As a general question for those following the thread, are there any
application workloads/benchmarks that people would find particularly
interesting?

The experiment that we've planned, but haven't yet had time to carry out in
full, is to mount a ramdisk inside the guest and use Jens Axboe's fio to test
every entry in the (read/write mix) x (working set size) x (access pattern)
matrix.
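
As a rough sketch of how that matrix could be enumerated, the Python snippet
below generates one fio command line per cell.  The specific mixes, sizes and
the ramdisk path are placeholder assumptions rather than values we have
settled on; the fio options used (--rw, --rwmixread, --size, --filename,
--time_based, --runtime) are standard ones.

```python
import itertools
import shlex

# Placeholder axes; the real experiment may use different values.
READ_MIXES = [0, 50, 100]                # percentage of reads (--rwmixread)
WORKING_SETS = ["64m", "256m", "1g"]     # working set size (--size)
PATTERNS = {"sequential": "rw", "random": "randrw"}  # access pattern (--rw)


def fio_commands(target="/mnt/ramdisk/fio.dat", runtime_s=60):
    """Yield one fio command line per cell of the test matrix.

    `target` is a hypothetical file on the guest's ramdisk.
    """
    cells = itertools.product(READ_MIXES, WORKING_SETS, PATTERNS.items())
    for mix, size, (label, rw) in cells:
        args = [
            "fio",
            f"--name={label}-r{mix}-{size}",
            f"--filename={target}",
            f"--rw={rw}",
            f"--rwmixread={mix}",
            f"--size={size}",
            "--time_based",
            f"--runtime={runtime_s}",
        ]
        yield shlex.join(args)


if __name__ == "__main__":
    for cmd in fio_commands():
        print(cmd)
```

Each command would be run inside the guest while a migration is in progress,
once with pre-copy and once with post-copy, so that the two can be compared
cell by cell.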

Thank you again for your feedback!

Josh
